Saemi runs your questions through Databricks SQL warehouses using a Personal Access Token (PAT). Setting up the connection takes about two minutes.
Before you start
You need:
- A Databricks workspace URL (e.g. https://dbc-12345678-abcd.cloud.databricks.com)
- A running SQL warehouse (Saemi runs queries through it; Serverless is fine, and classic warehouses work too)
- Permission to create a Personal Access Token
Step 1 — Create a Personal Access Token
- In Databricks, click your profile avatar (top right) → User Settings
- Open the Developer tab → Access tokens → Manage
- Click Generate new token
- Set a comment (e.g. "Saemi production") and a lifetime (90 days is reasonable; Saemi will warn you on expiry)
- Copy the dapi… token; Databricks shows it only once
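To keep track of when a token needs regenerating, the expiry date can be worked out from the creation date and the lifetime chosen above. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the 90-day default simply mirrors the suggestion in this step.

```python
from datetime import date, timedelta


def token_expiry(created: date, lifetime_days: int = 90) -> date:
    """Return the date on which a token created on `created` expires."""
    return created + timedelta(days=lifetime_days)


def days_remaining(created: date, today: date, lifetime_days: int = 90) -> int:
    """Return how many days are left before expiry (negative once expired)."""
    return (token_expiry(created, lifetime_days) - today).days


if __name__ == "__main__":
    created = date(2025, 1, 1)  # illustrative creation date
    print(token_expiry(created))
    print(days_remaining(created, today=date(2025, 3, 25)))
```

A negative result from `days_remaining` means the token has already expired and needs to be regenerated.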
Step 2 — Find your warehouse HTTP path
- Open SQL Warehouses (left nav) and click your warehouse
- Open the Connection details tab
- Copy the HTTP Path (e.g. /sql/1.0/warehouses/abc123def456)
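If you want to sanity-check the copied value before pasting it, a small check like the one below may help. It is a sketch only: the pattern is inferred from the example path above, the function name is hypothetical, and real HTTP paths may take other shapes that this pattern would reject.

```python
import re

# Assumption: warehouse paths look like the example in this guide,
# i.e. /sql/1.0/warehouses/<alphanumeric id>.
_HTTP_PATH_RE = re.compile(r"^/sql/1\.0/warehouses/([0-9A-Za-z]+)$")


def warehouse_id_from_http_path(http_path: str) -> str:
    """Return the warehouse ID at the end of the path, or raise ValueError."""
    match = _HTTP_PATH_RE.match(http_path.strip())
    if match is None:
        raise ValueError(f"unexpected HTTP path format: {http_path!r}")
    return match.group(1)


if __name__ == "__main__":
    print(warehouse_id_from_http_path("/sql/1.0/warehouses/abc123def456"))
```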
Step 3 — Connect in Saemi
- Open /accounts/connector/ in Saemi
- Click Databricks → Connect
- Paste:
  - Workspace URL: without a trailing slash
  - HTTP Path: from step 2
  - Personal Access Token: from step 1
  - (Optional) Catalog scope: restrict Saemi to specific catalogs (e.g. main, gold). Other catalogs stay invisible to Saemi. Recommended for teams with PII catalogs they don't want the LLM to touch.
- Click Test connection; Saemi runs SELECT 1 and reports back
- Click Save
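To try the same three values outside Saemi, a check along these lines can be run from your own machine. It is a sketch, not Saemi's implementation: it assumes the open-source databricks-sql-connector package is installed (`pip install databricks-sql-connector`), and the helper names `connection_params` and `check_connection` are hypothetical.

```python
from urllib.parse import urlparse


def connection_params(workspace_url: str, http_path: str, access_token: str) -> dict:
    """Turn the values from steps 1 and 2 into connector keyword arguments."""
    parsed = urlparse(workspace_url.strip().rstrip("/"))
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("workspace URL must start with https://")
    return {
        "server_hostname": parsed.netloc,  # hostname only, no scheme or slash
        "http_path": http_path.strip(),
        "access_token": access_token.strip(),
    }


def check_connection(params: dict) -> bool:
    """Run SELECT 1 against the warehouse and report whether it returned 1."""
    # Imported lazily so connection_params works without the connector installed.
    from databricks import sql

    with sql.connect(**params) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            row = cursor.fetchone()
            return row is not None and row[0] == 1
```

Calling `check_connection(connection_params(url, path, token))` with your own values needs network access and a running warehouse. If it fails here, the same values will most likely fail in Saemi as well.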
After save, Saemi scans the schema (about 30-90 seconds) and builds an ERD of your top tables. You'll see it in your new chat home.
Permissions checklist
For the PAT to work, your Databricks user needs:
- CAN_USE on the SQL warehouse
- SELECT on the catalogs/schemas you want Saemi to read
- (Optional) USE_CATALOG on each catalog if Unity Catalog is enabled
Saemi only runs read queries — it never issues INSERT, UPDATE, DELETE, or DDL against your warehouse.
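For workspaces on Unity Catalog, the catalog-level items in the checklist above could be granted with SQL statements like the ones this sketch generates. The function name and the example principal are hypothetical, and the USE SCHEMA grant is an addition that is not in the checklist: Unity Catalog generally requires it before tables in a schema can be read. The warehouse CAN_USE permission is set on the warehouse itself rather than through SQL, so it is not covered here.

```python
def grant_statements(principal: str, catalogs: list[str]) -> list[str]:
    """Build read-only Unity Catalog GRANT statements for each catalog."""
    statements = []
    for name in [principal, *catalogs]:
        # Refuse backticks so a name cannot break out of the quoted identifier.
        if "`" in name:
            raise ValueError(f"backtick not allowed in identifier: {name!r}")
    for catalog in catalogs:
        for privilege in ("USE CATALOG", "USE SCHEMA", "SELECT"):
            # Granting on the catalog lets schemas and tables inside inherit it.
            statements.append(
                f"GRANT {privilege} ON CATALOG `{catalog}` TO `{principal}`"
            )
    return statements


if __name__ == "__main__":
    # Hypothetical principal: the Databricks user who owns the PAT.
    for statement in grant_statements("saemi-reader@example.com", ["main", "gold"]):
        print(statement + ";")
```

Review the generated statements before running them. Granting on individual schemas instead of whole catalogs gives tighter control where a catalog mixes sensitive and non-sensitive data.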
Troubleshooting
"Connection refused" — check the workspace URL (HTTPS, no trailing slash) and that the SQL warehouse is running (not Auto-Stopped).
"Permission denied on schema X" — your PAT user lacks SELECT on that schema. Either grant it or narrow the catalog scope.
"Token expired" — Databricks tokens have a max lifetime. Regenerate and re-save in Saemi.
What Saemi sees
Saemi reads:
- Table and column names (for autocomplete + ERD)
- Row counts (to suggest sampling strategies)
- Query results (for the answers it returns)
Saemi does NOT store your data — query results live only in the chat row, encrypted at rest, and are deleted when you delete the chat.