This repository uses three different configuration systems depending on your workflow.
config.py + .env

Used by: Local development and testing
Purpose: Quick iteration with a local Python environment
Setup:
# 1. Copy template
cp .env.example .env
# 2. Edit .env with your credentials
# DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
# DATABRICKS_TOKEN=your-token
# CATALOG_NAME=your_catalog
# SCHEMA_NAME=your_schema
# ...
# 3. Run locally
python -m src.multi_agent.main --query "test"
How it works:
- config.py loads environment variables from .env

Pros:
Cons:
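As a rough illustration of the "How it works" step for local development, the sketch below reads `KEY=VALUE` pairs from `.env` using only the standard library. The function name `load_env_file` is hypothetical and not taken from the repository; the real `config.py` may rely on a library such as python-dotenv and may handle quoting differently.

```python
import os
from pathlib import Path


def load_env_file(path=".env", environ=None):
    """Read KEY=VALUE pairs from a .env-style file into `environ`.

    Hypothetical helper for illustration; not the repository's config.py.
    """
    environ = os.environ if environ is None else environ
    env_path = Path(path)
    if not env_path.exists():
        raise FileNotFoundError(f"{path} not found; copy .env.example to .env first")
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values already present in the environment win over the file.
        environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))
    return environ
```

Calling `load_env_file()` from the repo root would then make values such as `DATABRICKS_HOST` and `CATALOG_NAME` available through `os.environ`. Letting existing environment variables take precedence is a design choice of this sketch, not something the source specifies.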
dev_config.yaml

Used by: Testing in Databricks notebooks
Purpose: Test with real Databricks services before deploying
Setup:
# dev_config.yaml (at repo root)
catalog_name: your_catalog
schema_name: your_schema
llm_endpoint: databricks-claude-sonnet-4-5
genie_space_ids:
- space_id_1
- space_id_2
sql_warehouse_id: your_warehouse_id
# ...
How it works:
- notebooks/test_agent_databricks.py

Pros:
Cons:
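Before running the test notebook, it can help to confirm that the parsed YAML has every value filled in. The sketch below assumes the file has already been parsed into a dictionary (for example with `yaml.safe_load`) and treats the five keys shown in the sample above as required; both the key list and the helper name `find_missing_keys` are assumptions for illustration, not part of the repository.

```python
# Keys taken from the sample dev_config.yaml above; the real file may need more.
REQUIRED_KEYS = (
    "catalog_name",
    "schema_name",
    "llm_endpoint",
    "genie_space_ids",
    "sql_warehouse_id",
)


def find_missing_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in a parsed config."""
    return [key for key in required if config.get(key) in (None, "", [])]


# Stand-in for the dictionary produced by parsing dev_config.yaml.
dev_config = {
    "catalog_name": "your_catalog",
    "schema_name": "your_schema",
    "llm_endpoint": "databricks-claude-sonnet-4-5",
    "genie_space_ids": [],
    "sql_warehouse_id": "your_warehouse_id",
}

missing = find_missing_keys(dev_config)
if missing:
    print(f"dev_config.yaml is missing values for: {', '.join(missing)}")
```

Reporting all missing keys at once, rather than failing on the first one, tends to shorten the edit-and-retry loop in a notebook.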
prod_config.yaml

Used by: Model Serving deployment
Purpose: Production configuration packaged with the model
Setup:
# prod_config.yaml (at repo root)
catalog_name: prod_catalog
schema_name: prod_schema
llm_endpoint: databricks-claude-sonnet-4-5
genie_space_ids:
- prod_space_id_1
- prod_space_id_2
sql_warehouse_id: prod_warehouse_id
# ...
How it works:
- notebooks/deploy_agent.py (line ~5637)
- model_config parameter

Pros:
Cons:
| Feature | Local (.env) | Databricks Test (YAML) | Production (YAML) |
|---|---|---|---|
| Code Location | Local machine | Databricks workspace | Model Serving |
| Config File | .env | dev_config.yaml | prod_config.yaml |
| Config Loader | config.py | dev_config.yaml | prod_config.yaml |
| Agent Code | src/multi_agent/ | src/multi_agent/ | src/multi_agent/ |
| Real Services | ❌ (can mock) | ✅ Yes | ✅ Yes |
| Use Case | Development | Testing | Production |
Key Insight: All three use the same agent code from src/multi_agent/!
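One way to picture how the same agent code can sit behind all three systems is a single settings object built from whichever source is active. The sketch below is an assumption about how that could look, not the repository's actual design: `AgentSettings` and `settings_from_mapping` are hypothetical names, and only two of the shared values are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSettings:
    """Subset of the shared configuration values (illustrative only)."""

    catalog_name: str
    schema_name: str


def settings_from_mapping(values):
    """Build settings from env-style (UPPER_CASE) or YAML-style (lower_case) keys."""
    normalized = {str(key).lower(): value for key, value in values.items()}
    # A missing key raises KeyError, which points at the incomplete source.
    return AgentSettings(
        catalog_name=normalized["catalog_name"],
        schema_name=normalized["schema_name"],
    )


# Environment-variable names and YAML keys produce the same settings object.
local = settings_from_mapping({"CATALOG_NAME": "your_catalog", "SCHEMA_NAME": "your_schema"})
notebook = settings_from_mapping({"catalog_name": "your_catalog", "schema_name": "your_schema"})
assert local == notebook
```

Normalizing key case in one place keeps the agent code unaware of which configuration system supplied the values.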
All configuration systems need these values:
Databricks Connection:
- DATABRICKS_HOST / databricks_host: Your Databricks workspace URL
- DATABRICKS_TOKEN / databricks_token: Authentication token

Unity Catalog:
- CATALOG_NAME / catalog_name: Unity Catalog catalog name
- SCHEMA_NAME / schema_name: Schema name for tables

LLM Endpoints (agent-specific for optimal performance):
- llm_endpoint_clarification: Fast model for clarification
- llm_endpoint_planning: Smart model for planning
- llm_endpoint_sql_synthesis_table: Smart model for SQL synthesis
- llm_endpoint_sql_synthesis_genie: Smart model for Genie queries
- llm_endpoint_execution: Fast model for execution
- llm_endpoint_summarize: Fast model for summarization

Genie & SQL:
- genie_space_ids: List of Genie space IDs to query
- sql_warehouse_id: SQL Warehouse ID for queries

Vector Search:
- vs_endpoint_name: Vector Search endpoint name
- embedding_model: Embedding model for vector search

Lakebase (for state management):
- lakebase_instance_name: Lakebase instance name
- lakebase_embedding_endpoint: Embedding endpoint
- lakebase_embedding_dims: Embedding dimensions
- sample_size: Number of samples for enrichment (default: 100)
- max_unique_values: Max unique values to capture (default: 50)

# 1. Develop locally with .env
python -m src.multi_agent.main --query "test"
# 2. Sync code to Databricks
databricks workspace import-dir src/multi_agent /Workspace/src/multi_agent
# 3. Open notebooks/test_agent_databricks.py in Databricks
# (Uses dev_config.yaml automatically)
# 1. Test with dev_config.yaml
# Run notebooks/test_agent_databricks.py
# 2. Update prod_config.yaml if needed
# Compare with dev_config.yaml, adjust for production
# 3. Deploy with prod_config.yaml
# Run notebooks/deploy_agent.py
# (References prod_config.yaml at line ~5637)
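To reduce the risk of deploying with the wrong file, the config path can be derived from an explicit target name instead of being typed by hand. The sketch below is a hypothetical helper, not code from `deploy_agent.py`; the target names `dev` and `prod` and the function `config_path_for` are assumptions, while the two file names and their location at the repo root come from the sections above.

```python
from pathlib import Path

# File names as described above; both live at the repo root.
CONFIG_BY_TARGET = {
    "dev": "dev_config.yaml",
    "prod": "prod_config.yaml",
}


def config_path_for(target, repo_root="."):
    """Return the config file path for a deployment target."""
    try:
        filename = CONFIG_BY_TARGET[target]
    except KeyError:
        expected = sorted(CONFIG_BY_TARGET)
        raise ValueError(f"unknown target {target!r}; expected one of {expected}") from None
    return Path(repo_root) / filename


# The deploy notebook would pass this path wherever it references the config.
print(config_path_for("prod"))
```

Rejecting unknown targets outright, rather than falling back to a default, means a typo cannot silently select the development configuration.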
Use smaller/cheaper resources:
Use production resources:
- .env files
- .gitignore
- .env and YAML configs

Symptom: FileNotFoundError or KeyError
Solutions:
- Check that the .env file exists and has all required values

Symptom: Using dev config in production
Solutions:
- Check the model_config parameter in deploy_agent.py

Symptom: Authentication errors
Solutions:
- Make sure DATABRICKS_HOST includes https://

Questions? See the full guide at docs/ or ask in discussions!