Multi-Agent System for Cross-Domain Queries with Databricks Genie.
This system enables intelligent querying across multiple data domains (patients, medications, diagnoses, treatments, etc.) using a multi-agent architecture built with LangGraph.
User Query
↓
SupervisorAgent (orchestrates all agents)
↓
ThinkingPlanningAgent (analyzes & plans)
├── Vector Search (semantic retrieval)
└── Genie Space Metadata
↓
├─ Single Space → GenieAgent
├─ Multiple Spaces (No Join) → Multiple GenieAgents → Verbal Merge
└─ Multiple Spaces (Join) → Fast/Genie Route → SQLSynthesis → SQLExecution
↓
Response with reasoning
Visual representations of the system architecture:
All diagrams are in the architecture/ directory in multiple formats (SVG, PNG, PDF, Mermaid).
Role: Central orchestrator Responsibilities:
Role: Query analysis and planning Responsibilities:
Tools:
Role: Query individual Genie spaces Responsibilities:
Multiple instances: One per Genie space being queried
Role: Combines SQL queries across tables/spaces Responsibilities:
Variants:
SQLSynthesisTableAgent: Direct table queriesSQLSynthesisGenieAgent: Leverages Genie for SQL generationRole: Executes synthesized SQL Responsibilities:
Role: Handles ambiguous queries Responsibilities:
Role: Final response formatting Responsibilities:
User: "Show me patient demographics"
↓
ThinkingPlanningAgent
├─ Vector Search: Finds "patient_demographics" space
└─ Decision: Single space query
↓
GenieAgent (patient_demographics space)
└─ Query Genie space directly
↓
SummarizeAgent
└─ Format response
↓
User: Response with patient demographics
User: "Show me patients and their medications"
↓
ThinkingPlanningAgent
├─ Vector Search: Finds "patients" AND "medications" spaces
└─ Decision: Multiple spaces, no join needed
↓
GenieAgent (patients) + GenieAgent (medications)
└─ Query both spaces in parallel
↓
SummarizeAgent
└─ Verbal merge: "Here are patients... and their medications..."
↓
User: Combined response
User: "Show me patients with high blood pressure AND their medications"
↓
ThinkingPlanningAgent
├─ Vector Search: Finds "patients" AND "medications" spaces
└─ Decision: Multiple spaces, JOIN required
↓
SQLSynthesisAgent
├─ Get table schemas
├─ Get sample data
└─ Generate JOIN query
↓
SQLExecutionAgent
└─ Execute SQL on SQL Warehouse
↓
SummarizeAgent
└─ Format results with reasoning
↓
User: Synthesized response
User: "Show me data"
↓
ThinkingPlanningAgent
└─ Decision: Query too ambiguous
↓
ClarificationAgent
└─ "What type of data are you interested in?"
↓
User: "Patient data"
↓
ThinkingPlanningAgent
└─ Continue with Scenario 1
Uses Lakebase PostgreSQL for conversation state:
Uses Lakebase with semantic search:
class AgentState(TypedDict):
messages: list # Conversation history
relevant_spaces: list # Genie spaces for current query
sql_query: Optional[str] # Generated SQL
sql_results: Optional[dict] # Execution results
final_response: Optional[str] # Final answer
# ... more fields
┌─────────────────────────────────────────────────────┐
│ Databricks Model Serving │
│ ┌───────────────────────────────────────────────┐ │
│ │ Agent Container (Auto-scaled) │ │
│ │ ├─ agent.py (MLflow wrapper) │ │
│ │ ├─ src/multi_agent/ (packaged code) │ │
│ │ └─ prod_config.yaml (runtime config) │ │
│ └───────────────────────────────────────────────┘ │
│ ↓ │
│ ┌───────────────────────────────────────────────┐ │
│ │ Databricks Services │ │
│ │ ├─ Genie Spaces (data querying) │ │
│ │ ├─ Vector Search (semantic retrieval) │ │
│ │ ├─ SQL Warehouse (query execution) │ │
│ │ ├─ Lakebase (state management) │ │
│ │ ├─ Unity Catalog (metadata) │ │
│ │ └─ LLM Endpoints (Claude models) │ │
│ └───────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
databricks.sdk.Config() for automatic authenticationcode_paths parameter packages modulesThe system scales in multiple dimensions:
src/multi_agent/agents/new_agent.pysrc/multi_agent/core/graph.pysrc/multi_agent/tools/new_tool.pyWant to understand the code? See ../src/multi_agent/README.md for code structure guide! 💡