DABs Definition
You'll deploy a Lakeflow Connect ingestion pipeline with a classic compute gateway using DABs in ~15 min.
Prereqs: Create ingestion pipeline overview, Create connection, DABs CLI
What you'll build
A Lakeflow Connect ingestion pipeline defined as code using Databricks Asset Bundles (DABs). The definition includes two pipelines: a gateway pipeline on classic compute, which provides network access to the source system, and a serverless ingestion pipeline that writes into Unity Catalog (UC) tables.
Prerequisites
- A Unity Catalog connection to the source database (see Create connection).
- Databricks CLI installed and configured.
- A service principal or user with permissions to create pipelines in the target workspace.
- Check feature availability for your data source.
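The prerequisites above can be spot-checked from a terminal before you write any YAML. The following is a minimal sketch, not an official procedure: the function name `preflight` and the connection name passed to it are placeholders, and it assumes the Databricks CLI is already configured with a default profile.

```shell
#!/usr/bin/env bash
# Preflight sketch: confirm the CLI is installed, authenticated, and can see
# the Unity Catalog connection. The function name is hypothetical.
preflight() {
  local connection_name="$1"

  # Fail early if the Databricks CLI is not on PATH.
  command -v databricks >/dev/null 2>&1 || {
    echo "databricks CLI not found" >&2
    return 1
  }

  # Print the installed CLI version.
  databricks --version || return 1
  # Print the identity the CLI is currently authenticated as.
  databricks current-user me || return 1
  # Fetch the connection; this fails if it does not exist or is not visible to you.
  databricks connections get "$connection_name" || return 1
}

# Usage (replace with your own connection name):
#   preflight my-connection
```

If the last command fails, the connection is either missing or not visible to the identity you are authenticated as.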
Steps
1. Define the pipeline in your DABs YAML
The example below is based on Create pipeline for PostgreSQL. The key difference from the default setup is the explicit classic compute specification on the gateway pipeline.
Check the Databricks documentation left panel for other connectors (MySQL, SQL Server, Salesforce, etc.).
Add the following to your bundle configuration. It defines both the variables block and the resources block:
variables:
  gateway_name:
    default: postgresql_gateway_pipeline
  pipeline_name:
    default: postgresql_pipeline
  dest_catalog:
    default: development
  dest_schema:
    default: c360_source

resources:
  pipelines:
    gateway:
      name: ${var.gateway_name}
      gateway_definition:
        connection_name: <my-connection>
        gateway_storage_catalog: development
        gateway_storage_schema: ${var.dest_schema}
        gateway_storage_name: ${var.gateway_name}
      target: ${var.dest_schema}
      catalog: ${var.dest_catalog}
      clusters:
        - label: default
          # AWS instance types; for Azure use Standard_DS3_v2 / Standard_DS4_v2
          driver_node_type_id: r5.xlarge
          # AWS instance types; for Azure use Standard_E8ds_v4 / Standard_E16ds_v4
          node_type_id: m5.xlarge
          autoscale:
            min_workers: 2
            max_workers: 4
            mode: ENHANCED
    pipeline_postgresql:
      name: ${var.pipeline_name}
      ingestion_definition:
        ingestion_gateway_id: ${resources.pipelines.gateway.id}
        source_type: POSTGRESQL
        objects:
          - table:
              source_catalog: your_database
              source_schema: public
              source_table: orders
              destination_catalog: ${var.dest_catalog}
              destination_schema: ${var.dest_schema}
          - schema:
              source_catalog: your_database
              source_schema: public
              destination_catalog: ${var.dest_catalog}
              destination_schema: ${var.dest_schema}
        source_configurations:
          - catalog:
              source_catalog: your_database
              postgres:
                slot_config:
                  slot_name: db_slot
                  publication_name: db_pub
      target: ${var.dest_schema}
      catalog: ${var.dest_catalog}
Deploy with a CLI profile that authenticates to the workspace rather than to the account. The service principal or user deploying the pipeline needs workspace-admin access or explicit permission to create pipelines. See REST API: Create a pipeline for attribute definitions.
2. Replace placeholders
- Replace <my-connection> with the name of your UC connection.
- Replace your_database, public, and orders with your actual source database, schema, and table names.
- Adjust dest_catalog and dest_schema to match your governance model. Note that gateway_storage_catalog is hardcoded to development in the example, so change it separately if your catalog differs.
- For Azure, swap the instance types to the Azure equivalents noted in the comments.
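A quick way to catch a missed replacement is to search the bundle file for the template placeholders before validating. This is a sketch under assumptions: the function name `check_placeholders` is hypothetical, and `databricks.yml` is only the assumed location of the YAML from step 1.

```shell
#!/usr/bin/env bash
# Sketch: report any template placeholders still present in the bundle file.
# The default file name databricks.yml is an assumption; pass your own path.
check_placeholders() {
  local bundle_file="${1:-databricks.yml}"

  # -n prints line numbers, -E enables the alternation pattern.
  if grep -nE '<my-connection>|your_database' "$bundle_file"; then
    echo "Unreplaced placeholders found in $bundle_file" >&2
    return 1
  fi
}

# Usage: only validate once the placeholders are gone.
#   check_placeholders databricks.yml && databricks bundle validate
```

The pattern deliberately skips `public` and `orders`, since those can be legitimate schema and table names in your source database.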
3. Deploy the pipeline
databricks bundle deploy
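If your bundle defines more than one target, you may want to validate first and override a variable at deploy time. The sketch below assumes a target already exists in your bundle configuration; the function name `deploy_bundle` and the example values `dev` and `development` are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: validate, then deploy to a named target, overriding dest_catalog.
# The target must already be defined under `targets:` in your bundle.
deploy_bundle() {
  local target="$1" catalog="$2"

  # Deploy only runs if validation succeeds.
  databricks bundle validate -t "$target" --var="dest_catalog=$catalog" &&
    databricks bundle deploy -t "$target" --var="dest_catalog=$catalog"
}

# Usage:
#   deploy_bundle dev development
```

The `--var` override affects only values that reference `${var.dest_catalog}`; anything hardcoded in the YAML keeps its literal value.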
4. Start the pipeline
databricks bundle run pipeline_postgresql
Verify
- In the Databricks workspace, navigate to Workflows > Delta Live Tables (labeled Jobs & Pipelines in newer versions of the workspace UI).
- Confirm both the gateway pipeline and the ingestion pipeline show a Running or Completed status.
- Navigate to Catalog > the target catalog and schema. Confirm the ingested tables appear with data.
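The same checks can be approximated from the CLI, which is handy in CI. This is a sketch rather than a complete verification: the function name `verify_ingestion` is hypothetical, and it only confirms that the resources and tables exist, not that the row counts are correct.

```shell
#!/usr/bin/env bash
# Sketch: list what the bundle deployed, then list tables in the destination schema.
verify_ingestion() {
  local catalog="$1" schema="$2"

  # Summarizes the resources deployed by this bundle, including both pipelines.
  databricks bundle summary || return 1
  # Lists tables in the destination schema; ingested tables should appear here.
  databricks tables list "$catalog" "$schema"
}

# Usage, with the default variable values from step 1:
#   verify_ingestion development c360_source
```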
Troubleshoot
Gateway pipeline fails to start
The gateway runs on classic compute in your VPC (or VNet on Azure). Verify the cluster configuration (instance types, autoscale settings) and that the workspace has permission to launch clusters with those instance types.
Ingestion pipeline cannot connect to the source
The gateway cluster must have network access to the source database. Verify VPC peering, security group rules, and firewall settings. The connection object must also have valid credentials.
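To separate network problems from credential problems, you can test raw TCP reachability to the database port. The sketch below is an assumption-laden check: it must be run from a machine in the same network as the gateway cluster to be meaningful, it requires `nc` (netcat) to be installed, and the function name and hostname are hypothetical. The default port 5432 is the standard PostgreSQL port.

```shell
#!/usr/bin/env bash
# Sketch: check that a TCP connection to the source database port can be opened.
# Run this from inside the gateway's network, not from your laptop.
check_source_reachable() {
  local host="$1" port="${2:-5432}"

  # -z: only probe the port without sending data; -w 5: give up after 5 seconds.
  if nc -z -w 5 "$host" "$port"; then
    echo "reachable: $host:$port"
  else
    echo "unreachable: $host:$port" >&2
    return 1
  fi
}

# Usage (hostname is a placeholder):
#   check_source_reachable db.example.internal 5432
```

If the port is reachable but the pipeline still fails, the cause is more likely the credentials stored in the connection object than the network path.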
Permission error when creating the pipeline
The service principal or user running databricks bundle deploy needs workspace-admin access or explicit permission to create pipelines. Verify permissions in the workspace admin settings.
Next
- Do next: Build the first pipeline
- Learn why: Unity Catalog foundations
- Reference: Lakeflow Connect — Databricks docs