DABs Definition

In about 15 minutes, you'll use Databricks Asset Bundles (DABs) to deploy a Lakeflow Connect ingestion pipeline whose gateway runs on classic compute.

Prereqs: Create ingestion pipeline overview, Create connection, DABs CLI

What you'll build

A Lakeflow Connect ingestion pipeline defined as code using Databricks Asset Bundles (DABs). The definition includes two pipelines: a gateway pipeline on classic compute, which provides network access to the source system, and a serverless ingestion pipeline that writes into Unity Catalog (UC) tables.

Prerequisites

A Unity Catalog connection to the source database (see Create connection).
Databricks CLI installed and configured.
A service principal or user with permissions to create pipelines in the target workspace.
Confirmation that Lakeflow Connect supports your data source (check feature availability).

Steps

1. Define the pipeline in your DABs YAML

The example below is based on Create pipeline for PostgreSQL. The key difference from the default setup is the explicit classic compute specification on the gateway pipeline.

For other connectors (MySQL, SQL Server, Salesforce, etc.), see the corresponding pages in the left-hand navigation of the Databricks documentation.

Add the following variables and pipeline resources to your bundle configuration:

variables:
  gateway_name:
    default: postgresql_gateway_pipeline
  pipeline_name:
    default: postgresql_pipeline
  dest_catalog:
    default: development
  dest_schema:
    default: c360_source

resources:
  pipelines:
    gateway:
      name: ${var.gateway_name}
      gateway_definition:
        connection_name: <my-connection>
        gateway_storage_catalog: ${var.dest_catalog}
        gateway_storage_schema: ${var.dest_schema}
        gateway_storage_name: ${var.gateway_name}
      target: ${var.dest_schema}
      catalog: ${var.dest_catalog}

      clusters:
        - label: default
          # AWS instance type (memory-optimized); on Azure, use an E-series type such as Standard_E8ds_v4 or Standard_E16ds_v4
          driver_node_type_id: r5.xlarge
          # AWS instance type (general-purpose); on Azure, use a DS-series type such as Standard_DS3_v2 or Standard_DS4_v2
          node_type_id: m5.xlarge
          autoscale:
            min_workers: 2
            max_workers: 4
            mode: ENHANCED

    pipeline_postgresql:
      name: ${var.pipeline_name}
      ingestion_definition:
        ingestion_gateway_id: ${resources.pipelines.gateway.id}

        source_type: POSTGRESQL
        objects:
          - table:
              source_catalog: your_database
              source_schema: public
              source_table: orders
              destination_catalog: ${var.dest_catalog}
              destination_schema: ${var.dest_schema}
          - schema:
              source_catalog: your_database
              source_schema: public
              destination_catalog: ${var.dest_catalog}
              destination_schema: ${var.dest_schema}
        source_configurations:
          - catalog:
              source_catalog: your_database
              postgres:
                slot_config:
                  slot_name: db_slot
                  publication_name: db_pub
      target: ${var.dest_schema}
      catalog: ${var.dest_catalog}
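
For context, the definition above has to be reachable from the bundle's root configuration. The following is a minimal sketch of a `databricks.yml`, assuming the pipeline definition is saved in a file under a `resources/` folder. The bundle name, the folder layout, the `dev` target name, and the workspace URL placeholder are illustrative assumptions, not values prescribed by this guide.

```yaml
# databricks.yml at the bundle root (illustrative sketch)
bundle:
  name: lakeflow_connect_postgresql    # hypothetical bundle name

# Pull in the file that holds the variables and pipelines shown above.
# Assumes it lives under resources/ (hypothetical layout).
include:
  - resources/*.yml

targets:
  dev:                                  # hypothetical target name
    default: true
    workspace:
      host: https://<your-workspace-url>
```

Keeping the pipeline definition in its own file under `resources/` is one common layout; placing everything directly in `databricks.yml` should work as well.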

Danger

The provider must use a workspace-level connection. The service principal deploying the pipeline needs workspace-admin access. See REST API — Create a pipeline for attribute definitions.

2. Replace placeholders

Replace <my-connection> with the name of your UC connection.
Replace your_database, public, and orders with your actual source database, schema, and table names.
Adjust dest_catalog and dest_schema to match your governance model.
For Azure, swap the instance types to the Azure equivalents noted in the comments.
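
If the destination differs between environments, one option is to override the bundle variables per target instead of editing the defaults. The sketch below assumes two targets named `dev` and `prod` and a catalog named `production`; all three names are hypothetical and should be replaced with your own.

```yaml
targets:
  dev:
    default: true
    # No overrides: falls back to the variable defaults
    # (dest_catalog: development, dest_schema: c360_source).
  prod:
    variables:
      dest_catalog: production    # hypothetical catalog name
      dest_schema: c360_source
```

With a layout like this, `databricks bundle deploy -t prod` would deploy using the `prod` values, while a plain `databricks bundle deploy` would use the default target.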

3. Deploy the pipeline

databricks bundle deploy

4. Start the pipeline

databricks bundle run pipeline_postgresql

Verify

In the Databricks workspace, navigate to Jobs & Pipelines (Workflows > Delta Live Tables in older workspace UIs).
Confirm both the gateway pipeline and the ingestion pipeline show a Running or Completed status.
Navigate to Catalog > the target catalog and schema. Confirm the ingested tables appear with data.

Troubleshoot

Gateway pipeline fails to start

The gateway runs on classic compute in your own cloud network (a VPC on AWS, a VNet on Azure). Verify the cluster configuration (instance types, autoscale settings) and confirm that the workspace has permission to launch clusters with those instance types.

Ingestion pipeline cannot connect to the source

The gateway cluster must have network access to the source database. Verify VPC or VNet peering, security group rules, and firewall settings. Also confirm that the connection object holds valid credentials.

Permission error when creating the pipeline

The service principal or user running databricks bundle deploy needs workspace-admin access or explicit permission to create pipelines. Verify permissions in the workspace admin settings.

Do next: Build the first pipeline
Learn why: Unity Catalog foundations
Reference: Lakeflow Connect — Databricks docs
