
Create connection

info

Create a managed connection to external data sources.

A Unity Catalog connection serves two purposes:

  1. Create a foreign catalog to push-down queries from Databricks.
    • For use cases where just a small subset of the external data source is required.
  2. Create a managed ingestion pipeline into Unity Catalog using Lakeflow Connect.
    • For use cases that require continuous CDC ingestion from the external data source into Databricks.
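The first purpose can also be expressed in SQL once a connection exists. The sketch below only builds a `CREATE FOREIGN CATALOG` statement as a string. The catalog, connection, and database names are hypothetical, and the name check is a deliberately conservative assumption rather than the full Databricks identifier rules.

```python
import re

# Conservative identifier check (an assumption for this sketch):
# letters, digits, and underscores only.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _checked(name: str) -> str:
    """Return name unchanged if it is a simple identifier, else raise."""
    if not _NAME.fullmatch(name):
        raise ValueError(f"unsupported name: {name!r}")
    return name


def foreign_catalog_sql(catalog: str, connection: str, database: str) -> str:
    """Build a CREATE FOREIGN CATALOG statement for an existing connection."""
    return (
        f"CREATE FOREIGN CATALOG IF NOT EXISTS {_checked(catalog)} "
        f"USING CONNECTION {_checked(connection)} "
        f"OPTIONS (database '{_checked(database)}')"
    )


# Hypothetical names, for illustration only.
stmt = foreign_catalog_sql("sales_pg", "pg_sales_conn", "sales")
print(stmt)
# In a Databricks notebook you would then run: spark.sql(stmt)
```

After the statement runs, tables in the external database should be addressable with three-level names (catalog, schema, table), and filters in a query are pushed down to the source where the connector supports it.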


Step-by-step tutorial

  1. Navigate to the Databricks workspace.

  2. In the left sidebar, click "Catalog", then click "External Data".

warning
  • Click on the ⚙️ icon to access connections.
  • The screenshot needs an update.
[Screenshot: external data button]

  3. Click "Connections".
[Screenshot: add connection button]

  4. Configure your connection:

    • Connection name: Enter a descriptive name for your data source connection.
    • Connection type: Select the appropriate connector for your external system (e.g., PostgreSQL, MySQL, SQL Server, or Snowflake).
    • Supported connections: See the Databricks Lakehouse Federation documentation.
  5. Configure network connectivity:

    • Ensure that network connectivity is set up between the Databricks VPC / VNet and your external data source.
    • Configure appropriate security groups, VPC settings, or firewall rules.
    • For on-premises systems, you may need VPN or dedicated network connections.
    • Resources:
  6. Test and create the connection:

    • Click "Test Connection" to verify connectivity.
    • Once the test succeeds, click "Create" to save your external data connection.
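The UI steps above have a rough scripted counterpart. The sketch below shows two independent helpers: a plain TCP reachability check that can help diagnose the network step, and a builder for a `CREATE CONNECTION` statement that reads credentials from a Databricks secret scope instead of inlining them. The host, scope, and key names are hypothetical, the validation patterns are assumptions made for this sketch, and the TCP check is only meaningful when run from compute inside the same network as the workspace (for example, a notebook attached to a cluster).

```python
import re
import socket

# Validation patterns are assumptions for this sketch, not Databricks rules.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_HOST = re.compile(r"[A-Za-z0-9.-]+")
_SECRET = re.compile(r"[A-Za-z0-9_.-]+")


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def _check(pattern: re.Pattern, value: str, label: str) -> str:
    """Return value unchanged if it fully matches pattern, else raise."""
    if not pattern.fullmatch(value):
        raise ValueError(f"unsupported {label}: {value!r}")
    return value


def create_connection_sql(name: str, conn_type: str, host: str, port: int,
                          secret_scope: str, user_key: str,
                          password_key: str) -> str:
    """Build a CREATE CONNECTION statement that pulls credentials from secrets."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    scope = _check(_SECRET, secret_scope, "secret scope")
    user = _check(_SECRET, user_key, "secret key")
    password = _check(_SECRET, password_key, "secret key")
    return (
        f"CREATE CONNECTION IF NOT EXISTS {_check(_IDENT, name, 'name')} "
        f"TYPE {_check(_IDENT, conn_type, 'type')}\n"
        f"OPTIONS (\n"
        f"  host '{_check(_HOST, host, 'host')}',\n"
        f"  port '{port}',\n"
        f"  user secret('{scope}', '{user}'),\n"
        f"  password secret('{scope}', '{password}')\n"
        f")"
    )


# Hypothetical values, for illustration only.
stmt = create_connection_sql(
    "pg_sales_conn", "postgresql", "db.example.com", 5432,
    "my_scope", "pg_user", "pg_password",
)
print(stmt)
# Before running spark.sql(stmt) in a notebook, a quick network check could be:
#     can_reach("db.example.com", 5432)
```

A successful TCP check shows only that the port is reachable. It does not validate credentials or TLS settings, so the "Test Connection" button remains the authoritative check.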