Azure - ADLS
Connect to data in Azure Data Lake Storage (ADLS) from Databricks.
Step-by-step guide
How to Create a Storage Credential in Azure
Step 1: Create New Access Connector
- Navigate to the Azure portal
- In the search bar, type Access Connector for Azure Databricks
- Select "Access Connector for Azure Databricks" from the search results

- Click the "Create" button to start the Access Connector creation process

- Complete the configuration form with the following required information:
- Subscription: Select the Azure subscription where the access connector will be deployed
- Resource group: Choose the resource group for the access connector
- Name: Provide a descriptive name for the access connector
- Region: Select the same region as your Databricks workspace for optimal performance
- Click "Review + create" to validate your configuration

- After validation completes successfully, click "Create" to deploy the access connector

- Once deployment is complete, navigate to the newly created Access Connector resource
- Copy the Resource ID from the resource overview page (you'll need this for Databricks configuration)
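Before pasting the Resource ID into Databricks, it can help to confirm that the copied value has the expected shape. The sketch below assumes the standard Azure Resource Manager layout for an Access Connector ID; the function name and the example values are hypothetical placeholders, not values from this guide.

```python
import re

# Assumed layout of an Access Connector resource ID (standard ARM path).
_CONNECTOR_ID = re.compile(
    r"^/subscriptions/(?P<subscription>[^/]+)"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.Databricks/accessConnectors/(?P<name>[^/]+)$",
    re.IGNORECASE,
)


def parse_access_connector_id(resource_id: str) -> dict:
    """Split a copied Resource ID into its parts, or raise ValueError."""
    match = _CONNECTOR_ID.match(resource_id.strip())
    if match is None:
        raise ValueError(f"Not an Access Connector resource ID: {resource_id!r}")
    return match.groupdict()


# Placeholder values for illustration only.
example_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/example-rg"
    "/providers/Microsoft.Databricks/accessConnectors/example-connector"
)
print(parse_access_connector_id(example_id))
```

A ValueError here usually means the wrong field was copied (for example, the resource name instead of the full Resource ID).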

Step 2: Configure storage credential in Databricks
- Open your Databricks workspace and navigate to the "Catalog" section from the left sidebar

- Click on "External Data" to access external data configuration options

- Navigate to the "Credentials" tab and click "Create credential"

- Complete the credential configuration form:
- Credential Name: Provide a descriptive name for the storage credential
- Authentication Type: Select "Azure Managed Identity"
- Access Connector ID: Paste the Resource ID copied at the end of Step 1
- Description: Optional description for documentation purposes
- Click "Create" to establish the storage credential
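The same form fields can also be expressed as a request body for automation. The sketch below only builds the JSON payload and sends nothing. It assumes the Unity Catalog REST endpoint for storage credentials (POST /api/2.1/unity-catalog/storage-credentials) and its azure_managed_identity field, so check both against the API reference for your workspace. The function name and all example values are hypothetical.

```python
import json


def storage_credential_payload(name: str, access_connector_id: str, comment: str = "") -> dict:
    """Mirror the UI form: credential name, managed identity, optional description."""
    payload = {
        "name": name,
        # Assumed field name for the "Azure Managed Identity" authentication type.
        "azure_managed_identity": {"access_connector_id": access_connector_id},
    }
    if comment:
        payload["comment"] = comment
    return payload


# Placeholder values for illustration only.
body = storage_credential_payload(
    name="example-credential",
    access_connector_id=(
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/example-rg"
        "/providers/Microsoft.Databricks/accessConnectors/example-connector"
    ),
    comment="Managed identity credential for ADLS",
)
print(json.dumps(body, indent=2))
```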
External Locations
An external location defines a secure path to your data stored in Azure cloud object storage. It consists of three components: storage account, container, and folder path.
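To make the three components concrete, the sketch below splits an abfss:// URL into storage account, container, and folder path using only the Python standard library. The function name and the example URL are hypothetical.

```python
from urllib.parse import urlsplit

_DFS_SUFFIX = ".dfs.core.windows.net"


def split_abfss_url(url: str) -> dict:
    """Return the storage account, container, and folder path of an abfss URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.scheme != "abfss" or not parts.username or not host.endswith(_DFS_SUFFIX):
        raise ValueError(f"Not an abfss URL: {url!r}")
    return {
        "storage_account": host[: -len(_DFS_SUFFIX)],
        "container": parts.username,          # the part before "@"
        "folder_path": parts.path.lstrip("/"),  # may be empty
    }


# Placeholder URL for illustration only.
print(split_abfss_url("abfss://raw@examplestorage.dfs.core.windows.net/bronze/sales"))
```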
Key Concepts
- Multiple locations: You can configure multiple external locations within your metastore
- Granular permissions: Each external location can have different access permissions
- Data isolation: External locations enable you to organize data by environment, business unit, or region
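As an illustration of granular permissions, the sketch below generates Databricks SQL GRANT statements that give different groups different privileges on an external location. It only builds strings; the location and group names are hypothetical, and the privilege list is an assumption limited to the privileges most commonly used on external locations.

```python
# Privileges assumed to be grantable on external locations in Unity Catalog.
ALLOWED_PRIVILEGES = {"READ FILES", "WRITE FILES", "CREATE EXTERNAL TABLE"}


def grant_sql(privilege: str, location: str, principal: str) -> str:
    """Build one GRANT statement for an external location."""
    if privilege not in ALLOWED_PRIVILEGES:
        raise ValueError(f"Unsupported privilege: {privilege!r}")
    # Backticks allow hyphens in location and principal names.
    return f"GRANT {privilege} ON EXTERNAL LOCATION `{location}` TO `{principal}`;"


# Hypothetical access plan: analysts read, engineers write.
plan = [
    ("READ FILES", "raw-data-location", "analysts"),
    ("WRITE FILES", "raw-data-location", "data-engineers"),
]
for privilege, location, principal in plan:
    print(grant_sql(privilege, location, principal))
```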
Storage Account Requirements
Your Azure storage account must meet these requirements:
- Hierarchical namespace: Must be enabled
- Azure Data Lake Storage Gen2: Required for Unity Catalog integration
Planning Your Storage Strategy
You have two options for storage accounts:
- Use existing storage account: If you already have a compliant storage account
- Create new storage account: Recommended for new implementations or specific isolation requirements
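When deciding whether an existing storage account is compliant, one option is to inspect the JSON returned by `az storage account show` and look at its `isHnsEnabled` property. The sketch below performs that check on an already-loaded dictionary; the sample document is a trimmed, hypothetical example rather than real output.

```python
import json


def has_hierarchical_namespace(account: dict) -> bool:
    """True only when the account explicitly reports hierarchical namespace enabled."""
    return account.get("isHnsEnabled") is True


# Trimmed, hypothetical example of the JSON describing a storage account.
sample = json.loads('{"name": "examplestorage", "kind": "StorageV2", "isHnsEnabled": true}')

if has_hierarchical_namespace(sample):
    print(f"{sample['name']}: hierarchical namespace enabled, usable as-is")
else:
    print(f"{sample['name']}: hierarchical namespace disabled, plan a new account")
```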
How to Create a New Storage Account
- Navigate to the Azure portal
- In the search bar, type Storage Account
- Select "Storage Account" from the search results
- Click "Create" to begin the storage account creation process

- Complete the Basics configuration with the following settings:
- Subscription: Select the same subscription as your Databricks workspace
- Resource group: Choose an appropriate resource group (preferably the same as your workspace)
- Storage account name: Provide a globally unique name (3 to 24 characters, lowercase letters and numbers only)
- Region: Important - Select the same region as your Databricks workspace for optimal performance
- Performance: Standard or Premium (Standard is sufficient for most use cases)
- Redundancy: Choose based on your data durability requirements (LRS, ZRS, GRS, or GZRS)
- Click "Next" to proceed to advanced settings

- In the Advanced settings tab, configure the following critical settings:
- Hierarchical namespace: Enable this option (required for Unity Catalog)
- Access tier: Hot (recommended for frequently accessed data)

- Review your configuration and click "Create" to deploy the storage account
- Once deployment completes, navigate to your new storage account resource
- Create a container for your data organization following the steps below:

- Navigate into your newly created container
- Click "+ Add Directory" to create folders for your data, and specify a directory name that reflects your data organization strategy (e.g., "bronze", "silver", "gold" for a medallion architecture)
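For repeatable setups, the portal settings above can be mirrored with the Azure CLI. The sketch below validates the account name and assembles an `az storage account create` argument list without running it. The function name, resource group, and region are hypothetical, and `Standard_LRS` is an assumed default that stands in for the Standard performance tier with LRS redundancy.

```python
import re
import shlex


def storage_account_create_args(name: str, resource_group: str, region: str,
                                sku: str = "Standard_LRS") -> list:
    """Build (but do not run) the CLI call for an ADLS Gen2 storage account."""
    # Azure requires 3 to 24 characters, lowercase letters and digits only.
    if not re.fullmatch(r"[a-z0-9]{3,24}", name):
        raise ValueError(f"Invalid storage account name: {name!r}")
    return [
        "az", "storage", "account", "create",
        "--name", name,
        "--resource-group", resource_group,
        "--location", region,
        "--sku", sku,
        "--kind", "StorageV2",
        "--enable-hierarchical-namespace", "true",  # required for Unity Catalog
        "--access-tier", "Hot",
    ]


# Placeholder values for illustration only.
args = storage_account_create_args("examplestorage", "example-rg", "westeurope")
print(shlex.join(args))
```

Returning a list rather than a single string keeps each argument separate, so it can be passed to `subprocess.run` without shell quoting issues.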

Assign Permissions to Access Connector
This step grants your Databricks Access Connector the necessary permissions to read and write data in your storage account.
Access Storage Account IAM Settings
- In the Azure portal, navigate to the storage account you created in the previous section
- In the left sidebar, click on "Access control (IAM)"
- Click "+ Add" and then select "Add role assignment"

Select Storage Blob Data Contributor Role
- In the role assignment wizard:
- Search for "Storage Blob Data Contributor" in the role search bar
- Select this role and click "Next"

Assign Role to Access Connector
- In the Members section:
- Select "Managed identity" as the assignment type
- Click "+ Select members"
- Search for and select your Access Connector
- Click "Select" and then "Review + assign"
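The same role assignment can be scripted. The sketch below assembles an `az role assignment create` argument list scoped to the storage account, without running it. It assumes you have looked up the principal (object) ID of the Access Connector's managed identity. The function names and all IDs are hypothetical placeholders.

```python
import shlex


def storage_account_scope(subscription_id: str, resource_group: str, account_name: str) -> str:
    """ARM resource ID of the storage account, used as the role assignment scope."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}"
    )


def role_assignment_args(principal_id: str, scope: str) -> list:
    """Build (but do not run) the CLI call granting Storage Blob Data Contributor."""
    return [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_id,
        "--assignee-principal-type", "ServicePrincipal",  # managed identities use this type
        "--role", "Storage Blob Data Contributor",
        "--scope", scope,
    ]


# Placeholder values for illustration only.
scope = storage_account_scope(
    "00000000-0000-0000-0000-000000000000", "example-rg", "examplestorage"
)
print(shlex.join(role_assignment_args("11111111-1111-1111-1111-111111111111", scope)))
```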

How to Create External Locations in Databricks
Access External Data Configuration
- Open your Databricks workspace
- Navigate to "Catalog" in the left sidebar
- Click on "External Data" to access external location management

Create New External Location
- Navigate to the "External Locations" tab
- Click "Create external location" to begin the configuration process

Configure External Location Settings
- Complete the external location configuration form with the following information:
- External Location Name: Provide a descriptive name (e.g., "raw-data-location")
- Storage Type: Select "Azure Data Lake Storage Gen2"
- URL: Use the format abfss://<container>@<storage_account>.dfs.core.windows.net/<folder_path>
- Replace <container> with your container name
- Replace <storage_account> with your storage account name
- Replace <folder_path> with your directory path (optional)
- Storage credential: Select the storage credential created in the previous section
- Comments: Optional description for documentation purposes
- Click "Create" to establish the external location
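As an alternative to the form, an external location can be created with a Databricks SQL statement. The sketch below builds the abfss URL from its parts and then assembles a CREATE EXTERNAL LOCATION statement as a string. The function names, container, storage account, and credential names are hypothetical; run the generated statement in a SQL editor or notebook attached to a Unity Catalog-enabled workspace.

```python
def abfss_url(container: str, storage_account: str, folder_path: str = "") -> str:
    """Fill in abfss://<container>@<storage_account>.dfs.core.windows.net/<folder_path>."""
    base = f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
    return base + folder_path.strip("/")


def create_external_location_sql(name: str, url: str, credential: str, comment: str = "") -> str:
    """Build the SQL equivalent of the external location form."""
    sql = (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS `{name}` "
        f"URL '{url}' "
        f"WITH (STORAGE CREDENTIAL `{credential}`)"
    )
    if comment:
        escaped = comment.replace("'", "\\'")  # escape single quotes in the comment
        sql += f" COMMENT '{escaped}'"
    return sql + ";"


# Placeholder values for illustration only.
url = abfss_url("raw", "examplestorage", "bronze")
print(create_external_location_sql("raw-data-location", url, "example-credential", "Raw zone"))
```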

Verify External Location Configuration
- After creation, click "Test Connection" to verify that the external location is configured correctly and accessible
