Cloud Object Storage

Access data in cloud object storage (S3, ADLS, GCS).

UC Storage Credentials and External Locations

To access data stored in your cloud object storage, you need two Unity Catalog objects:

  • Storage Credential:

    • Authenticates your Databricks compute to your cloud object storage.
    • Depending on your cloud provider, it encapsulates:
      • An IAM role (AWS).
      • A managed identity (Azure).
      • A service account (GCP).
  • External Location: Pairs a storage credential with the cloud storage path to your data. Example paths:

    • AWS: s3://mybucket/mydepartment/mydataset/
    • Azure: abfss://mydepartmentcontainer@mystorageaccount.dfs.core.windows.net/mydataset/
    • GCP: gs://mybucket/mydepartment/mydataset/
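
As an illustration of how the two objects fit together, the following is a minimal sketch that builds the example paths above and renders a `CREATE EXTERNAL LOCATION` statement referencing a storage credential. The helper names (`external_location_url`, `create_external_location_sql`) and the object names `my_location` and `my_credential` are hypothetical, and the sketch assumes the storage credential already exists and that you hold the privileges needed to create external locations.

```python
from urllib.parse import urlparse

# URL schemes used in the example paths, keyed by cloud provider.
SCHEMES = {"aws": "s3", "azure": "abfss", "gcp": "gs"}


def external_location_url(cloud: str, container: str, path: str, account: str = "") -> str:
    """Build a storage URL in the same shape as the example paths (hypothetical helper)."""
    scheme = SCHEMES[cloud]
    path = path.strip("/")
    if cloud == "azure":
        # ADLS addresses a container inside a storage account.
        return f"{scheme}://{container}@{account}.dfs.core.windows.net/{path}/"
    # S3 and GCS address a bucket directly.
    return f"{scheme}://{container}/{path}/"


def create_external_location_sql(name: str, url: str, credential: str) -> str:
    """Render a CREATE EXTERNAL LOCATION statement (hypothetical helper)."""
    if urlparse(url).scheme not in SCHEMES.values():
        raise ValueError(f"unsupported storage URL: {url}")
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS `{name}` "
        f"URL '{url}' "
        f"WITH (STORAGE CREDENTIAL `{credential}`)"
    )


url = external_location_url("aws", "mybucket", "mydepartment/mydataset")
statement = create_external_location_sql("my_location", url, "my_credential")
print(statement)
```

In a Databricks notebook, the resulting string could be passed to `spark.sql(statement)`; whether it succeeds depends on your metastore configuration and permissions.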

More detailed information can be found in the official Databricks documentation:

Create storage credentials and external locations

  • AWS – Access data in Amazon S3.
  • Azure – Access data in Azure Data Lake Storage (ADLS).
  • GCP – Access data in Google Cloud Storage (GCS).