Skip to main content

GCP — GCS

You'll create a storage credential and external location to connect Databricks to GCS in ~10 min.

Prereqs: Cloud Object Storage overview, GCP project with Owner permissions

What you'll build

A Google Cloud Storage bucket connected to Databricks through a Unity Catalog storage credential and external location.

Prerequisites

  • A GCP project with Owner or Storage Admin permissions.
  • A Databricks workspace with metastore-admin or account-admin privileges.

Steps

1. Follow the official guide

The Databricks documentation covers the full workflow for creating a GCS storage credential and external location:

Connect to a Google Cloud Storage (GCS) external location

The guide walks through creating a service account, granting it access to your GCS bucket, registering the storage credential in Databricks, and creating the external location.

2. Mark the external location as read-only

After the external location is created, mark it as read-only. This prevents any Databricks workload from writing to the storage path, protecting your source data from accidental modifications.

Follow the guide: Mark an external location as read-only.

warning

Skipping this step leaves the external location writable by any principal with write grants. Always set it to read-only unless your pipeline explicitly needs to write back to this path.

Verify

  1. In the Databricks workspace, navigate to Catalog > External Data > External Locations.
  2. Click the new external location and click Test Connection.
  3. Confirm the test returns a success status.

Troubleshoot

Test Connection fails with permission denied

Verify the service account used in the storage credential has the storage.objectAdmin role (or equivalent) on the target GCS bucket.

Storage credential creation fails

Confirm the service account key or workload identity federation is configured correctly. The service account must be in the same GCP project as the Databricks workspace.

Next