
AWS Terraform

You'll deploy a Databricks workspace and catalog on AWS using Terraform in ~20 min.

Prereqs: AWS account, Terraform CLI, Databricks account console

What you'll walk away with

A Databricks workspace with a customer-managed VPC (BYOVPC) on AWS plus Unity Catalog resources, all deployed from Terraform. Pick one template from Terraform resources below. You do not need a separate workspace step and catalog step unless you choose the split templates.

The Workspace + Catalog template always creates a new VPC; it cannot reuse an existing one. The Workspace template can either create a new VPC or use an existing one.

Prerequisites

  • An AWS account with permission to create IAM roles, S3 buckets, and VPC resources.
  • A Databricks account with account-admin privileges.
  • Terraform CLI installed locally.
  • A service principal with workspace-admin access. This is required for the Catalog template, and for attaching to an existing metastore on the combined template.

YouTube walkthrough

The video walks through the Workspace template (aws-byovpc). The flow is the same for Workspace + Catalog: copy tf/terraform.tfvars.example, set your variables, then run terraform init, plan, and apply per the repo README.

Terraform resources

Open the repository for your scenario and follow its README.md. In each repo, run commands from the tf/ directory: copy terraform.tfvars.example to terraform.tfvars, set your values, then run terraform init && terraform apply.

Repeat for development, staging, and production. Use the prefix or resource_prefix variable to name each environment (for example dev, staging, prod, or a business-unit prefix such as finance_dev).
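
As an illustration, a per-environment variables file might look like the sketch below. Only the `resource_prefix` variable (named `prefix` in some templates) comes from the text above. Each template's terraform.tfvars.example lists the variables it actually expects, so treat this as a fragment rather than a complete file.

```hcl
# terraform.tfvars for the development environment (sketch).
# Some templates name this variable `prefix` instead; check the
# template's terraform.tfvars.example to see which one it declares.
resource_prefix = "dev"

# For staging or production, keep a separate copy of this file
# with a different value, for example:
# resource_prefix = "staging"
# resource_prefix = "prod"
```

Each environment also needs its own Terraform state (for example a separate working directory or backend). Otherwise applying with a new prefix would change the previous environment's resources instead of creating new ones.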

If you are new to Databricks, start with Workspace + Catalog (first row).

Terraform template: 💎 Workspace + Catalog
What it creates:
  • AWS VPC
  • Databricks Cross-account IAM
  • Root S3 bucket
  • Databricks Workspace (uses the previous 3 resources)
  • Storage credential
  • External location
  • Catalog (uses the previous 2 resources)
URL: aws-byovpc-uc

Terraform template: Workspace
What it creates:
  • AWS VPC (new or existing)
  • Cross-account IAM
  • Root S3 bucket
  • Databricks workspace
  • Unity Catalog metastore (optional)
URL: aws-byovpc

Terraform template: Catalog
What it creates:
  • S3 buckets
  • IAM roles and policies
  • Unity Catalog catalogs
  • External locations
  • Storage credentials
  • Targets an existing workspace
URL: uc-quickstart/aws

Danger

For the Catalog template only: the Terraform provider must use a workspace-level connection, not account-level. See Authenticating with Databricks-managed Service Principal. The service principal needs workspace-admin access or the apply will fail.

Tip

For production, put catalog storage on separate AWS accounts rather than deploying every catalog from one account.
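
One way to sketch the separate-account idea in Terraform is an aliased AWS provider that assumes a role in the storage account. This is an assumption about how you might structure it, not something the templates are known to support out of the box. All variable names, the alias, and the bucket resource below are hypothetical.

```hcl
# Hypothetical inputs; none of these names come from the templates.
variable "region" {
  type = string
}

variable "catalog_storage_role_arn" {
  type        = string
  description = "IAM role in the separate AWS account that holds catalog storage"
}

variable "catalog_bucket_name" {
  type = string
}

# Aliased provider that operates in the catalog-storage account.
provider "aws" {
  alias  = "catalog_storage"
  region = var.region

  assume_role {
    role_arn = var.catalog_storage_role_arn
  }
}

# Resources opt in to the aliased provider explicitly.
resource "aws_s3_bucket" "catalog" {
  provider = aws.catalog_storage
  bucket   = var.catalog_bucket_name
}
```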

Verify

  1. Log in to the Databricks account console.
  2. Open Workspaces and confirm the new workspace shows Running.
  3. Open the workspace, go to Catalog, and confirm the new catalog appears.

Where people trip

PERMISSION_DENIED: User is not an owner of Metastore while creating catalog

The service principal running Terraform lacks metastore-level permissions. Fix it one of two ways:

  • Option 1: Add the service principal to the metastore admins group.
  • Option 2: Grant just the catalog creation privilege, replacing service_principal_name with the service principal's application ID:
    GRANT CREATE CATALOG ON METASTORE TO `service_principal_name`;
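
If you would rather manage the Option 2 grant in Terraform than in SQL, a minimal sketch using the Databricks provider's `databricks_grant` resource could look like this. It assumes the apply is run by an identity that is already a metastore admin, and the variable names are hypothetical.

```hcl
terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

# Hypothetical inputs; supply your own values.
variable "metastore_id" {
  type = string
}

variable "service_principal_application_id" {
  type = string
}

# Grants only CREATE CATALOG on the metastore to the service principal.
resource "databricks_grant" "sp_create_catalog" {
  metastore  = var.metastore_id
  principal  = var.service_principal_application_id
  privileges = ["CREATE_CATALOG"]
}
```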

Terraform apply fails with VPC or IAM errors

Confirm the AWS credentials Terraform uses can create IAM roles, S3 buckets, and VPC resources. If you are using an existing VPC, check that the subnet CIDRs and security groups match the template requirements.

Workspace created but catalog module cannot connect

The catalog module needs a workspace-level Databricks provider, not an account-level one. Confirm the provider block uses the workspace URL and a service principal with workspace-admin access.
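
For reference, a workspace-level provider block might look like the sketch below. The `host`, `client_id`, and `client_secret` arguments are standard Databricks provider settings for service principal (OAuth) authentication. The variable names are hypothetical, and the Catalog template may already define its own.

```hcl
terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

# Hypothetical variable names; set them in terraform.tfvars or via environment variables.
variable "databricks_workspace_url" {
  type        = string
  description = "URL of the existing workspace the catalog module targets"
}

variable "databricks_client_id" {
  type        = string
  description = "Application (client) ID of the service principal"
}

variable "databricks_client_secret" {
  type      = string
  sensitive = true
}

provider "databricks" {
  # Workspace-level connection: host is the workspace URL, not the
  # account console URL, and no account_id is set.
  host          = var.databricks_workspace_url
  client_id     = var.databricks_client_id
  client_secret = var.databricks_client_secret
}
```

If the block instead points `host` at the account console and sets `account_id`, the provider is account-level, which is the misconfiguration described above.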
