3. Infra Setup

In this section you deploy the core Databricks infrastructure: workspaces, identity, and governance.

Why this matters

Do this part in order, and do it before anyone starts loading data. If you skip workspace layout, central identity, or metastore ownership now, you will be untangling them later, with real users and real tables in the way. The pieces below build on each other, so each section assumes the last one is done.

Journey checklist

  • Get started.
  • Before you start.
  • Infra setup.
    • Create workspaces.
    • Add users.
    • Add groups.
    • Change ownership to metastore admins.
    • Activate SSO.
  • Cost monitoring.
  • Data governance strategy.
  • Access your data.
  • Build the first pipeline.
  • Automation and orchestration.
  • Query and explore.
  • Databricks AI/BI.
  • Business semantics.

What you'll set up

Work through these in order:

  • Create Workspaces: deploy workspaces on AWS, Azure, or GCP, by hand or with Terraform.
  • Add Users: register users by hand or through SCIM provisioning.
  • Add Groups: create groups that map to data personas and assign users to them.
  • Metastore Admins: set the metastore admin group and move ownership of Unity Catalog (UC) assets to it.
  • Activate SSO: wire single sign-on to your identity provider.
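
To make the "users first, then groups" ordering concrete, here is a minimal sketch in Python that builds SCIM 2.0 request bodies from a persona-to-member mapping. It is an illustration under assumptions, not the procedure from the pages that follow: the persona names, email addresses, and helper functions are all hypothetical, and no request is sent. Only the two schema URNs come from the SCIM core schema standard. A real SCIM server expects each group member's `value` to be the user id returned when the user was created; the emails here are placeholders for those ids.

```python
import json

# Schema URNs defined by the SCIM 2.0 core schema (RFC 7643).
USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"
GROUP_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:Group"

# Hypothetical mapping of data personas to members (illustrative only).
PERSONA_MEMBERS = {
    "data-engineers": ["ana@example.com", "raj@example.com"],
    "data-analysts": ["li@example.com"],
    "metastore-admins": ["ana@example.com"],
}


def scim_user(email):
    """Build a minimal SCIM User body for one email address."""
    return {"schemas": [USER_SCHEMA], "userName": email}


def scim_group(name, member_ids):
    """Build a minimal SCIM Group body.

    member_ids are placeholders here; a real server expects the ids
    it returned when the users were created.
    """
    return {
        "schemas": [GROUP_SCHEMA],
        "displayName": name,
        "members": [{"value": m} for m in member_ids],
    }


def plan(persona_members):
    """Return (users, groups): every distinct user once, then one group per persona."""
    emails = sorted({e for members in persona_members.values() for e in members})
    users = [scim_user(e) for e in emails]
    groups = [scim_group(name, members) for name, members in persona_members.items()]
    return users, groups


if __name__ == "__main__":
    users, groups = plan(PERSONA_MEMBERS)
    print(json.dumps({"users": users, "groups": groups}, indent=2))
```

Listing users before groups mirrors the order of the steps above: a group can only reference members that already exist, and the metastore admin group has to exist before ownership can move to it.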
