13. CI/CD and DevOps

In about 15 minutes, you'll learn how to split DevOps responsibilities between Terraform and Declarative Automation Bundles (DABs), and how to run CI/CD for Databricks projects.

Prereqs: 7. Build the first pipeline: DABs; 8. Automation & orchestration: DABs; 3. Infra setup

Why this matters

Draw one line and most of the confusion goes away: anything outside the workspace is Terraform's job, anything inside it is DABs' job. Get that line wrong and you end up with brittle scripts, environments that drift apart, and a deploy only one person knows how to run.

Clicking changes into the Workspace by hand works until it doesn't. The moment a second person needs to ship, or you need staging to match prod, you want repeatable deploys, a review gate, and a way to promote a change from dev to prod. That is the same discipline that keeps application code sane, applied to data work.

The two layers

Databricks DevOps is two separate concerns, each with its own tool and lifecycle.

Layer	What it covers	Tool	Example
Platform infrastructure (external, AWS and Databricks account related)	Accounts, networks, metastores, workspaces, and IAM: anything that lives outside a Databricks workspace	Terraform	Terraform Examples
Databricks projects (internal, within the workspace)	Jobs, pipelines, schemas, dashboards, and other assets a Databricks project needs	Declarative Automation Bundles (DABs)	Build your first pipeline with DABs; Bundle configuration examples

What is DABs?

Declarative Automation Bundles (DABs) is infrastructure-as-code for Databricks. The simplest way to think about it: Databricks as code. Project assets live in Git as YAML and source files instead of as clicks someone made in the Workspace and hoped to remember.
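To make "Databricks as code" concrete, here is a minimal sketch of what a bundle's `databricks.yml` might look like. Every name, path, and host in it is a placeholder invented for illustration, not something taken from this guide. It also assumes the workspace can run the notebook task on serverless compute, since no cluster is defined.

```yaml
# databricks.yml: minimal illustrative bundle (all names and hosts are placeholders)
bundle:
  name: marketing_c360

resources:
  jobs:
    nightly_refresh:              # hypothetical job key
      name: nightly_refresh
      tasks:
        - task_key: refresh
          notebook_task:
            # Path is relative to this file; the notebook is assumed to exist.
            notebook_path: ./src/refresh.ipynb
          # No cluster is defined here, which assumes serverless job compute
          # is available in the target workspace.

targets:
  dev:
    mode: development             # per-user, prefixed copies of resources
    default: true
    workspace:
      host: https://dev-workspace.example.com
  prod:
    mode: production
    workspace:
      host: https://prod-workspace.example.com
      # A fixed root path so only one copy of the bundle is deployed.
      root_path: /Workspace/Shared/.bundle/${bundle.name}/${bundle.target}
```

The same file drives both environments: `databricks bundle deploy -t dev` and `databricks bundle deploy -t prod` differ only in the target they select. Real projects usually also set a `run_as` identity for production; it is left out here to keep the sketch short.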

IaC for the workspace. DABs defines jobs, pipelines, schemas, and related resources in config files. The bundle is what you deploy: one repo, one project, several environment targets.
A software-engineering workflow. A Databricks project lives the same way application code does, with branches, pull requests, and CI/CD. Your team picks the branching model. Default to trunk-based development, since it keeps main deployable and spares you long-lived branches that fight to merge.
CLI-native. DABs ships with the Databricks CLI. Install the CLI on a CI/CD runner such as a GitHub Actions worker in one step, then run databricks bundle deploy from the pipeline.
CI/CD in practice. A typical pipeline validates the bundle, runs tests, and deploys to a target workspace. See Create a GitHub Actions workflow for CI/CD for a full example.
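The validate-then-deploy flow described above could be sketched as a GitHub Actions workflow like the one below. Treat it as an assumption-laden outline rather than the reference workflow. The file name, job names, secret names (`DEV_SP_TOKEN`, `PROD_SP_TOKEN`), and the `dev` and `prod` target names are all hypothetical. It also assumes each target in `databricks.yml` already sets its workspace host. The test step is left out because this guide does not name a test runner.

```yaml
# .github/workflows/bundle-ci.yml (hypothetical file name)
name: bundle-ci

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

jobs:
  validate:
    # Review gate: check the bundle on every pull request.
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    env:
      DATABRICKS_TOKEN: ${{ secrets.DEV_SP_TOKEN }}   # assumed secret name
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main               # installs the Databricks CLI
      - run: databricks bundle validate -t dev

  deploy:
    # Promotion: deploy to prod once the change lands on main.
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    env:
      DATABRICKS_TOKEN: ${{ secrets.PROD_SP_TOKEN }}  # assumed secret name
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle deploy -t prod
```

Splitting validation and deployment into separate jobs keeps the production credential out of pull-request runs, which fits the trunk-based default: main stays deployable and every merge triggers one deploy.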

DABs in the Workspace

DABs in VSCode

Start at Minute 20:50

GitHub code examples

Repo	What's inside
bundle-examples / knowledge_base	Official Databricks reference library. Covers Genie Agents, Metric Views, Apps, Lakebase, Jobs, Pipelines, Models, Model Serving Endpoints, and Vector Search indexes. Good first stop for any bundle pattern.
databricks-dab-examples / flights	End-to-end worked example built around a flights dataset. Comes in three tiers (simple, advanced, bundle template) so you can follow the progression from a minimal bundle to a production-ready project.
databricks-dab-examples / knowledge-base	Solutions-team reference examples. Includes an Azure DevOps CI/CD pipeline, a React + Lakebase app, metric views, a uv-managed bundle, and a DAIS 2024 modular orchestration template.

How to migrate existing Workspace assets to DABs?

The following items are covered in the video:

Create the DABs project and base file structure.
Migrate workspace assets to DABs.
Create a base CI/CD pipeline for your preferred DevOps tool.

warning

The Genie Code skill presented here is not an official Databricks-supported tool. Validate generated bundles in a non-production workspace before you rely on them in CI/CD.

Monolithic-repo or multiple repos?

Use one Git repo per Databricks project: one bundle, one deployment boundary, one owning team.

A single repo that holds every Databricks project in the org looks tidy. It isn't. The cost shows up as soon as a second team starts committing to it:

❌ Merge conflicts pile up when unrelated teams touch shared folders, CI configs, or bundle targets.
❌ CI gets slow and noisy. A change to one team's pipeline kicks off validation for every project in the repo.
❌ Ownership blurs. When a deploy fails, there is no clear owner, and a rollback drags in assets another team never touched.
❌ Release cadences collide. Team A can't ship a hotfix while Team B is sitting on a long-running feature branch that's holding main.

Example

At Awesome123 corp, two teams kick off separate Databricks projects at the same time. Each gets its own repo, bundle, and CI/CD pipeline.

Team	Project	Repo	Assets
Data Engineering	Marketing C360	`databricks-marketing-c360`	Jobs, pipelines, schemas, SQL warehouses, dashboards
Data Science	Finance revenue prediction	`databricks-finance-revenue-prediction`	Training jobs, registered models, serving endpoints, dashboards

A pipeline change in marketing C360 does not trigger CI for the finance ML project. Each team ships on its own schedule.

When to use / when not to

Situation	Use
Provision workspaces, networks, or cloud IAM	Terraform
Platform settings must match across accounts or regions	Terraform
Deploy Databricks projects (jobs, pipelines, notebooks, schemas)	DABs
Changes need review before reaching production	DABs
One-off notebook or prototype, single owner, nothing downstream depends on it	Neither

Next

Do next: Build your first pipeline with DABs
Learn why: Orchestration with DABs
Reference: CI/CD on Databricks
