Tags and attribution
You'll tag classic compute and configure serverless usage policies so
system.billing.usage rolls up by team or project in ~25 min.
Prereqs: Infra setup
What you'll walk away with
Consistent custom_tags on your billable rows. Classic clusters, warehouses, and pools get tagged with custom tags. Serverless notebooks, jobs, Lakeflow pipelines, and serving endpoints get tagged through serverless usage policies (the docs also call these budget policies, and the billing column is usage_metadata.budget_policy_id).
Prerequisites
- Workspace admin for tagging compute and creating serverless usage policies.
- Account admin for workspace-level tags through the Account API and for tag-aware budgets on Budget alerts.
Tags apply from creation forward. Historical rows stay untagged. Start early if you need chargeback.
Serverless compute
Public Preview: serverless usage policies
Serverless notebooks, jobs, Lakeflow pipelines, and model serving pick up tags from policies, not from cluster tags.
- Avatar → Settings → Compute.
- Next to Serverless usage policies, click Manage.
- Create → name the policy → add tag pairs (for example team:data-engineering).
- Permissions → Grant access → assign User or Manager roles.
If a user has one policy assigned, it auto-attaches. With more than one, they pick at creation time. If they pick nothing, the UI may default to the first policy alphabetically. Either way, only new usage gets the tags.
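The attachment rules above can be sketched as a small decision function. This is an illustration only: `resolve_policy` is a hypothetical helper that calls no Databricks API, the alphabetical fallback is modeled as case-insensitive (an assumption), and returning `None` when no policy is assigned is also an assumption.

```python
from typing import Optional, Sequence


def resolve_policy(assigned: Sequence[str], picked: Optional[str] = None) -> Optional[str]:
    """Return the policy name expected to attach to NEW serverless usage.

    Hypothetical helper that mirrors the rules described in the text.
    """
    if not assigned:
        # Assumption: with no policy assigned, nothing attaches.
        return None
    if len(assigned) == 1:
        # A single assigned policy auto-attaches.
        return assigned[0]
    if picked is not None:
        if picked not in assigned:
            raise ValueError(f"{picked!r} is not assigned to this user")
        # With several policies, the user's choice at creation time wins.
        return picked
    # Nothing picked: the UI may default to the first policy alphabetically.
    return sorted(assigned, key=str.lower)[0]


print(resolve_policy(["team-data-eng"]))
print(resolve_policy(["team-ml", "team-bi"]))
print(resolve_policy(["team-ml", "team-bi"], picked="team-ml"))
```

Because the fallback is only something the UI may do, treat an unpicked policy as a risk to attribution and have users choose explicitly.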
Classic compute
Custom tags apply to clusters, SQL warehouses, pools, and job compute (GA).
Cluster: Compute → cluster → Edit → Advanced options → Tags → add keys and values → confirm or restart.
SQL warehouse: SQL Warehouses → warehouse → Edit → Tags → save.
Pools and jobs: use the Pools UI or Jobs compute tags. Bundles allow up to 25 tags per job definition.
Workspace tags: account admins only. PATCH workspaces with custom_tags through the Account API.
The default tags (Vendor, ClusterId, ClusterName, Creator, RunName, and JobId on job compute) stay automatic.
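For the workspace-tag step, here is a minimal sketch of building the PATCH request in Python without sending it. The host, the `/api/2.0/accounts/{account_id}/workspaces/{workspace_id}` path, and every ID and token value are assumptions for illustration. Check the Account API reference for your cloud before using them.

```python
import json
import urllib.request


def build_workspace_tag_request(account_host, account_id, workspace_id, token, tags):
    """Build (but do not send) a PATCH request that sets workspace custom_tags.

    The URL layout is an assumption; verify it against the Account API docs.
    """
    url = f"{account_host}/api/2.0/accounts/{account_id}/workspaces/{workspace_id}"
    body = json.dumps({"custom_tags": tags}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values, not real identifiers.
req = build_workspace_tag_request(
    "https://accounts.cloud.databricks.com",
    "00000000-0000-0000-0000-000000000000",
    "1234567890",
    "EXAMPLE_TOKEN",
    {"team": "data-engineering"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen(req)` would send it. The sketch stops short of that so you can inspect the payload first.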
Video: Tagging clusters for cost attribution. The walkthrough says Azure in the title, but the same tagging flow applies on AWS and GCP workspaces.
Compute policies can require tags at cluster creation (Compute → Policies). For the policy JSON and the limits, see Create and manage compute policies and the policy reference.
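As a rough illustration of requiring tags through a compute policy, the sketch below generates a policy definition in Python. `tag_policy_definition` is a hypothetical helper, and the tag keys and values are made up. The `custom_tags.<key>` attribute path with `fixed` and `allowlist` types follows the policy reference, but confirm the details there.

```python
import json


def tag_policy_definition(required_tags):
    """Map each tag key to a policy rule.

    required_tags: dict of tag key -> list of allowed values.
    One value becomes a "fixed" rule; several become an "allowlist".
    """
    definition = {}
    for key, values in required_tags.items():
        if not values:
            raise ValueError(f"tag {key!r} needs at least one allowed value")
        attribute = f"custom_tags.{key}"
        if len(values) == 1:
            definition[attribute] = {"type": "fixed", "value": values[0]}
        else:
            definition[attribute] = {"type": "allowlist", "values": list(values)}
    return definition


# Example tag keys and values are illustrative only.
policy = tag_policy_definition({
    "team": ["data-engineering", "analytics"],
    "env": ["prod"],
})
print(json.dumps(policy, indent=2))
```

The printed JSON is what you would paste into the policy definition editor. A fixed rule forces one value, while an allowlist lets users choose among approved values.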
Limits and cloud rules
- Allowed characters: letters, digits, and + - = . , _ : @. No spaces, no /.
- Up to 20 custom tags per workspace-managed compute resource. Bundles extend jobs separately.
- Do not use the reserved key Name for custom tags.
- Cluster tag edits often need a restart to reach the cloud instances. Workspace tags can lag up to one hour.
- Pool workloads propagate workspace and pool tags to the cloud VMs. Cluster-only tags still show up in Databricks billing.
- A key that matches a default key may gain an x_ prefix in the cloud. A policy conflict can hard-fail cluster creation instead.
GCP labels are more restrictive (length, lowercase). Expect truncation on email-like values.
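The limits above are easy to check before tags reach a cluster definition. The following is a minimal pre-flight sketch, not an official validator: `validate_tags` is a hypothetical helper, "letters" is read as ASCII letters, and empty keys or values are flagged (both assumptions). It does not model the stricter GCP label rules.

```python
import re

# Character set from the limits list: letters, digits, and + - = . , _ : @
ALLOWED = re.compile(r"[A-Za-z0-9+\-=.,_:@]+")
MAX_CUSTOM_TAGS = 20
RESERVED_KEYS = {"Name"}


def validate_tags(tags: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if len(tags) > MAX_CUSTOM_TAGS:
        problems.append(f"{len(tags)} tags exceeds the limit of {MAX_CUSTOM_TAGS}")
    for key, value in tags.items():
        if key in RESERVED_KEYS:
            problems.append(f"key {key!r} is reserved")
        for label, text in (("key", key), ("value", value)):
            # fullmatch rejects spaces, slashes, and (by assumption) empty strings.
            if not ALLOWED.fullmatch(text):
                problems.append(f"{label} {text!r} contains disallowed characters")
    return problems


print(validate_tags({"team": "data-engineering"}))
print(validate_tags({"cost center": "a/b"}))
```

Running a check like this in CI for bundle or Terraform definitions catches bad keys earlier than a failed cluster launch would.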
Example queries
Cost by team tag:
SELECT
  custom_tags['team'] AS team,
  SUM(u.usage_quantity * lp.pricing.effective_list.default) AS estimated_cost_usd
FROM system.billing.usage u
JOIN system.billing.list_prices lp
  ON u.sku_name = lp.sku_name
  AND u.cloud = lp.cloud
  AND u.usage_start_time >= lp.price_start_time
  AND (u.usage_end_time <= lp.price_end_time OR lp.price_end_time IS NULL)
WHERE u.usage_date >= CURRENT_DATE - INTERVAL 30 DAY
GROUP BY 1
ORDER BY estimated_cost_usd DESC;
Untagged classic clusters (gap hunt):
SELECT
  workspace_id,
  sku_name,
  usage_metadata.cluster_id,
  SUM(usage_quantity) AS total_dbus
FROM system.billing.usage
WHERE usage_date >= CURRENT_DATE - INTERVAL 30 DAY
  AND custom_tags['team'] IS NULL
  AND usage_metadata.cluster_id IS NOT NULL
GROUP BY 1, 2, 3
ORDER BY total_dbus DESC;
Serverless usage by budget_policy_id:
SELECT
  usage_metadata.budget_policy_id,
  billing_origin_product,
  SUM(u.usage_quantity * lp.pricing.effective_list.default) AS estimated_cost_usd
FROM system.billing.usage u
JOIN system.billing.list_prices lp
  ON u.sku_name = lp.sku_name
  AND u.cloud = lp.cloud
  AND u.usage_start_time >= lp.price_start_time
  AND (u.usage_end_time <= lp.price_end_time OR lp.price_end_time IS NULL)
WHERE u.usage_date >= CURRENT_DATE - INTERVAL 30 DAY
  AND u.usage_metadata.budget_policy_id IS NOT NULL
GROUP BY 1, 2
ORDER BY estimated_cost_usd DESC;
More patterns: Top 10 queries to use with System Tables.
Verify
- Tag a cluster with test_tag:verification, run some work, wait 2 to 4 hours, then filter system.billing.usage on that map key.
- Assign yourself a serverless usage policy, run serverless work, then confirm budget_policy_id is populated.
- In AWS Cost Explorer (or its equivalent), confirm the propagated tags when classic compute backs the bill.
Where people trip
Tags missing on serverless rows
Classic tags never apply to fully serverless runs. Use a serverless usage policy.
Cluster creation fails inside a policy
Rename the conflicting keys. For example, use x_vendor rather than Vendor, which collides with a default tag.
Cloud billing lacks cluster tags on pooled workloads
Move the tags to the pool or the workspace, or rely on Databricks system.billing.usage for attribution.
Policy never attaches to an old notebook
Policies are not retroactive. Update the notebook compute selector (More…) to pick the policy.
Next
- Do next: Budget alerts
- Learn why: Unity Catalog foundations
- Reference: Use tags to attribute and track usage