Access Your Data
You'll connect Databricks to your organization's data sources in this section.
Prereqs: Data Governance Strategy
Why this matters
Workspaces, users, groups, and governance are configured. But Databricks cannot process data it cannot reach. This section covers the two paths for connecting to data: direct access to cloud object storage and managed connectors for external systems like databases and SaaS platforms.
Journey checklist
-
Get started. -
Before you start. -
Infra setup. -
Cost monitoring. -
Data Governance Strategy. - Access your data.
- Build the first pipeline.
- Automation and orchestration.
- Query and explore.
- Databricks AI/BI.
- Business semantics.
How it works
Databricks accesses external data through Unity Catalog. Every connection — whether to an S3 bucket or a PostgreSQL database — is registered as a UC object with its own permissions and audit trail.
Two categories cover the most common scenarios:
| Category | Use when |
|---|---|
| Cloud object storage | Your data lives in S3, ADLS, or GCS and you need Databricks to read/write it directly |
| Managed connectors | You need to query or ingest data from external databases, SaaS platforms, or other systems |
Next
- Do next: Cloud object storage
- Learn why: Unity Catalog foundations
- Reference: Connect to data sources — Databricks docs