Soft Deletes
Applies To: Pipeline Bundle
Configuration Scope: Data Flow Spec
Databricks Docs:
The Lakeflow Framework provides support for soft deletes when processing change data capture (CDC) events using the apply_changes() function. This allows you to handle DELETE operations in your CDC data while maintaining a history of deleted records.
When soft deletes are configured, deleted rows are temporarily retained as tombstones in the underlying Delta table. A view is created in the metastore that filters out these tombstones, providing a “live” view of current records.
Configuration
Soft deletes are configured with the apply_as_deletes parameter of the apply_changes() function. This parameter specifies the condition under which a CDC event is treated as a DELETE rather than an upsert.
The condition can be specified either as:
A string expression: "Operation = 'DELETE'"
A Spark SQL expression using expr(): expr("Operation = 'DELETE'")
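Putting the pieces together, a pipeline definition using apply_as_deletes might look like the sketch below. It assumes a Databricks pipeline environment where the dlt module is available; the source view name, key column, and sequencing column (customers_cdc_feed, customer_id, sequence_num) are hypothetical placeholders for your own CDC feed.

```python
import dlt
from pyspark.sql.functions import expr

# Target streaming table that apply_changes() will maintain.
dlt.create_streaming_table("customers")

dlt.apply_changes(
    target="customers",            # streaming table created above
    source="customers_cdc_feed",   # hypothetical CDC source view
    keys=["customer_id"],          # hypothetical primary-key column
    sequence_by="sequence_num",    # hypothetical column that orders CDC events
    # Events matching this condition are treated as DELETEs; with soft
    # deletes, the row is kept as a tombstone in the underlying table.
    apply_as_deletes=expr("Operation = 'DELETE'"),
)
```

The string form apply_as_deletes="Operation = 'DELETE'" is equivalent; expr() is useful when you prefer to build the condition as a Spark Column expression.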