Logging
=======

.. list-table::
   :header-rows: 0

   * - **Applies To:**
     - :bdg-success:`Pipeline Bundle`
   * - **Configuration Scope:**
     - :bdg-success:`Pipeline`
   * - **Databricks Docs:**
     - NA

The Lakeflow Framework provides structured logging capabilities to help track pipeline execution and troubleshoot issues. Logging is implemented using Python's standard ``logging`` module with custom configuration.

Log Levels
----------

The framework supports the standard Python logging levels:

- DEBUG: Detailed information for debugging
- INFO: General information about pipeline execution
- WARNING: Warning messages for potential issues
- ERROR: Error messages for failed operations
- CRITICAL: Critical errors that may cause pipeline failure

Configuration
-------------

The default log level for all pipelines is ``INFO``. To specify a different log level, set the ``logLevel`` parameter in the `Configuration` section of a Spark Declarative Pipeline. You can do this in one of the two ways described below.

Setting the Log Level in the Pipeline YAML
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The log level can be configured when creating a pipeline YAML file in the resources folder of a pipeline bundle. To do so, add the ``logLevel`` parameter to the configuration section of the pipeline YAML, as shown in the screenshot below.

.. image:: images/screenshot_pipeline_log_level_yaml.png

Setting the Log Level in the Databricks UI
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The log level can also be set manually at any time in the Databricks UI. To do so, browse to the desired pipeline, open the pipeline settings, and add the ``logLevel`` parameter in the `Advanced Configuration` section as shown below:

.. image:: images/screenshot_pipeline_log_level_ui.png

Permissions to View Logs
^^^^^^^^^^^^^^^^^^^^^^^^

By default, only the pipeline owner has permission to view the logs for a given pipeline execution. To grant other users access to the logs, add the following Spark configuration using the :doc:`feature_spark_configuration` feature of the Framework.

.. code-block:: text

   "spark.databricks.acl.needAdminPermissionToViewLogs": "false"

This is documented in the Databricks documentation here: https://docs.databricks.com/en/compute/clusters-manage.html

Viewing the Logs
----------------

The logs can be viewed in the Databricks UI by:

1. Browsing to the desired pipeline.
2. Selecting the desired Update ID (pipeline execution).
3. Selecting the `Update` tab on the right-hand side of the UI and then clicking the `Logs` link at the bottom of the tab.

   .. image:: images/screenshot_logs_viewing_1.png

4. A new browser tab will open displaying the log in the STDOUT section, as shown below:

   .. image:: images/screenshot_logs_viewing_2.png

Example Log Messages
--------------------

The framework logs various types of information:

Pipeline Initialization:

.. code-block:: text

   2025-02-06 04:05:46,161 - DltFramework - INFO - Initializing Pipeline...
   2025-02-06 04:05:46,772 - DltFramework - INFO - Retrieving Global Framework Config From: {path}
   2025-02-06 04:05:46,908 - DltFramework - INFO - Retrieving Pipeline Configs From: {path}

Flow Creation:

.. code-block:: text

   2025-02-06 04:05:48,254 - DltFramework - Creating Flow: flow_name
   2025-02-06 04:05:48,254 - DltFramework - Creating View: view_name, mode: stream, source type: delta

Error Handling:

.. code-block:: text

   2025-02-06 04:06:26,527 - ERROR - DltFramework - Failed to process Data Flow Spec: {error_details}
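
The log lines above follow Python's standard ``logging`` conventions. As a minimal sketch only, equivalent logger usage might look like the following; the logger name ``DltFramework`` and the message format are inferred from the sample output above, and the framework's actual logging configuration is applied for you and may differ.

.. code-block:: python

   import logging

   # Format inferred from the sample log lines shown above (a sketch, not the
   # framework's exact configuration).
   logging.basicConfig(
       format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
       level=logging.INFO,  # framework default; raised or lowered via the logLevel parameter
   )

   logger = logging.getLogger("DltFramework")
   logger.info("Initializing Pipeline...")
   logger.error("Failed to process Data Flow Spec: %s", "error_details")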
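
For reference, the ``logLevel`` entry described under *Setting the Log Level in the Pipeline YAML* above might look like the following sketch in a pipeline resource file. The resource name, pipeline name, and surrounding keys are placeholders based on the standard Databricks Asset Bundle pipeline resource layout; only the ``logLevel`` entry under ``configuration`` comes from this page, and your bundle's actual layout may differ.

.. code-block:: yaml

   # Sketch only: resource and pipeline names are placeholders.
   resources:
     pipelines:
       example_pipeline:
         name: example_pipeline
         configuration:
           logLevel: "DEBUG"   # any standard Python log level; the default is INFO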