Publish data from Delta Live Tables to the Hive metastore
You can make the output data of your pipeline discoverable and available to query by publishing datasets to the Hive metastore. To publish datasets to the metastore, enter a schema name in the Target field when you create a pipeline. You can also add a target schema to an existing pipeline.
By default, all tables and views created in Delta Live Tables are local to the pipeline. You must publish tables to a target schema to query or use Delta Live Tables datasets outside the pipeline in which they are declared.
To publish tables from your pipelines to Unity Catalog, see Use Unity Catalog with your Delta Live Tables pipelines.
How to publish Delta Live Tables datasets to a schema
You can declare a target schema for all tables in your Delta Live Tables pipeline using the Target schema field in the Pipeline settings and Create pipeline UIs.
You can also specify a schema in a JSON configuration by setting the target value.
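For example, the relevant portion of a pipeline's JSON settings might look like the following sketch, where the pipeline name and the schema name my_schema are illustrative placeholders:
JSON
{
  "name": "my-pipeline",
  "target": "my_schema"
}
Other pipeline settings are omitted here for brevity.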
You must run an update for the pipeline to publish results to the target schema.
You can use this feature with multiple environment configurations to publish to different schemas based on the environment. For example, you can publish to a dev schema for development and a prod schema for production data, as in the sketch below.
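As a sketch, a development pipeline's settings could differ from the production pipeline's only in the target value (the pipeline name below is illustrative):
JSON
{
  "name": "sales-pipeline-dev",
  "target": "dev"
}
The production configuration would be identical except for setting target to prod (and, typically, a different pipeline name).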
How to query datasets in Delta Live Tables
After an update completes, you can view the schema and tables, query the data, or use the data in downstream applications.
Once published, Delta Live Tables tables can be queried from any environment with access to the target schema. This includes Databricks SQL, notebooks, and other Delta Live Tables pipelines.
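For example, if a pipeline with a target schema named my_schema publishes a table named sales_orders (both names are illustrative), the table can be queried like any other table in the metastore:
SQL
SELECT * FROM my_schema.sales_orders;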
Important
When you create a target configuration, only tables and associated metadata are published. Views are not published to the metastore.
Exclude tables from target schema
If you need to calculate intermediate tables that are not intended for external consumption, you can prevent them from being published to a schema using the TEMPORARY keyword. Temporary tables still store and process data according to Delta Live Tables semantics, but should not be accessed outside of the current pipeline. A temporary table persists for the lifetime of the pipeline that creates it. Use the following syntax to declare temporary tables:
SQL
CREATE TEMPORARY LIVE TABLE temp_table
AS SELECT ... ;
Python
import dlt

@dlt.table(temporary=True)
def temp_table():
    return ("...")