Convert to Delta Lake

The CONVERT TO DELTA SQL command performs a one-time conversion of Parquet and Iceberg tables to Delta Lake tables. For incremental conversion of Parquet or Iceberg tables to Delta Lake, see Incrementally clone Parquet and Iceberg tables to Delta Lake.

Unity Catalog supports the CONVERT TO DELTA SQL command for Parquet and Iceberg tables stored in external locations managed by Unity Catalog.

You can configure existing Parquet data files as external tables in Unity Catalog and then convert them to Delta Lake to unlock all features of the Databricks lakehouse.

For the technical documentation, see CONVERT TO DELTA.

Converting a directory of Parquet or Iceberg files in an external location to Delta Lake

Note

  • Converting Iceberg tables is in Public Preview.
  • Converting Iceberg tables is supported in Databricks Runtime 10.4 LTS and above.
  • Converting Iceberg metastore tables is not supported.
  • Converting Iceberg tables that have experienced partition evolution is not supported.
  • Converting Iceberg merge-on-read tables that have experienced updates, deletions, or merges is not supported.
  • The following are limitations for converting Iceberg tables with partitions defined on truncated columns:
    • In Databricks Runtime 12.2 LTS and below, the only truncated column type supported is string.
    • In Databricks Runtime 13.3 LTS and above, you can work with truncated columns of types string, long, or int.
    • Azure Databricks does not support working with truncated columns of type decimal.

You can convert a directory of Parquet data files to a Delta Lake table as long as you have write access to the storage location. For information on configuring access with Unity Catalog, see Connect to cloud object storage and services using Unity Catalog.

Note

Unity Catalog requires Azure Data Lake Storage Gen2.

CONVERT TO DELTA parquet.`abfss://container@storageAccount.dfs.core.chinacloudapi.cn/parquet-data`;

CONVERT TO DELTA iceberg.`abfss://container@storageAccount.dfs.core.chinacloudapi.cn/iceberg-data`;
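If the Parquet directory is partitioned, a path-based conversion needs the partition columns listed in a PARTITIONED BY clause. The following is a minimal sketch, assuming a hypothetical directory partitioned by a date_updated column of type DATE. The path and the column name are placeholders for illustration only.

```sql
-- Hypothetical path and partition column; replace both with your own values.
CONVERT TO DELTA parquet.`abfss://container@storageAccount.dfs.core.chinacloudapi.cn/partitioned-parquet-data`
  PARTITIONED BY (date_updated DATE);
```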

To register converted tables as external tables in Unity Catalog, you need the CREATE EXTERNAL TABLE permission on the external location.
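As an illustration, registering a converted directory might look like the sketch below. The external location name (my_external_location), the group (data_engineers), and the table name (main.default.converted_parquet) are all hypothetical, and the path reuses the example directory from above.

```sql
-- Grant the privilege on a hypothetical external location to a hypothetical group.
GRANT CREATE EXTERNAL TABLE ON EXTERNAL LOCATION my_external_location TO `data_engineers`;

-- Register the already-converted directory as an external Delta table.
-- The schema is read from the Delta transaction log at the given location.
CREATE TABLE main.default.converted_parquet
  USING DELTA
  LOCATION 'abfss://container@storageAccount.dfs.core.chinacloudapi.cn/parquet-data';
```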

Note

For Databricks Runtime 11.3 LTS and above, CONVERT TO DELTA automatically infers partitioning information for tables registered to the Hive metastore. You must provide partitioning information for Unity Catalog external tables.

Converting managed and external tables to Delta Lake on Unity Catalog

CONVERT TO DELTA syntax can only be used to create Unity Catalog external tables. To convert a legacy Hive metastore managed Parquet table directly to a managed Unity Catalog Delta Lake table, use a CREATE TABLE AS SELECT (CTAS) statement. See Upgrade a Hive table to a Unity Catalog managed table using CREATE TABLE AS SELECT.
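A CTAS upgrade of this kind can be sketched as follows. The source table (hive_metastore.default.sales_parquet) and the target table (main.default.sales_delta) are hypothetical names chosen for illustration.

```sql
-- Copy a hypothetical Hive metastore Parquet table into a new
-- Unity Catalog managed table. No LOCATION clause is given, so the
-- table is managed; managed tables in Unity Catalog use Delta Lake.
CREATE TABLE main.default.sales_delta
AS SELECT * FROM hive_metastore.default.sales_parquet;
```

Unlike CONVERT TO DELTA, this approach rewrites the data into new files instead of converting the existing files in place.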

To upgrade an external Parquet table to a Unity Catalog external table, see Upgrade a single Hive table to a Unity Catalog external table using the upgrade wizard.

After you've registered an external Parquet table to Unity Catalog, you can convert it to an external Delta Lake table. You must provide partitioning information if the Parquet table is partitioned.

CONVERT TO DELTA catalog_name.database_name.table_name;

CONVERT TO DELTA catalog_name.database_name.table_name PARTITIONED BY (date_updated DATE);
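One way to check the result of a conversion is to inspect the table metadata afterwards. This is a minimal sketch that reuses the placeholder table name from the statements above.

```sql
-- The format column in the output should report delta after a successful conversion.
DESCRIBE DETAIL catalog_name.database_name.table_name;

-- The table history should include the conversion as a recorded operation.
DESCRIBE HISTORY catalog_name.database_name.table_name;
```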