Use external Hive Metastore for Synapse Spark Pool

Note

External Hive metastores will no longer be supported in subsequent versions after Azure Synapse Runtime for Apache Spark 3.4 in Synapse.

Azure Synapse Analytics allows Apache Spark pools in the same workspace to share a managed HMS (Hive Metastore) compatible metastore as their catalog. When customers want to persist the Hive catalog metadata outside of the workspace, and share catalog objects with other computational engines outside of the workspace, such as HDInsight and Azure Databricks, they can connect to an external Hive Metastore. In this article, you can learn how to connect Synapse Spark to an external Apache Hive Metastore.

Supported Hive Metastore versions

The feature works with Spark 3.3. The following table shows the supported Hive Metastore versions for each Spark version.

Spark Version HMS 2.3.x HMS 3.1.X
3.3(deprecated) Yes Yes
3.4 Yes Yes

Set up linked service to Hive Metastore

Note

Only Azure SQL Database and Azure Database for MySQL are supported as an external Hive Metastore. SQL(username-password) authentication is supported for both kinds of databases. Additionally, system-sssigned managed identity authentication is supported only for Azure SQL Database and Spark 3.4+. If the provided database is blank, please provision it via Hive Schema Tool to create database schema.

Follow below steps to set up a linked service to the external Hive Metastore in Synapse workspace.

  1. Open Synapse Studio, go to Manage > Linked services at left, click New to create a new linked service.

    Screenshot of set up Hive Metastore linked service.

  2. Choose Azure SQL Database, click Continue.

  3. Provide Name of the linked service. Record the name of the linked service, this info will be used to configure Spark shortly.

  4. Choose Legacy version and select Connection String.

  5. Either select Azure SQL Database for the external Hive Metastore from Azure subscription list, or enter the info manually.

  6. Set Authentication type as one of SQL Authentication or System-assigned managed identity. For SQL Authentication, provide User name and Password to set up the connection. For System-assigned managed identity, the page will automatically populate the management identity associated with the current workspace.

  7. Test connection to verify the authentication.

  8. Click Create to create the linked service.