Create a storage credential for connecting to Azure Data Lake Storage Gen2

This article describes how to create a storage credential in Unity Catalog to connect to Azure Data Lake Storage Gen2.

To manage access to the underlying cloud storage that holds tables and volumes, Unity Catalog uses the following object types:

  • Storage credentials encapsulate a long-term cloud credential that provides access to cloud storage.
  • External locations contain a reference to a storage credential and a cloud storage path.

For more information, see Connect to cloud object storage using Unity Catalog.
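
To see how the two object types relate, you can list them programmatically. The following minimal Python sketch is an illustration only: it assumes the Unity Catalog REST API under /api/2.1/unity-catalog, a workspace URL in DATABRICKS_HOST, and a personal access token in DATABRICKS_TOKEN. It prints each external location's cloud storage path together with the storage credential it references.

    # List storage credentials and external locations to show how each external
    # location pairs a cloud storage path with a storage credential.
    import os
    import requests

    host = os.environ["DATABRICKS_HOST"].rstrip("/")
    headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

    creds = requests.get(f"{host}/api/2.1/unity-catalog/storage-credentials", headers=headers).json()
    for cred in creds.get("storage_credentials", []):
        print("storage credential:", cred["name"])

    locs = requests.get(f"{host}/api/2.1/unity-catalog/external-locations", headers=headers).json()
    for loc in locs.get("external_locations", []):
        print(f"external location {loc['name']}: {loc['url']} -> credential {loc['credential_name']}")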

Unity Catalog supports two cloud storage options for Azure Databricks: Azure Data Lake Storage Gen2 containers and Cloudflare R2 buckets. Cloudflare R2 is intended primarily for Delta Sharing use cases in which you want to avoid data egress fees. Azure Data Lake Storage Gen2 is appropriate for most other use cases. This article focuses on creating storage credentials for Azure Data Lake Storage Gen2 containers. For Cloudflare R2, see Create a storage credential for connecting to Cloudflare R2.

To create a storage credential for access to an Azure Data Lake Storage Gen2 container, you create an Azure Databricks access connector that references an Azure managed identity, and you grant that managed identity permissions on the storage container. You then reference the access connector in the storage credential definition.

Requirements

In Azure Databricks:

  • An Azure Databricks workspace enabled for Unity Catalog.

  • CREATE STORAGE CREDENTIAL privilege on the Unity Catalog metastore attached to the workspace. Account admins and metastore admins have this privilege by default. For one way to grant this privilege to another user or group, see the sketch at the end of these requirements.

    Note

    Service principals must have the account admin role to create a storage credential that uses a managed identity. You cannot delegate CREATE STORAGE CREDENTIAL to a service principal. This applies to both Azure Databricks service principals and Microsoft Entra ID (formerly Azure Active Directory) service principals.

In your Azure tenant:

  • An Azure Data Lake Storage Gen2 storage container in the same region as the workspace you want to access the data from.

    The Azure Data Lake Storage Gen2 storage account must have a hierarchical namespace.

  • The Contributor or Owner role on an Azure resource group.

  • The Owner role, or the User Access Administrator Azure RBAC role, on the storage account.
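
If you need to grant the CREATE STORAGE CREDENTIAL privilege mentioned above to a user or group, the following sketch shows one way to do it, assuming the Unity Catalog permissions REST API and a hypothetical group named data-platform-admins; the caller must be an account admin or metastore admin. The equivalent SQL is GRANT CREATE STORAGE CREDENTIAL ON METASTORE TO the principal.

    # Grant the CREATE STORAGE CREDENTIAL privilege on the current metastore to a
    # group ("data-platform-admins" is a placeholder). Assumes the Unity Catalog
    # REST API and that DATABRICKS_HOST and DATABRICKS_TOKEN are set.
    import os
    import requests

    host = os.environ["DATABRICKS_HOST"].rstrip("/")
    headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

    # Metastore attached to the current workspace.
    metastore_id = requests.get(
        f"{host}/api/2.1/unity-catalog/metastore_summary", headers=headers
    ).json()["metastore_id"]

    resp = requests.patch(
        f"{host}/api/2.1/unity-catalog/permissions/metastore/{metastore_id}",
        headers=headers,
        json={"changes": [{"principal": "data-platform-admins",
                           "add": ["CREATE_STORAGE_CREDENTIAL"]}]},
    )
    resp.raise_for_status()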

Create a storage credential using a managed identity

You can use either an Azure managed identity or a service principal as the identity that authorizes access to your storage container. Managed identities are strongly recommended: they allow Unity Catalog to access storage accounts protected by network rules, which isn't possible with service principals, and they remove the need to manage and rotate secrets. If you want to use a service principal, see Create Unity Catalog managed storage using a service principal (legacy). To create the storage credential programmatically instead of in the UI, see the sketch after the steps below.

  1. In the Azure portal, create an Azure Databricks access connector and assign it permissions on the storage container that you would like to access, using the instructions in Configure a managed identity for Unity Catalog.

    An Azure Databricks access connector is a first-party Azure resource that lets you connect managed identities to an Azure Databricks account. You must have the Contributor role or higher on the access connector resource in Azure to add the storage credential.

    Make a note of the access connector's resource ID.

  2. Log in to your Unity Catalog-enabled Azure Databricks workspace as a user who has the CREATE STORAGE CREDENTIAL privilege.

    The metastore admin and account admin roles both include this privilege. If you are logged in as a service principal (whether a Microsoft Entra ID or native Azure Databricks service principal), you must have the account admin role to create a storage credential that uses a managed identity.

  3. Click Catalog.

  4. Click the +Add button and select Add a storage credential from the menu.

    This option does not appear if you don't have the CREATE STORAGE CREDENTIAL privilege.

  5. Select a Credential Type of Azure Managed Identity.

  6. Enter a name for the credential, and enter the access connector's resource ID in the format:

    /subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Databricks/accessConnectors/<connector-name>
    
  7. (Optional) If you created the access connector using a user-assigned managed identity, enter the resource ID of the managed identity in the User-assigned managed identity ID field, in the format:

    /subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<managed-identity-name>
    
  8. (Optional) If you want users to have read-only access to the external locations that use this storage credential, select Read only. For more information, see Mark a storage credential as read-only.

  9. Click Save.

  10. Create an external location that references this storage credential.
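
If you prefer to script steps 4 through 9 instead of using the UI, the following minimal Python sketch creates the same storage credential through the Unity Catalog REST API (an assumed endpoint path; adjust names and resource IDs to your environment). The credential name and comment are placeholders, and managed_identity_id is only needed for a user-assigned managed identity.

    # Create a storage credential backed by an Azure Databricks access connector.
    # Assumes DATABRICKS_HOST and DATABRICKS_TOKEN are set; resource IDs are placeholders.
    import os
    import requests

    host = os.environ["DATABRICKS_HOST"].rstrip("/")
    headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

    payload = {
        "name": "adls-credential",  # placeholder credential name
        "azure_managed_identity": {
            "access_connector_id": (
                "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
                "/providers/Microsoft.Databricks/accessConnectors/<connector-name>"
            ),
            # Only needed for a user-assigned managed identity (step 7):
            # "managed_identity_id": "/subscriptions/.../userAssignedIdentities/<managed-identity-name>",
        },
        "read_only": False,  # set to True to mirror the Read only option in step 8
        "comment": "Access to ADLS Gen2 via managed identity",
    }

    resp = requests.post(
        f"{host}/api/2.1/unity-catalog/storage-credentials", headers=headers, json=payload
    )
    resp.raise_for_status()
    print(resp.json()["name"])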

Next steps

You can view, update, and delete storage credentials, and grant other users permission to use them. See Manage storage credentials.

You can define external locations using storage credentials. See Create an external location to connect cloud storage to Azure Databricks.
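
As a sketch of that next step, the snippet below defines an external location that pairs an abfss:// path with the storage credential created earlier, again assuming the Unity Catalog REST API; the location name, container, and storage account are placeholders.

    # Create an external location that references the storage credential.
    import os
    import requests

    host = os.environ["DATABRICKS_HOST"].rstrip("/")
    headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

    resp = requests.post(
        f"{host}/api/2.1/unity-catalog/external-locations",
        headers=headers,
        json={
            "name": "adls-external-location",  # placeholder name
            "url": "abfss://<container>@<storage-account>.dfs.core.windows.net/<path>",
            "credential_name": "adls-credential",  # storage credential created above
            "read_only": False,
        },
    )
    resp.raise_for_status()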