Configure customer-managed keys for DBFS using the Azure CLI

Note

This feature is available only in the Premium plan.

You can use the Azure CLI to configure your own encryption key to encrypt the workspace storage account. This article describes how to configure your own key from Azure Key Vault vaults. For instructions on using a key from Azure Key Vault Managed HSM, see Configure HSM customer-managed keys for DBFS using the Azure CLI.

For more information about customer-managed keys for DBFS, see Customer-managed keys for DBFS root.

Install the Azure Databricks CLI extension

  1. Install the Azure CLI.

  2. Install the Azure Databricks CLI extension.

    az extension add --name databricks
    

Prepare a new or existing Azure Databricks workspace for encryption

Replace the placeholder values in brackets with your own values. The <workspace-name> is the resource name as displayed in the Azure portal.

az cloud set -n AzureChinaCloud
az login
# To switch back to the public Azure cloud, run: az cloud set -n AzureCloud
az account set --subscription <subscription-id>

Prepare for encryption during workspace creation:

az databricks workspace create --name <workspace-name> --location <workspace-location> --resource-group <resource-group> --sku premium --prepare-encryption

Prepare an existing workspace for encryption:

az databricks workspace update --name <workspace-name> --resource-group <resource-group> --prepare-encryption

Note the principalId field in the storageAccountIdentity section of the command output. You will provide it as the managed identity value when you configure your Key Vault.
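Rather than copying the value by hand, you can capture it in a shell variable. The following is a minimal sketch, assuming a bash shell. The function name get_storage_principal_id and the variable principal_id are illustrative and not part of the Azure CLI. The sketch relies on the global --query and --output arguments and on the storageAccountIdentity.principalId path described above.

```shell
# Hypothetical helper: print the principalId of the workspace storage account identity.
# Usage: get_storage_principal_id <workspace-name> <resource-group>
get_storage_principal_id() {
  az databricks workspace show \
    --name "$1" \
    --resource-group "$2" \
    --query storageAccountIdentity.principalId \
    --output tsv
}

# Example (run after the workspace has been prepared for encryption):
# principal_id=$(get_storage_principal_id <workspace-name> <resource-group>)
```

The captured value can then be passed as the --object-id argument when you set the Key Vault access policy.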

For more information about Azure CLI commands for Azure Databricks workspaces, see the az databricks workspace command reference.

Create a new Key Vault

The Key Vault that you use to store customer-managed keys for DBFS root must have two key protection settings enabled: Soft Delete and Purge Protection. To create a new Key Vault with these settings enabled, run the following command.

Important

The Key Vault must be in the same Azure tenant as your Azure Databricks workspace.

Replace the placeholder values in brackets with your own values.

az keyvault create \
        --name <key-vault> \
        --resource-group <resource-group> \
        --location <region> \
        --enable-soft-delete \
        --enable-purge-protection

For more information about enabling Soft Delete and Purge Protection using the Azure CLI, see How to use Key Vault soft-delete with CLI.

Configure the Key Vault access policy

Set the access policy for the Key Vault so that the Azure Databricks workspace has permission to access it, using the az keyvault set-policy command.

Replace the placeholder values in brackets with your own values.

az keyvault set-policy \
        --name <key-vault> \
        --resource-group <resource-group> \
        --object-id <managed-identity>  \
        --key-permissions get unwrapKey wrapKey

Replace <managed-identity> with the principalId value that you noted when you prepared your workspace for encryption.

Create a new key

Create a key in the Key Vault using the az keyvault key create command.

Replace the placeholder values in brackets with your own values.

az keyvault key create \
       --name <key> \
       --vault-name <key-vault>

DBFS root storage supports RSA and RSA-HSM keys with sizes of 2048, 3072, and 4096 bits. For more information about keys, see About Key Vault keys.
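If you prefer to state the key type and size explicitly rather than rely on the command's defaults, something like the following sketch may help. It assumes a bash shell and uses the --kty and --size arguments of az keyvault key create. The function name create_dbfs_key is hypothetical, and the size check simply mirrors the sizes listed above.

```shell
# Hypothetical helper: create an RSA key of a size that DBFS root storage supports.
# Usage: create_dbfs_key <key> <key-vault> [size]   (size defaults to 2048)
create_dbfs_key() {
  local key_name="$1" vault_name="$2" size="${3:-2048}"
  # Reject sizes outside the supported set before calling the Azure CLI.
  case "$size" in
    2048|3072|4096) ;;
    *) echo "Unsupported key size: $size (expected 2048, 3072, or 4096)" >&2
       return 1 ;;
  esac
  az keyvault key create \
    --name "$key_name" \
    --vault-name "$vault_name" \
    --kty RSA \
    --size "$size"
}
```

For example, create_dbfs_key <key> <key-vault> 4096 would request a 4096-bit RSA key, while an unsupported size returns a non-zero status without contacting Azure.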

Configure DBFS encryption with customer-managed keys

Configure your Azure Databricks workspace to use the key you created in your Azure Key Vault.

Replace the placeholder values in brackets with your own values.

# Look up the URI of the Key Vault.
key_vault_uri=$(az keyvault show \
  --name <key-vault> \
  --resource-group <resource-group> \
  --query properties.vaultUri \
  --output tsv)
# Take the last version returned; the version is the sixth "/"-separated field of the key ID.
key_version=$(az keyvault key list-versions \
  --name <key> \
  --vault-name <key-vault> \
  --query "[-1].kid" \
  --output tsv | cut -d '/' -f 6)
# Configure the workspace to use the key.
az databricks workspace update --name <workspace-name> --resource-group <resource-group> \
  --key-source Microsoft.KeyVault --key-name <key> \
  --key-vault "$key_vault_uri" --key-version "$key_version"
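The cut step in the commands above works because a key identifier (the kid field) has the form https://<vault-host>/keys/<key>/<version>, so the version is the sixth field when the string is split on "/". As an alternative sketch, assuming bash or another POSIX shell, parameter expansion can take whatever follows the last slash. The function name key_version_from_kid and the sample identifier are made up for illustration.

```shell
# Hypothetical helper: print the version segment of a Key Vault key identifier.
key_version_from_kid() {
  # Strip everything up to and including the last "/".
  printf '%s\n' "${1##*/}"
}

# Made-up identifier, for illustration only.
kid="https://example-vault.example.net/keys/example-key/0123456789abcdef0123456789abcdef"

key_version_from_kid "$kid"
# The same field, extracted with the cut approach used in this article.
echo "$kid" | cut -d '/' -f 6
```

Both commands print the same version string. The parameter-expansion form does not depend on the number of path segments, but it assumes the identifier has no trailing slash.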

Disable customer-managed keys

When you disable customer-managed keys, your storage account is once again encrypted with Azure-managed keys.

Replace the placeholder values in brackets with your own values.

az databricks workspace update --name <workspace-name> --resource-group <resource-group> --key-source Default