Configure customer-managed keys for DBFS using the Azure CLI
Note
This feature is available only in the Premium plan.
You can use the Azure CLI to configure your own encryption key to encrypt the workspace storage account. This article describes how to configure your own key from Azure Key Vault vaults. For instructions on using a key from Azure Key Vault Managed HSM, see Configure HSM customer-managed keys for DBFS using the Azure CLI.
For more information about customer-managed keys for DBFS, see Customer-managed keys for DBFS root.
Install the Azure Databricks CLI extension
Install the Azure Databricks CLI extension.
az extension add --name databricks
Prepare a new or existing Azure Databricks workspace for encryption
Replace the placeholder values in brackets with your own values. The <workspace-name> is the resource name as displayed in the Azure portal.
az cloud set -n AzureChinaCloud
az login
# To switch back to public Azure later, run: az cloud set -n AzureCloud
az account set --subscription <subscription-id>
Prepare for encryption during workspace creation:
az databricks workspace create --name <workspace-name> --location <workspace-location> --resource-group <resource-group> --sku premium --prepare-encryption
Prepare an existing workspace for encryption:
az databricks workspace update --name <workspace-name> --resource-group <resource-group> --prepare-encryption
Note the principalId field in the storageAccountIdentity section of the command output. You will provide it as the managed identity value when you configure your Key Vault.
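Instead of copying the value by hand, you can read it into a shell variable. The following is a minimal sketch, assuming the workspace has already been prepared for encryption and that the identity appears under storageAccountIdentity.principalId in the output of az databricks workspace show, as it does in the create and update output described above. The function name get_principal_id is a hypothetical helper, not part of the Azure CLI.

```shell
# Sketch: print the principalId of the workspace storage account identity.
# get_principal_id is a hypothetical helper name used only for illustration.
get_principal_id() {
  # $1 = workspace name, $2 = resource group
  az databricks workspace show \
    --name "$1" \
    --resource-group "$2" \
    --query storageAccountIdentity.principalId \
    --output tsv
}

# Usage (replace the placeholders with your own values):
# managed_identity=$(get_principal_id <workspace-name> <resource-group>)
```

You can then pass the captured value wherever the <managed-identity> placeholder appears.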
For more information about Azure CLI commands for Azure Databricks workspaces, see the az databricks workspace command reference.
Create a new Key Vault
The Key Vault that you use to store customer-managed keys for DBFS root must have two key protection settings enabled: Soft Delete and Purge Protection. To create a new Key Vault with these settings enabled, run the following command.
Important
The Key Vault must be in the same Azure tenant as your Azure Databricks workspace.
Replace the placeholder values in brackets with your own values.
az keyvault create \
--name <key-vault> \
--resource-group <resource-group> \
--location <region> \
--enable-soft-delete \
--enable-purge-protection
For more information about enabling Soft Delete and Purge Protection using the Azure CLI, see How to use Key Vault soft-delete with CLI.
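If you plan to reuse an existing Key Vault, it can help to confirm both settings before continuing. The following is a rough sketch, assuming the vault exposes them as properties.enableSoftDelete and properties.enablePurgeProtection in the output of az keyvault show. The function name check_vault_protection is a hypothetical helper, and the check accepts either capitalization of "true" because the exact text printed for boolean values is not assumed here.

```shell
# Sketch: succeed only if both Soft Delete and Purge Protection are enabled.
# check_vault_protection is a hypothetical helper name used only for illustration.
check_vault_protection() {
  # $1 = Key Vault name
  soft_delete=$(az keyvault show --name "$1" \
    --query properties.enableSoftDelete --output tsv)
  purge_protection=$(az keyvault show --name "$1" \
    --query properties.enablePurgeProtection --output tsv)

  # Treat anything other than true/True (for example an unset value) as disabled.
  case "$soft_delete" in
    true|True) ;;
    *) echo "soft delete is not enabled"; return 1 ;;
  esac
  case "$purge_protection" in
    true|True) ;;
    *) echo "purge protection is not enabled"; return 1 ;;
  esac
  echo "ok"
}

# Usage (replace the placeholder with your own value):
# check_vault_protection <key-vault>
```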
Configure the Key Vault access policy
Set the access policy for the Key Vault so that the Azure Databricks workspace has permission to access it, using the az keyvault set-policy command.
Replace the placeholder values in brackets with your own values.
az keyvault set-policy \
--name <key-vault> \
--resource-group <resource-group> \
--object-id <managed-identity> \
--key-permissions get unwrapKey wrapKey
Replace <managed-identity> with the principalId value that you noted when you prepared your workspace for encryption.
Create a new key
Create a key in the Key Vault using the az keyvault key create command.
Replace the placeholder values in brackets with your own values.
az keyvault key create \
--name <key> \
--vault-name <key-vault>
DBFS root storage supports RSA and RSA-HSM keys of sizes 2048, 3072 and 4096. For more information about keys, see About Key Vault keys.
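The supported combinations listed above can be expressed as a small pre-flight check before you create or select a key. This is only a sketch: the function name is_supported_dbfs_key is a hypothetical helper, and it encodes nothing beyond the key types and sizes stated in this article.

```shell
# Sketch: succeed only for the key types and sizes listed in this article.
# is_supported_dbfs_key is a hypothetical helper name used only for illustration.
is_supported_dbfs_key() {
  # $1 = key type (for example RSA), $2 = key size in bits
  case "$1" in
    RSA|RSA-HSM) ;;
    *) return 1 ;;
  esac
  case "$2" in
    2048|3072|4096) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: check a type and size before creating the key explicitly.
if is_supported_dbfs_key RSA 2048; then
  echo "RSA 2048 is supported"
fi
# To request a specific type and size when creating the key, the --kty and
# --size options of az keyvault key create can be used, for example:
# az keyvault key create --name <key> --vault-name <key-vault> --kty RSA --size 2048
```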
Configure DBFS encryption with customer-managed keys
Configure your Azure Databricks workspace to use the key you created in your Azure Key Vault.
Replace the placeholder values in brackets with your own values.
key_vault_uri=$(az keyvault show \
--name <key-vault> \
--resource-group <resource-group> \
--query properties.vaultUri \
--output tsv)
key_version=$(az keyvault key list-versions \
--name <key> \
--vault-name <key-vault> \
--query "[-1].kid" \
--output tsv | cut -d '/' -f 6)
az databricks workspace update --name <workspace-name> --resource-group <resource-group> --key-source Microsoft.KeyVault --key-name <key> --key-vault "$key_vault_uri" --key-version "$key_version"
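The cut -d '/' -f 6 step in the commands above works because the key identifier (kid) is a URL whose sixth slash-separated field is the key version. The following sketch shows the extraction on its own, using a made-up kid value. The vault name, key name, and version string are placeholders for illustration, and key_version_from_kid is a hypothetical helper name.

```shell
# Sketch: extract the version segment from a Key Vault key identifier (kid).
# key_version_from_kid is a hypothetical helper name used only for illustration.
key_version_from_kid() {
  # Fields split on '/': 1="https:", 2="", 3=vault host, 4="keys", 5=key name, 6=version
  printf '%s\n' "$1" | cut -d '/' -f 6
}

# Made-up example value; a real kid comes from az keyvault key list-versions.
kid="https://example-vault.vault.azure.cn/keys/example-key/0123456789abcdef0123456789abcdef"
key_version_from_kid "$kid"
# prints 0123456789abcdef0123456789abcdef
```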
Disable customer-managed keys
When you disable customer-managed keys, your storage account is once again encrypted with Azure-managed keys.
Replace the placeholder values in brackets with your own values.
az databricks workspace update --name <workspace-name> --resource-group <resource-group> --key-source Default