By default, an Azure Machine Learning workspace uses a shared key to access its default Azure Storage account. With key-based authorization, anyone who has the key and access to the storage account can access data.
To reduce the risk of unauthorized access, you can disable key-based authorization, and instead use Microsoft Entra ID for authorization. This configuration uses a Microsoft Entra ID value to authorize access to the storage account. The identity used to access storage is either the user's identity or a managed identity. The user's identity is used to view data in the Azure Machine Learning studio, or run a notebook while authenticated with the user's identity. The Azure Machine Learning service uses a managed identity to access the storage account - for example, when running a training job as the managed identity.
Using a workspace with a storage account that has shared key access disabled is currently in preview.
Important
This feature is currently in public preview. This preview version is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities.
For more information, see Supplemental Terms of Use for Azure Previews.
Prerequisites
Install the SDK v2.
Important
The steps in this article require the azure-ai-ml Python package, version 1.17.0. To determine the installed package version, use the pip list command from your Python development environment.
Install azure-identity: pip install azure-identity. If you're working in a notebook cell, use %pip install azure-identity.
Provide your subscription details:
APPLIES TO:
Python SDK azure-ai-ml v2 (current)
# Enter details of your subscription
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
Get a handle to the subscription. All the Python code in this article uses ml_client:
# get a handle to the subscription
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group)
- (Optional) If you have multiple accounts, add the tenant ID of the Microsoft Entra ID you wish to use into the DefaultAzureCredential. Find your tenant ID from the Azure portal, under Microsoft Entra ID, External Identities.
DefaultAzureCredential(interactive_browser_tenant_id="<TENANT_ID>")
To use the CLI commands in this document, you need the Azure CLI and the ml extension.
Important
The steps in this article require the Azure CLI extension for machine learning, version 2.27.0. To determine the version of the extension you have installed, use the az version command from the Azure CLI. In the extensions collection that's returned, find the ml extension. This code sample shows an example return value:
{
"azure-cli": "2.61.0",
"azure-cli-core": "2.61.0",
"azure-cli-telemetry": "1.1.0",
"extensions": {
"ml": "2.27.0"
}
}
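The version check described above can also be scripted. The following is a minimal sketch, not part of the Azure CLI: it parses the JSON text that az version returns and compares the ml extension version against 2.27.0. It assumes that 2.27.0 is a minimum rather than an exact requirement and that versions are plain dotted integers. The names parse_version and ml_extension_ok are hypothetical helpers.

```python
import json

# Version named in this article; treated here as a minimum (an assumption).
REQUIRED_ML_EXTENSION = "2.27.0"


def parse_version(text):
    """Turn a dotted version such as '2.27.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def ml_extension_ok(az_version_json, minimum=REQUIRED_ML_EXTENSION):
    """Return True if the ml extension listed in `az version` output meets `minimum`."""
    info = json.loads(az_version_json)
    installed = info.get("extensions", {}).get("ml")
    if installed is None:
        return False  # the ml extension isn't installed
    return parse_version(installed) >= parse_version(minimum)


# Sample output copied from the example return value in this article.
sample = """
{
  "azure-cli": "2.61.0",
  "azure-cli-core": "2.61.0",
  "azure-cli-telemetry": "1.1.0",
  "extensions": {
    "ml": "2.27.0"
  }
}
"""
print(ml_extension_ok(sample))
```

Pre-release suffixes such as 2.27.0b1 aren't handled by this simple parser.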
Create a new workspace
When you create a new workspace, the creation process can automatically disable shared key access. Or you can create an Azure Storage account, disable shared key access, and use it during workspace creation.
In Azure Machine Learning studio, select Create with customized networking, encryption identity, dependent resources or tags.
From the Basics tab, select the Storage account you created previously.
From the Identity tab, in the Storage account access section, set Storage account access type to identity-based.
Continue the workspace creation process as usual. As the workspace is created, the managed identity is automatically assigned the permissions it needs to access the storage account.
When you create your workspace with the SDK, set system_datastores_auth_mode="identity". To use a pre-existing storage account, use the storage_account parameter to specify the Azure Resource Manager ID of an existing storage account:
# Creating a unique workspace name with current datetime to avoid conflicts
from azure.ai.ml.entities import Workspace
import datetime
# Azure Resource Manager ID of the storage account
storage_account = "<your_storage_account>"
basic_workspace_name = "mlw-basic-prod-" + datetime.datetime.now().strftime(
    "%Y%m%d%H%M"
)
ws_basic = Workspace(
    name=basic_workspace_name,
    location="chinanorth3",
    display_name="Basic workspace-example",
    description="This example shows how to create a basic workspace",
    hbi_workspace=False,
    tags=dict(purpose="demo"),
    storage_account=storage_account,
    system_datastores_auth_mode="identity",
)
ws_basic = ml_client.workspaces.begin_create(ws_basic).result()
print(ws_basic)
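The storage_account value in the sample above is an Azure Resource Manager ID rather than a bare account name. As a rough sketch, the ID can be assembled from its parts using the standard resource ID layout for storage accounts. The helper name storage_account_resource_id and the <STORAGE_ACCOUNT_NAME> placeholder are hypothetical and aren't part of the SDK.

```python
def storage_account_resource_id(subscription_id, resource_group, account_name):
    """Build the Azure Resource Manager ID of a storage account from its parts."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}"
    )


# Placeholders only; substitute your own values.
storage_account = storage_account_resource_id(
    "<SUBSCRIPTION_ID>", "<RESOURCE_GROUP>", "<STORAGE_ACCOUNT_NAME>"
)
print(storage_account)
```

The resulting string is what the storage_account parameter of Workspace expects in the sample above.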
To create a new workspace with Microsoft Entra ID authorization for the storage account, use a YAML configuration file that sets system_datastores_auth_mode to identity. You can also specify the Azure Resource Manager ID of an existing storage account with the storage_account entry.
This example YAML file shows how to set the workspace to use a managed identity and an existing storage account:
$schema: https://azuremlschemas.azureedge.net/latest/workspace.schema.json
name: mlw-basicex-prod
location: chinanorth3
display_name: Bring your own dependent resources-example
description: This configuration specifies a workspace configuration with existing dependent resources
storage_account: {your storage account resource id}
system_datastores_auth_mode: identity
tags:
  purpose: demonstration
This YAML file can be used with the az ml workspace create command, with the --file parameter:
az ml workspace create -g <resource-group-name> --file workspace.yml
In the following JSON template example, substitute your own values for the following placeholders:
- [workspace name]
- [workspace friendly name]
- [Storage Account ARM resource ID]
- [Key Vault ARM resource ID]
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"resources":
[
{
"type": "Microsoft.MachineLearningServices/workspaces",
"apiVersion": "2024-04-01",
"name": "[workspace name]",
"location": "[resourceGroup().location]",
"sku":
{
"name": "Basic",
"tier": "Basic"
},
"kind": "Default",
"identity":
{
"type": "SystemAssigned"
},
"properties":
{
"friendlyName": "[workspace friendly name]",
"storageAccount": "[Storage Account ARM resource ID]",
"keyVault": "[Key Vault ARM resource ID]",
"systemDatastoresAuthMode": "identity",
"managedNetwork":
{
"isolationMode": "Disabled"
},
"publicNetworkAccess": "Enabled"
}
}
]
}
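Substituting the placeholders by hand is error-prone, so the template above can also be generated. This is a minimal sketch that mirrors the JSON shown in this article. The function name build_workspace_template and its parameters are hypothetical, and the two accepted auth_mode values are the ones this article uses for systemDatastoresAuthMode.

```python
import json


def build_workspace_template(workspace_name, friendly_name,
                             storage_account_id, key_vault_id,
                             auth_mode="identity"):
    """Return the workspace ARM template from this article as a Python dict."""
    if auth_mode not in ("identity", "accesskey"):
        raise ValueError(f"Unsupported auth mode: {auth_mode!r}")
    workspace = {
        "type": "Microsoft.MachineLearningServices/workspaces",
        "apiVersion": "2024-04-01",
        "name": workspace_name,
        # ARM template expression, evaluated at deployment time.
        "location": "[resourceGroup().location]",
        "sku": {"name": "Basic", "tier": "Basic"},
        "kind": "Default",
        "identity": {"type": "SystemAssigned"},
        "properties": {
            "friendlyName": friendly_name,
            "storageAccount": storage_account_id,
            "keyVault": key_vault_id,
            "systemDatastoresAuthMode": auth_mode,
            "managedNetwork": {"isolationMode": "Disabled"},
            "publicNetworkAccess": "Enabled",
        },
    }
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [workspace],
    }


# Placeholder values only; substitute your own.
template = build_workspace_template(
    "<workspace name>",
    "<workspace friendly name>",
    "<Storage Account ARM resource ID>",
    "<Key Vault ARM resource ID>",
)
print(json.dumps(template, indent=2))
```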
For information on deploying an ARM template, use one of the following articles:
After you create the workspace, identify all the users that will use it - for example, Data Scientists. These users must be assigned the Storage Blob Data Contributor and Storage File Data Privileged Contributor roles in Azure role-based access control for the storage account. If these users only need read access, use the Storage Blob Data Reader and Storage File Data Privileged Reader roles instead. For more information, see the Scenarios for role assignments section of this article.
Update an existing workspace
If you have an existing Azure Machine Learning workspace, use the steps in this section to update the workspace so that it uses Microsoft Entra ID to authorize access to the storage account. Then, disable shared key access on the storage account.
To update an existing workspace, go to Properties and select Identity-based access.
Select Save to save this choice.
To update an existing workspace, set system_datastores_auth_mode = "identity" for the workspace. This code sample shows an update of a workspace named test-ws1:
ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group)
ws = ml_client.workspaces.get(name="test-ws1")
ws.system_datastores_auth_mode = "identity"
ws = ml_client.workspaces.begin_update(workspace=ws).result()
To update an existing workspace, use the az ml workspace update command and specify --system-datastores-auth-mode identity. This example shows an update of a workspace named myworkspace:
az ml workspace update --name myworkspace --system-datastores-auth-mode identity
In the following JSON template example, substitute your own values for the following placeholders:
- [workspace name]
- [workspace friendly name]
- [Storage Account ARM resource ID]
- [Key Vault ARM resource ID]
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"resources":
[
{
"type": "Microsoft.MachineLearningServices/workspaces",
"apiVersion": "2024-04-01",
"name": "[workspace name]",
"location": "[resourceGroup().location]",
"sku":
{
"name": "Basic",
"tier": "Basic"
},
"kind": "Default",
"identity":
{
"type": "SystemAssigned"
},
"properties":
{
"friendlyName": "[workspace friendly name]",
"storageAccount": "[Storage Account ARM resource ID]",
"keyVault": "[Key Vault ARM resource ID]",
"systemDatastoresAuthMode": "identity",
"managedNetwork":
{
"isolationMode": "Disabled"
},
"publicNetworkAccess": "Enabled"
}
}
]
}
For information on deploying an ARM template, use one of the following articles:
Assign roles to users
After updating the workspace, update the storage account to disable shared key access. For more information about disabling shared key access, visit the Prevent shared key authorization for an Azure Storage account article.
You must also identify all the users that need access to the default datastores - for example, Data Scientists. These users must be assigned the Storage Blob Data Contributor and Storage File Data Privileged Contributor roles in Azure role-based access control for the storage account. If these users only need read access, use the Storage Blob Data Reader and Storage File Data Privileged Reader roles instead. For more information, see the Scenarios for role assignments section of this article.
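When several users need these roles, it can help to generate the assignments rather than type them one at a time. The following sketch only builds command strings for the Azure CLI az role assignment create command; it doesn't run them. The helper name role_assignment_commands, the example user, and the example scope are hypothetical, and the role names are the ones listed above.

```python
import shlex

# Role names taken from the paragraph above.
READ_WRITE_ROLES = [
    "Storage Blob Data Contributor",
    "Storage File Data Privileged Contributor",
]
READ_ONLY_ROLES = [
    "Storage Blob Data Reader",
    "Storage File Data Privileged Reader",
]


def role_assignment_commands(assignees, storage_account_id, read_only=False):
    """Return one `az role assignment create` command per user and role."""
    roles = READ_ONLY_ROLES if read_only else READ_WRITE_ROLES
    commands = []
    for assignee in assignees:
        for role in roles:
            args = [
                "az", "role", "assignment", "create",
                "--assignee", assignee,
                "--role", role,
                "--scope", storage_account_id,
            ]
            # shlex.join quotes arguments that contain spaces.
            commands.append(shlex.join(args))
    return commands


# Hypothetical user and storage account scope, for illustration only.
example_scope = (
    "/subscriptions/SUB_ID/resourceGroups/RG_NAME"
    "/providers/Microsoft.Storage/storageAccounts/ACCOUNT_NAME"
)
for command in role_assignment_commands(["user@contoso.com"], example_scope):
    print(command)
```

Review the generated commands before you run them, because the scope determines which storage account the roles apply to.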
Revert to use shared keys
To revert a workspace back to use of shared keys to access the storage account, use this information:
To update an existing workspace, go to Properties and select Credential-based access.
Select Save to save this choice.
To configure the workspace to use a shared key again, set system_datastores_auth_mode = "accesskey" for the workspace. The following code demonstrates updating a workspace named test-ws1:
ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group)
ws = ml_client.workspaces.get(name="test-ws1")
ws.system_datastores_auth_mode = "accesskey"
ws = ml_client.workspaces.begin_update(workspace=ws).result()
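Switching to identity-based access and reverting to shared keys differ only in the value assigned to system_datastores_auth_mode, so both can go through one helper. This is a sketch under that assumption; it uses only the workspaces.get and workspaces.begin_update calls shown in this article, and the function name set_system_datastores_auth_mode is hypothetical.

```python
# The two values this article uses for system_datastores_auth_mode.
VALID_AUTH_MODES = ("identity", "accesskey")


def set_system_datastores_auth_mode(ml_client, workspace_name, mode):
    """Set the datastore auth mode of a workspace and return the workspace.

    `ml_client` is an MLClient created as shown earlier in this article.
    """
    if mode not in VALID_AUTH_MODES:
        raise ValueError(f"mode must be one of {VALID_AUTH_MODES}, got {mode!r}")
    ws = ml_client.workspaces.get(name=workspace_name)
    # Skip the update when the workspace already uses the requested mode.
    if getattr(ws, "system_datastores_auth_mode", None) == mode:
        return ws
    ws.system_datastores_auth_mode = mode
    return ml_client.workspaces.begin_update(workspace=ws).result()


# Example usage (requires a real MLClient):
# ws = set_system_datastores_auth_mode(ml_client, "test-ws1", "accesskey")
```

Validating the value before the call avoids sending an update with a mistyped mode.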
To configure the workspace to use a shared key again, use the az ml workspace update command and specify --system-datastores-auth-mode accesskey. The following example demonstrates updating a workspace named myworkspace:
az ml workspace update --name myworkspace --system-datastores-auth-mode accesskey
In the following JSON template example, substitute your own values for the following placeholders:
- [workspace name]
- [workspace friendly name]
- [Storage Account ARM resource ID]
- [Key Vault ARM resource ID]
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"resources":
[
{
"type": "Microsoft.MachineLearningServices/workspaces",
"apiVersion": "2024-04-01",
"name": "[workspace name]",
"location": "[resourceGroup().location]",
"sku":
{
"name": "Basic",
"tier": "Basic"
},
"kind": "Default",
"identity":
{
"type": "SystemAssigned"
},
"properties":
{
"friendlyName": "[workspace friendly name]",
"storageAccount": "[Storage Account ARM resource ID]",
"keyVault": "[Key Vault ARM resource ID]",
"systemDatastoresAuthMode": "accesskey",
"managedNetwork":
{
"isolationMode": "Disabled"
},
"publicNetworkAccess": "Enabled"
}
}
]
}
For information on deploying an ARM template, use one of the following articles:
After reverting the workspace, make sure that shared key access is enabled on the storage account, because the workspace needs the key again. For more information about configuring shared key access, visit the Prevent shared key authorization for an Azure Storage account article.
Scenarios for role assignments
To work with a storage account with disabled shared key access, you might need to grant more roles to either your users or the managed identity for your hub. Hubs have a system-assigned managed identity by default. However, some scenarios require a user-assigned managed identity. This table summarizes the scenarios that require extra role assignments:
| Scenario | Microsoft Entra ID identity | Required roles | Notes |
|---|---|---|---|
| Managed online endpoint | System-assigned managed identity | Storage Blob Data Contributor | Automatically assigned the role when provisioned. Don't manually change this role assignment. |
| Monitoring (evaluating model quality/performance) | User-assigned managed identity | Storage Blob Data Contributor | If an existing user-assigned managed identity is presently used by the workspace, verify that it's assigned the Storage Blob Data Contributor role. The user-assigned managed identity is in addition to the system-assigned managed identity for your workspace. For information about how to add the managed identity to the workspace, visit Add a user-assigned managed identity. |
| Model Registry and MLflow | User-assigned managed identity | Storage Blob Data Contributor | Create a compute cluster that uses the user-assigned managed identity (UAMI). For a job with a model as input or output, create a separate UAMI, assign it the Storage Blob Data Contributor role on the underlying storage account, and associate that UAMI with the compute cluster when you create it. To register a model from local files, the user needs the Storage Blob Data Contributor role on the underlying storage account. Model package scenarios have known issues and aren't supported at this time. |
| Parallel Run Step (PRS) | User-assigned managed identity | Storage Table Data Contributor, Storage Queue Data Contributor | |
| Data Labeling | User's identity | Storage Blob Data Contributor | |
| Studio: create datasets, browse data | User's identity | Storage Blob Data Contributor | |
| Compute Instance | User's identity | Storage File Data Privileged Contributor | |
| Studio: notebooks | User's identity | Storage File Data Privileged Contributor | |
| Studio: notebook's file explorer | User's identity | Storage File Data Privileged Contributor | |
| Prompt flow | User's identity | Storage Blob Data Contributor, Storage File Data Privileged Contributor | |
| Data: datastores and datasets | User's identity | Storage Blob Data Contributor | |
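To plan role assignments for a team, the table above can be encoded as a lookup. This is an illustrative sketch only: the dictionary keys paraphrase the scenario names in the table, and REQUIRED_ROLES and roles_needed are hypothetical names, not part of any Azure SDK.

```python
# Each scenario maps to (identity type, required roles), as in the table above.
REQUIRED_ROLES = {
    "Managed online endpoint": (
        "System-assigned managed identity", ["Storage Blob Data Contributor"]),
    "Monitoring": (
        "User-assigned managed identity", ["Storage Blob Data Contributor"]),
    "Model Registry and MLflow": (
        "User-assigned managed identity", ["Storage Blob Data Contributor"]),
    "Parallel Run Step (PRS)": (
        "User-assigned managed identity",
        ["Storage Table Data Contributor", "Storage Queue Data Contributor"]),
    "Data Labeling": ("User's identity", ["Storage Blob Data Contributor"]),
    "Studio: create datasets, browse data": (
        "User's identity", ["Storage Blob Data Contributor"]),
    "Compute Instance": (
        "User's identity", ["Storage File Data Privileged Contributor"]),
    "Studio: notebooks": (
        "User's identity", ["Storage File Data Privileged Contributor"]),
    "Studio: notebook's file explorer": (
        "User's identity", ["Storage File Data Privileged Contributor"]),
    "Prompt flow": (
        "User's identity",
        ["Storage Blob Data Contributor",
         "Storage File Data Privileged Contributor"]),
    "Data: datastores and datasets": (
        "User's identity", ["Storage Blob Data Contributor"]),
}


def roles_needed(scenarios, identity_type="User's identity"):
    """Return the sorted set of roles that `identity_type` needs for `scenarios`."""
    roles = set()
    for scenario in scenarios:
        identity, scenario_roles = REQUIRED_ROLES[scenario]
        if identity == identity_type:
            roles.update(scenario_roles)
    return sorted(roles)


print(roles_needed(["Studio: notebooks", "Data Labeling"]))
```

A scenario name that isn't in the dictionary raises a KeyError, which makes typos visible early.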
Limitations
- Creating a compute instance with a system-assigned managed identity isn't supported for an identity-based workspace. If the workspace's storage account access type is identity-based, compute instances currently can't use a system-assigned identity to mount the datastore. Instead, create the compute instance with a user-assigned identity, and make sure that the user-assigned identity has the Storage File Data Privileged Contributor role on the storage account.
Related content