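Upgrade datastore management to SDK v2
Azure Machine Learning datastores securely keep the connection information for your data storage on Azure, so you don't have to code it in your scripts. The V2 datastore concept remains mostly unchanged compared with V1. The difference is that SQL-like data sources are not supported through AzureML datastores. They will instead be supported through the AzureML data import and export functionality.
This article compares scenarios in SDK v1 and SDK v2.
Create a datastore from an Azure Blob container via account_key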
SDK v1
import os

from azureml.core import Datastore, Workspace

ws = Workspace.from_config()  # Assumes a workspace config.json is available

blob_datastore_name = 'azblobsdk'  # Name of the datastore in the workspace
container_name = os.getenv("BLOB_CONTAINER", "<my-container-name>")  # Name of the Azure blob container
account_name = os.getenv("BLOB_ACCOUNTNAME", "<my-account-name>")  # Storage account name
account_key = os.getenv("BLOB_ACCOUNT_KEY", "<my-account-key>")  # Storage account access key

blob_datastore = Datastore.register_azure_blob_container(workspace=ws,
                                                         datastore_name=blob_datastore_name,
                                                         container_name=container_name,
                                                         account_name=account_name,
                                                         account_key=account_key)
SDK v2
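from azure.ai.ml import MLClient
from azure.ai.ml.entities import AccountKeyConfiguration, AzureBlobDatastore
from azure.identity import DefaultAzureCredential

# from_config needs a credential; DefaultAzureCredential is one common choice
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

store = AzureBlobDatastore(
    name="blob-protocol-example",
    description="Datastore pointing to a blob container using wasbs protocol.",
    account_name="mytestblobstore",
    container_name="data-container",
    protocol="wasbs",
    # Credentials are passed as a credential entity rather than a plain dict
    credentials=AccountKeyConfiguration(
        account_key="XXXxxxXXXxXXXXxxXXXXXxXXXXXxXxxXxXXXxXXXxXXxxxXXxxXXXxXxXXXxxXxxXXXXxxxxxXXxxxxxxXXXxXXX"
    ),
)

ml_client.create_or_update(store)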
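The v1 example reads the container name, account name, and account key from environment variables, whereas the v2 example hard-codes them. The following is a minimal sketch of carrying the environment-variable pattern over to v2. The helper name `blob_settings_from_env` is hypothetical and not part of either SDK; the variable names are the ones used in the v1 example.

```python
import os
from typing import Mapping, Optional


def blob_settings_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect blob datastore settings from environment variables.

    Hypothetical helper, not part of azure-ai-ml. Raises KeyError naming
    every missing variable so that a misconfigured shell fails early.
    """
    env = os.environ if env is None else env
    # Field name used by the datastore -> environment variable from the v1 example
    names = {
        "container_name": "BLOB_CONTAINER",
        "account_name": "BLOB_ACCOUNTNAME",
        "account_key": "BLOB_ACCOUNT_KEY",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {field: env[var] for field, var in names.items()}
```

The returned `account_name` and `container_name` can then be passed to `AzureBlobDatastore`, and `account_key` to its credentials, so that no secret appears in the script itself.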
Create a datastore from an Azure Blob container via sas_token
SDK v1
import os

from azureml.core import Datastore, Workspace

ws = Workspace.from_config()  # Assumes a workspace config.json is available

blob_datastore_name = 'azblobsdk'  # Name of the datastore in the workspace
container_name = os.getenv("BLOB_CONTAINER", "<my-container-name>")  # Name of the Azure blob container
sas_token = os.getenv("BLOB_SAS_TOKEN", "<my-sas-token>")  # SAS token

blob_datastore = Datastore.register_azure_blob_container(workspace=ws,
                                                         datastore_name=blob_datastore_name,
                                                         container_name=container_name,
                                                         sas_token=sas_token)
SDK v2
from azure.ai.ml import MLClient
from azure.ai.ml.entities import AzureBlobDatastore, SasTokenConfiguration
from azure.identity import DefaultAzureCredential

# from_config needs a credential; DefaultAzureCredential is one common choice
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

store = AzureBlobDatastore(
    name="blob-sas-example",
    description="Datastore pointing to a blob container using SAS token.",
    account_name="mytestblobstore",
    container_name="data-container",
    # SasTokenConfiguration is the SAS credential entity exported by azure.ai.ml.entities
    credentials=SasTokenConfiguration(
        sas_token="?xx=XXXX-XX-XX&xx=xxxx&xxx=xxx&xx=xxxxxxxxxxx&xx=XXXX-XX-XXXXX:XX:XXX&xx=XXXX-XX-XXXXX:XX:XXX&xxx=xxxxx&xxx=XXxXXXxxxxxXXXXXXXxXxxxXXXXXxxXXXXXxXXXXxXXXxXXxXX"
    ),
)

ml_client.create_or_update(store)
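A SAS token is a query string, and an expired token only shows up as a failure when the datastore is used. As an optional pre-check before registering the datastore, the sketch below reads the expiry from the token's `se` (signed expiry) field using only the standard library. The helper names `sas_expiry` and `sas_is_expired` are hypothetical, and the sketch assumes the expiry is an ISO 8601 timestamp in UTC.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import parse_qs


def sas_expiry(sas_token: str) -> Optional[datetime]:
    """Return the expiry from a SAS token's `se` field, or None if it is absent."""
    fields = parse_qs(sas_token.lstrip("?"))
    values = fields.get("se")
    if not values:
        return None
    raw = values[0]
    # Older Python versions reject a trailing "Z" in fromisoformat, so normalise it
    if raw.endswith("Z"):
        raw = raw[:-1] + "+00:00"
    expiry = datetime.fromisoformat(raw)
    # Treat a value without an offset as UTC (assumption of this sketch)
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry


def sas_is_expired(sas_token: str, now: Optional[datetime] = None) -> bool:
    """True if the token carries an expiry that is not in the future."""
    now = now or datetime.now(timezone.utc)
    expiry = sas_expiry(sas_token)
    return expiry is not None and expiry <= now
```

A token without an `se` field is reported as not expired, because this sketch cannot tell when such a token stops being valid.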
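Create a datastore from an Azure Blob container via identity-based authentication
SDK v1
from azureml.core import Datastore, Workspace

ws = Workspace.from_config()  # Assumes a workspace config.json is available

# No account key or SAS token is passed, so identity-based access is used
blob_datastore = Datastore.register_azure_blob_container(workspace=ws,
                                                         datastore_name='credentialless_blob',
                                                         container_name='my_container_name',
                                                         account_name='my_account_name')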
SDK v2
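from azure.ai.ml import MLClient
from azure.ai.ml.entities import AzureBlobDatastore
from azure.identity import DefaultAzureCredential

# from_config needs a credential; DefaultAzureCredential is one common choice
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# No credentials argument is passed, so identity-based access is used.
# The placeholder values mirror the v1 example; replace them with your own.
store = AzureBlobDatastore(
    name="credentialless_blob",
    description="Credential-less datastore pointing to a blob container.",
    account_name="my_account_name",
    container_name="my_container_name",
)

ml_client.create_or_update(store)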
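The three blob scenarios above differ only in which secret, if any, is supplied: an account key, a SAS token, or nothing for identity-based access. The sketch below captures that decision in one place. The function `datastore_auth_mode` is hypothetical, and rejecting a call that supplies both secrets is a design choice of this sketch rather than documented SDK behavior.

```python
from typing import Optional


def datastore_auth_mode(account_key: Optional[str] = None,
                        sas_token: Optional[str] = None) -> str:
    """Name the authentication mode implied by the secrets that were supplied."""
    if account_key and sas_token:
        # Ambiguous input: refuse rather than silently pick one (sketch's choice)
        raise ValueError("pass either account_key or sas_token, not both")
    if account_key:
        return "account_key"
    if sas_token:
        return "sas_token"
    # Neither secret supplied: fall back to identity-based access
    return "identity"
```

A caller can branch on the returned string to decide whether to build an account key credential, a SAS credential, or no credential at all before creating the datastore.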
Get datastores from your workspace
SDK v1
from azureml.core import Datastore, Workspace

ws = Workspace.from_config()  # Assumes a workspace config.json is available

# Get a named datastore from the current workspace
datastore = Datastore.get(ws, datastore_name='your datastore name')

# List all datastores registered in the current workspace
datastores = ws.datastores
for name, datastore in datastores.items():
    print(name, datastore.datastore_type)
SDK v2
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Enter details of your AzureML workspace
subscription_id = '<SUBSCRIPTION_ID>'
resource_group = '<RESOURCE_GROUP>'
workspace_name = '<AZUREML_WORKSPACE_NAME>'

ml_client = MLClient(credential=DefaultAzureCredential(),
                     subscription_id=subscription_id,
                     resource_group_name=resource_group,
                     workspace_name=workspace_name)  # Datastores are scoped to a workspace

# Get a named datastore from the workspace
datastore = ml_client.datastores.get(name='your datastore name')
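The v1 example also lists every datastore registered in the workspace, which the v2 example does not show. A minimal sketch of a v2 counterpart follows, assuming `ml_client.datastores.list()` and the `name` and `type` attributes of the returned datastore entities. The helper name `list_datastore_types` is hypothetical; it takes the client as a parameter so that it can be exercised without a live workspace.

```python
from typing import Any, List, Tuple


def list_datastore_types(ml_client: Any) -> List[Tuple[str, str]]:
    """Return (name, type) pairs for every datastore visible to the client.

    `ml_client` is expected to be an azure.ai.ml.MLClient bound to a workspace.
    """
    return [(ds.name, str(ds.type)) for ds in ml_client.datastores.list()]
```

With the `ml_client` created above, `for name, kind in list_datastore_types(ml_client): print(name, kind)` prints one line per datastore, similar to the v1 loop over `ws.datastores`.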
Mapping of key functionality in SDK v1 and SDK v2
Storage types in SDK v1 | Storage types in SDK v2 |
---|---|
azureml_blob_datastore | azureml_blob_datastore |
azureml_data_lake_gen2_datastore | azureml_data_lake_gen2_datastore |
azureml_sql_database_datastore | Will be supported via import and export functionality |
azureml_my_sql_datastore | Will be supported via import and export functionality |
azureml_postgre_sql_datastore | Will be supported via import and export functionality |
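When auditing an existing v1 workspace, the mapping table can be encoded as a small lookup so that SQL-like datastores are flagged before migration. This is only a sketch: the names `V1_TO_V2_DATASTORE_TYPE` and `v2_datastore_type` are hypothetical, and the SQL entries are assumed to use the same `azureml_` prefix as the other rows.

```python
from typing import Dict, Optional

# v1 storage type -> v2 storage type, or None when v2 offers no datastore
# equivalent and the source is to be handled by import and export instead
V1_TO_V2_DATASTORE_TYPE: Dict[str, Optional[str]] = {
    "azureml_blob_datastore": "azureml_blob_datastore",
    "azureml_data_lake_gen2_datastore": "azureml_data_lake_gen2_datastore",
    "azureml_sql_database_datastore": None,
    "azureml_my_sql_datastore": None,
    "azureml_postgre_sql_datastore": None,
}


def v2_datastore_type(v1_type: str) -> Optional[str]:
    """Look up the v2 storage type for a v1 storage type.

    Returns None for SQL-like sources and raises KeyError for types
    that the mapping table does not cover.
    """
    if v1_type not in V1_TO_V2_DATASTORE_TYPE:
        raise KeyError("storage type not covered by the mapping table: " + v1_type)
    return V1_TO_V2_DATASTORE_TYPE[v1_type]
```

A return value of `None` signals that the datastore cannot be re-created as a v2 datastore and needs the import and export route instead.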
Next steps
For more information, see: