Export or delete your Machine Learning service workspace data

APPLIES TO: Basic edition, Enterprise edition (Upgrade to Enterprise edition)

In Azure Machine Learning, you can export or delete your workspace data using either the portal's graphical interface or the Python SDK. This article describes both options.

Note

If you're interested in viewing or deleting personal data, see the Azure Data Subject Requests for the GDPR article. For general information about GDPR, see the GDPR section of the Service Trust portal.

Note

This article provides steps for deleting personal data from the device or service and can be used to support your obligations under the GDPR.

Control your workspace data

In-product data stored by Azure Machine Learning is available for export and deletion. You can export and delete it using Azure Machine Learning studio, the CLI, and the SDK. Telemetry data can be accessed through the Azure Privacy portal.

In Azure Machine Learning, personal data consists of user information in run history documents.

Delete high-level resources using the portal

When you create a workspace, Azure creates a number of resources within the resource group:

  • The workspace itself
  • A storage account
  • A container registry
  • An Application Insights instance
  • A key vault

You can delete these resources by selecting them from the list and choosing Delete.

Screenshot of the portal with the Delete icon highlighted

Run history documents, which may contain personal user information, are stored in the storage account's blob storage, in subfolders of /azureml. You can download and delete the data from the portal.

Screenshot of the azureml directory in the storage account, in the portal
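The portal steps above can also be scripted. The following is a minimal sketch, not the documented procedure: it assumes the azure-storage-blob package (version 12 or later) and a storage connection string, and the helper names `select_run_blobs` and `export_and_delete`, the `BLOB_CONTAINER`, `BLOB_ROOT`, and `RUN_ID` environment variables, and the idea that a run's blobs contain its run ID in their path are all illustrative assumptions. Check the actual layout of your storage account before deleting anything.

```python
from typing import Iterable, List


def select_run_blobs(blob_names: Iterable[str], run_id: str, root: str = "") -> List[str]:
    """Return blob names under `root` whose path contains `run_id`.

    Assumption: a run's documents carry the run ID somewhere in their path.
    """
    return [n for n in blob_names if n.startswith(root) and run_id in n]


def export_and_delete(container, blob_names: Iterable[str], delete: bool = False) -> dict:
    """Read each blob's bytes and, only if `delete` is True, remove the blob.

    `container` is expected to behave like azure.storage.blob.ContainerClient.
    """
    exported = {}
    for name in blob_names:
        exported[name] = container.download_blob(name).readall()
        if delete:
            container.delete_blob(name)
    return exported


if __name__ == "__main__":
    import os

    conn_str = os.environ.get("AZURE_STORAGE_CONNECTION_STRING")
    if conn_str:  # only contact Azure when credentials are supplied
        from azure.storage.blob import ContainerClient

        # BLOB_CONTAINER, BLOB_ROOT and RUN_ID are hypothetical variable names.
        container = ContainerClient.from_connection_string(
            conn_str, container_name=os.environ["BLOB_CONTAINER"]
        )
        names = [blob.name for blob in container.list_blobs()]
        picked = select_run_blobs(
            names, os.environ["RUN_ID"], root=os.environ.get("BLOB_ROOT", "")
        )
        data = export_and_delete(container, picked, delete=False)
        print(f"exported {len(data)} blobs")
```

Keeping `delete=False` by default means a first pass only exports; deletion has to be requested explicitly once the selected blob list has been reviewed.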

Export and delete machine learning resources using Azure Machine Learning studio

Azure Machine Learning studio provides a unified view of your machine learning resources, such as notebooks, datasets, models, and experiments. The studio emphasizes preserving a record of your data and experiments. Computational resources such as pipelines and compute resources can be deleted using the browser. For these resources, navigate to the resource in question and choose Delete.

Datasets can be unregistered and experiments can be archived, but these operations don't delete the data. To remove the data entirely, you must delete the datasets and run data at the storage level. Deleting at the storage level is done using the portal, as described previously.

You can download training artifacts from experiment runs using the studio. Choose the Experiment and Run you are interested in. Choose Output + logs and navigate to the specific artifacts you want to download. Choose ... and then Download.

You can download a registered model by navigating to the desired Model and choosing Download.

Screenshot of the studio model page with the Download option highlighted

Export and delete resources using the Python SDK

You can download the outputs of a particular run using:

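from azureml.core import Workspace

# Load the workspace (assumes a local config.json for your workspace)
ws = Workspace.from_config()

# Run ID retrieved from the Azure Machine Learning web UI
run_id = 'aaaaaaaa-bbbb-cccc-dddd-0123456789AB'
experiment = ws.experiments['my-experiment']
# Find the run with the matching ID among the experiment's runs
run = next(run for run in experiment.get_runs() if run.id == run_id)
metrics_output_port = run.get_pipeline_output('metrics_output')
model_output_port = run.get_pipeline_output('model_output')

# Download both pipeline outputs to the current directory
metrics_output_port.download('.', show_progress=True)
model_output_port.download('.', show_progress=True)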
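The studio downloads described earlier (run artifacts and registered models) have SDK counterparts. The sketch below wraps `Run.download_files`, `Run.get_file_names`, and `Model.download` from azureml-core. The helper names `export_run_files` and `export_model`, the `AZUREML_MODEL_NAME` environment variable, and the `./export` directory are illustrative assumptions, not part of the SDK.

```python
def export_run_files(run, output_directory, prefix=None):
    """Download the files recorded for a run and return their names.

    `run` is expected to behave like azureml.core.Run.
    """
    run.download_files(prefix=prefix, output_directory=output_directory)
    return [
        name for name in run.get_file_names()
        if prefix is None or name.startswith(prefix)
    ]


def export_model(model, target_dir):
    """Download a registered model and return the local path.

    `model` is expected to behave like azureml.core.Model.
    """
    return model.download(target_dir=target_dir, exist_ok=True)


if __name__ == "__main__":
    import os

    # AZUREML_MODEL_NAME is a hypothetical opt-in variable for this sketch.
    model_name = os.environ.get("AZUREML_MODEL_NAME")
    if model_name:  # only contact Azure when a model name is supplied
        from azureml.core import Model, Workspace

        ws = Workspace.from_config()  # assumes a local config.json
        path = export_model(Model(ws, name=model_name), "./export")
        print(f"model downloaded to {path}")
```

With a real run object, `export_run_files(run, './export', prefix='outputs/')` would limit the download to files whose names start with `outputs/`; that prefix is only an example.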

The following machine learning resources can be deleted using the Python SDK:

Type            Function call   Notes
Workspace       delete          Pass delete_dependent_resources=True to cascade the delete
Model           delete
ComputeTarget   delete
Webservice      delete
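As a rough illustration of the delete calls listed above, the sketch below calls `delete()` on each resource and passes `delete_dependent_resources` to `Workspace.delete`. The helper names, the dry-run default, and the `AZUREML_CONFIRM_DELETE` environment variable are assumptions made for this example; the ordering (web services before models) is a precaution, since a model still referenced by a service may fail to delete.

```python
def delete_resources(resources, dry_run=True):
    """Call delete() on each resource unless dry_run is set.

    Returns the names of the resources that were (or would be) deleted.
    """
    handled = []
    for resource in resources:
        if not dry_run:
            resource.delete()
        handled.append(getattr(resource, "name", repr(resource)))
    return handled


def delete_workspace(ws, cascade=False, dry_run=True):
    """Delete the workspace; with cascade=True, dependent resources go too."""
    if not dry_run:
        ws.delete(delete_dependent_resources=cascade)
    return cascade


if __name__ == "__main__":
    import os

    # AZUREML_CONFIRM_DELETE is a hypothetical opt-in variable for this sketch.
    if os.environ.get("AZUREML_CONFIRM_DELETE") == "yes":
        from azureml.core import Workspace

        ws = Workspace.from_config()  # assumes a local config.json
        # Services first, then models, then compute targets.
        targets = (
            list(ws.webservices.values())
            + list(ws.models.values())
            + list(ws.compute_targets.values())
        )
        print(delete_resources(targets, dry_run=False))
        delete_workspace(ws, cascade=True, dry_run=False)
```

Running the helpers with `dry_run=True` first gives a list of what would be removed without touching the workspace.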