Export Data module

This article describes a module in Azure Machine Learning designer (preview).

Use this module to save results, intermediate data, and working data from your pipelines to cloud storage destinations.

This module supports exporting your data to the following cloud data services:

  • Azure Blob Container
  • Azure File Share
  • Azure Data Lake Storage Gen2
  • Azure SQL database

Before exporting your data, you must first register a datastore in your Azure Machine Learning workspace. For more information, see Access data in Azure storage services.
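
Registration can also be scripted. The following is a minimal sketch, assuming the v1 Python SDK (the `azureml-core` package) and an Azure Blob container as the target. The helper name `register_blob_datastore` and all argument values in the usage comment are hypothetical placeholders, not names from this article.

```python
def register_blob_datastore(workspace, datastore_name, container_name,
                            account_name, account_key):
    """Register an Azure Blob container as a datastore in the given workspace.

    Hypothetical helper for illustration. `workspace` is expected to be an
    azureml.core.Workspace object.
    """
    # Imported here so the sketch can be read and loaded without the SDK;
    # calling the function requires the azureml-core package.
    from azureml.core import Datastore

    return Datastore.register_azure_blob_container(
        workspace=workspace,
        datastore_name=datastore_name,
        container_name=container_name,
        account_name=account_name,
        account_key=account_key,
    )


# Example usage (placeholder values, requires a workspace config file):
#
#   from azureml.core import Workspace
#   ws = Workspace.from_config()
#   register_blob_datastore(ws, "export_store", "my-container",
#                           "mystorageaccount", "<account-key>")
```

Once registered, the datastore should appear in the Datastore dropdown list of the Export Data module.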

How to configure Export Data

  1. Add the Export Data module to your pipeline in the designer. You can find this module in the Input and Output category.

  2. Connect Export Data to the module that contains the data you want to export.

  3. Select Export Data to open the Properties pane.

  4. For Datastore, select an existing datastore from the dropdown list. You can also create a new datastore; for instructions, see Access data in Azure storage services.

    Note

    Exporting data of one data type to a SQL database column defined with a different data type is not supported. The target table does not need to exist beforehand.

  5. The Regenerate output checkbox determines whether the module is executed to regenerate its output at run time.

    It is unselected by default, which means that if the module has previously been executed with the same parameters, the system reuses the output from the last run to reduce run time.

    If it is selected, the system executes the module again to regenerate the output.

  6. Define the path in the datastore where the data will be stored. The path must be a relative path. Empty paths and URL paths are not allowed.

  7. For File format, select the format in which the data should be stored.

  8. Submit the pipeline.
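
The path rules in the steps above (relative, not empty, not a URL) can be checked before submitting a pipeline. The following is a minimal sketch of such a check; the function name `is_valid_export_path` is hypothetical, and treating a leading slash as "not relative" is an assumption rather than documented behavior of the module.

```python
def is_valid_export_path(path: str) -> bool:
    """Return True if `path` looks like an acceptable export path.

    Illustrative check only; the designer performs its own validation.
    """
    # Empty or whitespace-only paths are not allowed.
    if not path or not path.strip():
        return False
    # URL paths such as "https://account.blob.core.windows.net/..." are not allowed.
    if "://" in path:
        return False
    # Assumption: a leading slash marks an absolute path, so it is rejected.
    if path.startswith("/"):
        return False
    return True


if __name__ == "__main__":
    for candidate in ["output/results.csv", "", "https://example.com/data.csv"]:
        print(repr(candidate), is_valid_export_path(candidate))
```

A path such as `output/results.csv` passes this check, while an empty string or a full storage URL does not.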

Next steps

See the set of modules available to Azure Machine Learning.