Move files with Azure Data Factory

APPLIES TO: Azure Data Factory. Does not apply to Azure Synapse Analytics (Preview).

The ADF copy activity has built-in support for the "move" scenario when copying binary files between storage stores. To enable it, set "deleteFilesAfterCompletion" to true in the copy activity. The copy activity then deletes the files from the source data store after the job completes.
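As a rough illustration of the built-in option, the sketch below builds a copy activity definition as a Python dictionary and prints it as JSON. The placement of "deleteFilesAfterCompletion" under the source's storeSettings, the Azure Blob read/write settings types, and the activity and dataset names are assumptions made for this example; check the documentation of the connector you use before relying on them.

```python
import json


def binary_move_copy_activity(name, source_dataset, sink_dataset):
    """Build a Copy activity definition (as a dict) that deletes source files after copying.

    Assumption: the flag is set in the source's storeSettings, and both
    datasets are Binary datasets on Azure Blob Storage.
    """
    return {
        "name": name,
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {
                "type": "BinarySource",
                "storeSettings": {
                    "type": "AzureBlobStorageReadSettings",  # assumed store type
                    "recursive": True,
                    "deleteFilesAfterCompletion": True,  # turns the copy into a move
                },
            },
            "sink": {
                "type": "BinarySink",
                "storeSettings": {"type": "AzureBlobStorageWriteSettings"},  # assumed store type
            },
        },
    }


# Hypothetical activity and dataset names, used only for illustration.
activity = binary_move_copy_activity(
    "MoveBinaryFiles", "SourceBinaryDataset", "DestinationBinaryDataset"
)
print(json.dumps(activity, indent=2))
```

The printed JSON can be compared with the copy activity definition in the pipeline's code view.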

This article describes a solution template that achieves the same scenario in another way, by combining ADF's flexible control flow with the copy activity and the delete activity. A common scenario for this template: files are continually dropped into a landing folder of your source store. By creating a schedule trigger, the ADF pipeline can periodically move those files from the source store to the destination store. The pipeline "moves files" by getting the files from the landing folder, copying each of them to another folder on the destination store, and then deleting the same files from the landing folder on the source store.

Note

Be aware that this template is designed to move files rather than folders. You can move a folder by changing the dataset so that it contains a folder path only, and then having the copy activity and the delete activity reference that same dataset, but you need to be very careful if you do. You have to make sure that no new files arrive in the folder between the copy operation and the delete operation. If a new file arrives after the copy activity has completed its job but before the Delete activity has started, the Delete activity may delete that new file, which has not yet been copied to the destination, when it deletes the entire folder.
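The risk described in this note can be reproduced on a local file system. The following is a minimal simulation, not ADF code: the folder and file names are hypothetical, and shutil.rmtree stands in for a Delete activity that points at a whole folder.

```python
import shutil
import tempfile
from pathlib import Path


def copy_folder(src, dst):
    """Copy every file that is currently in src to dst (non-recursive)."""
    dst.mkdir(parents=True, exist_ok=True)
    for item in sorted(src.iterdir()):
        if item.is_file():
            shutil.copy2(item, dst / item.name)


def demo_folder_level_race():
    """Return (files in destination, files lost) for a folder-level move with a late arrival."""
    with tempfile.TemporaryDirectory() as root:
        landing = Path(root) / "landing"
        destination = Path(root) / "destination"
        landing.mkdir()
        (landing / "a.bin").write_bytes(b"a")

        copy_folder(landing, destination)  # the copy step finishes here

        # A new file lands after the copy finished but before the delete starts.
        (landing / "late.bin").write_bytes(b"late")
        arrived = {p.name for p in landing.iterdir()}

        shutil.rmtree(landing)  # folder-level delete removes everything, copied or not

        in_destination = sorted(p.name for p in destination.iterdir())
        lost = sorted(arrived - set(in_destination))
        return in_destination, lost


in_destination, lost = demo_folder_level_race()
print("copied:", in_destination)  # copied: ['a.bin']
print("lost:", lost)              # lost: ['late.bin']
```

Deleting only the files that were actually copied, one by one, avoids this loss, which is why the template works at file level.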

About this solution template

This template gets the files from your file-based source store. It then moves each of them to the destination store.

The template contains five activities:

  • GetMetadata gets the list of objects, including files and subfolders, from your folder on the source store. It does not retrieve the objects recursively.
  • Filter filters the object list from the GetMetadata activity to select the files only.
  • ForEach gets the file list from the Filter activity, iterates over the list, and passes each file to the Copy activity and the Delete activity.
  • Copy copies one file from the source to the destination store.
  • Delete deletes that same file from the source store.
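The control flow of these five activities can be sketched against a local file system. This is only an analogy under stated assumptions: the function name move_files and the folder layout are hypothetical, pathlib and shutil stand in for the ADF activities, and no ADF API is involved.

```python
import shutil
import tempfile
from pathlib import Path


def move_files(source_folder, destination_folder):
    """Move the files directly under source_folder, one file at a time."""
    destination_folder.mkdir(parents=True, exist_ok=True)

    # GetMetadata: list files and subfolders, without recursing.
    children = sorted(source_folder.iterdir())

    # Filter: keep files only.
    files = [child for child in children if child.is_file()]

    moved = []
    # ForEach: pass each file to the copy step and then the delete step.
    for file_path in files:
        shutil.copy2(file_path, destination_folder / file_path.name)  # Copy
        file_path.unlink()                                            # Delete
        moved.append(file_path.name)
    return moved


def demo():
    """Run move_files on a temporary folder tree and report what ended up where."""
    with tempfile.TemporaryDirectory() as root:
        landing = Path(root) / "landing"
        destination = Path(root) / "destination"
        (landing / "sub").mkdir(parents=True)
        (landing / "a.bin").write_bytes(b"a")
        (landing / "b.bin").write_bytes(b"b")
        (landing / "sub" / "c.bin").write_bytes(b"c")

        moved = move_files(landing, destination)
        left_in_source = sorted(p.name for p in landing.iterdir())
        in_destination = sorted(p.name for p in destination.iterdir())
        return moved, left_in_source, in_destination


moved, left_in_source, in_destination = demo()
print(moved)           # ['a.bin', 'b.bin']
print(left_in_source)  # ['sub']
print(in_destination)  # ['a.bin', 'b.bin']
```

Because the listing is not recursive and subfolders are filtered out, the file inside sub stays where it is, which matches the behavior described for GetMetadata and Filter.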

The template defines four parameters:

  • SourceStore_Location is the folder path of the source store that you want to move files from.
  • SourceStore_Directory is the subfolder path of the source store that you want to move files from.
  • DestinationStore_Location is the folder path of the destination store that you want to move files to.
  • DestinationStore_Directory is the subfolder path of the destination store that you want to move files to.
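For the scheduled scenario, a schedule trigger can supply these four parameters on every run. The sketch below assembles such a trigger definition as a Python dictionary and prints it as JSON. The trigger name, pipeline name, start time, 15-minute interval, and all parameter values are hypothetical, and the layout follows the usual shape of a schedule trigger definition, so compare it with the JSON that your own factory generates.

```python
import json

# The four parameter names defined by the template.
PIPELINE_PARAMETER_NAMES = [
    "SourceStore_Location",
    "SourceStore_Directory",
    "DestinationStore_Location",
    "DestinationStore_Directory",
]


def schedule_trigger(trigger_name, pipeline_name, parameters, interval_minutes=15):
    """Build a schedule trigger definition that runs the pipeline with the given parameters."""
    missing = [name for name in PIPELINE_PARAMETER_NAMES if name not in parameters]
    if missing:
        raise ValueError("missing pipeline parameters: " + ", ".join(missing))
    return {
        "name": trigger_name,
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                "recurrence": {
                    "frequency": "Minute",
                    "interval": interval_minutes,
                    "startTime": "2024-01-01T00:00:00Z",  # hypothetical start time
                    "timeZone": "UTC",
                }
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "type": "PipelineReference",
                        "referenceName": pipeline_name,
                    },
                    "parameters": dict(parameters),
                }
            ],
        },
    }


# Hypothetical names and values, used only for illustration.
trigger = schedule_trigger(
    "MoveFilesEvery15Minutes",
    "MoveFiles",
    {
        "SourceStore_Location": "landing",
        "SourceStore_Directory": "incoming",
        "DestinationStore_Location": "archive",
        "DestinationStore_Directory": "processed",
    },
)
print(json.dumps(trigger, indent=2))
```

The function rejects a parameter set that omits any of the four names, so a misconfigured trigger fails when it is built rather than at run time.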

How to use this solution template

  1. Go to the Move files template. Select an existing connection or create a New connection to the source file store that you want to move files from. Be aware that DataSource_Folder and DataSource_File reference the same connection to your source file store.

    Create a new connection to the source

  2. Select an existing connection or create a New connection to the destination file store that you want to move files to.

    Create a new connection to the destination

  3. Select the Use this template tab.

  4. You'll see the pipeline, as in the following example:

    Show the pipeline

  5. Select Debug, enter the Parameters, and then select Finish. The parameters are the folder path that you want to move files from and the folder path that you want to move files to.

    Run the pipeline

  6. Review the result.

    Review the result

Next steps