Get Metadata activity in Azure Data Factory

Applies to: Azure Data Factory, Azure Synapse Analytics

You can use the Get Metadata activity to retrieve the metadata of any data in Azure Data Factory. You can use its output in conditional expressions to perform validation, or consume the metadata in subsequent activities.

Supported capabilities

The Get Metadata activity takes a dataset as input and returns metadata information as output. Currently, the following connectors and their corresponding retrievable metadata are supported. The maximum size of returned metadata is 4 MB.

Supported connectors

File storage

Connector/Metadata itemName
Amazon S3 √/√ √/√ x/x √/√ x √/√
Google Cloud Storage √/√ √/√ x/x √/√ x √/√
Azure Blob storage √/√ √/√ x/x √/√ √/√
Azure Data Lake Storage Gen2 √/√ √/√ x/x √/√ √/√
Azure Files √/√ √/√ √/√ √/√ x √/√
File system √/√ √/√ √/√ √/√ x √/√
SFTP √/√ √/√ x/x √/√ x √/√
FTP √/√ √/√ x/x x/x x √/√

1 Metadata lastModified:

  • For Amazon S3 and Google Cloud Storage, lastModified applies to the bucket and the key but not to the virtual folder, and exists applies to the bucket and the key but not to the prefix or virtual folder.
  • For Azure Blob storage, lastModified applies to the container and the blob but not to the virtual folder.

2 Metadata structure and columnCount are not supported when getting metadata from Binary, JSON, or XML files.

3 Metadata exists: For Amazon S3 and Google Cloud Storage, exists applies to the bucket and the key but not to the prefix or virtual folder.

Note the following:

  • When you use the Get Metadata activity against a folder, make sure you have LIST/EXECUTE permission on that folder.

  • Wildcard filters on folders and files are not supported for the Get Metadata activity.

  • modifiedDatetimeStart and modifiedDatetimeEnd filter set on connector:

    • These two properties are used to filter the child items when getting metadata from a folder. They do not apply when getting metadata from a file.
    • When such a filter is used, the childItems in the output include only the files modified within the specified range, not folders.
    • To apply such a filter, the Get Metadata activity enumerates all the files in the specified folder and checks each file's modified time. Avoid pointing to a folder with a large number of files, even if the expected number of qualifying files is small.
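
As a rough sketch of how such a filter might look on the activity, the fragment below assumes an Azure Blob storage dataset and sets both properties under storeSettings. The activity name, dataset name, and time range are hypothetical placeholders, and the read-settings type is an assumption that depends on the connector in use.

```json
{
    "name": "GetRecentlyModifiedFiles",
    "description": "Hypothetical example: names and dates are placeholders.",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {
            "referenceName": "MyFolderDataset",
            "type": "DatasetReference"
        },
        "fieldList": [
            "childItems"
        ],
        "storeSettings": {
            "type": "AzureBlobStorageReadSettings",
            "modifiedDatetimeStart": "2021-01-01T00:00:00Z",
            "modifiedDatetimeEnd": "2021-01-02T00:00:00Z"
        }
    }
}
```

With a definition along these lines, childItems would list only the files in the folder whose last modified time falls inside the range. Every file in the folder is still enumerated to evaluate the filter.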

Relational database

Connector/Metadata structure columnCount exists
Azure SQL Database
Azure SQL Managed Instance
Azure Synapse Analytics
SQL Server

Metadata options

You can specify the following metadata types in the Get Metadata activity field list to retrieve the corresponding information:

Metadata type Description
itemName Name of the file or folder.
itemType Type of the file or folder. Returned value is File or Folder.
size Size of the file, in bytes. Applicable only to files.
created Created datetime of the file or folder.
lastModified Last modified datetime of the file or folder.
childItems List of subfolders and files in the given folder. Applicable only to folders. Returned value is a list of the name and type of each child item.
contentMD5 MD5 of the file. Applicable only to files.
structure Data structure of the file or relational database table. Returned value is a list of column names and column types.
columnCount Number of columns in the file or relational table.
exists Whether a file, folder, or table exists. If exists is specified in the Get Metadata field list, the activity won't fail even if the file, folder, or table doesn't exist. Instead, exists: false is returned in the output.


When you want to validate that a file, folder, or table exists, specify exists in the Get Metadata activity field list. You can then check the exists: true/false result in the activity output. If exists isn't specified in the field list, the Get Metadata activity fails when the object isn't found.
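
One way to act on the exists result is to branch on it with an If Condition activity. The following is a minimal sketch, assuming a preceding Get Metadata activity named MyGetMetadataActivity whose field list includes exists. The activity names and the placeholder Wait activity are hypothetical.

```json
{
    "name": "CheckObjectExists",
    "description": "Hypothetical example: branches on the exists flag returned by Get Metadata.",
    "type": "IfCondition",
    "dependsOn": [
        {
            "activity": "MyGetMetadataActivity",
            "dependencyConditions": [
                "Succeeded"
            ]
        }
    ],
    "typeProperties": {
        "expression": {
            "value": "@activity('MyGetMetadataActivity').output.exists",
            "type": "Expression"
        },
        "ifTrueActivities": [
            {
                "name": "PlaceholderWhenFound",
                "description": "Placeholder: replace with the real processing activity.",
                "type": "Wait",
                "typeProperties": {
                    "waitTimeInSeconds": 1
                }
            }
        ],
        "ifFalseActivities": []
    }
}
```

Because exists is in the field list, the Get Metadata activity succeeds even when the object is missing, so the Succeeded dependency still fires and the expression decides which branch runs.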


When you get metadata from file stores and configure modifiedDatetimeStart or modifiedDatetimeEnd, the childItems in the output include only files in the specified path whose last modified time falls within the specified range. Items in subfolders are not included.


For the structure field list to provide the actual data structure for delimited text and Excel format datasets, you must enable the First Row as Header property, which is supported only for these data sources.
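
As an illustration, a delimited text dataset with this property enabled might look like the sketch below. The dataset name, linked service name, and file location are hypothetical, and the Azure Blob storage location type is an assumption chosen for the example.

```json
{
    "name": "MyDelimitedTextDataset",
    "properties": {
        "description": "Hypothetical example: names and paths are placeholders.",
        "type": "DelimitedText",
        "linkedServiceName": {
            "referenceName": "MyBlobLinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "mycontainer",
                "folderPath": "input",
                "fileName": "test.csv"
            },
            "columnDelimiter": ",",
            "firstRowAsHeader": true
        }
    }
}
```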


Get Metadata activity







Type properties

Currently, the Get Metadata activity can return the following types of metadata information:

Property Description Required
fieldList The types of metadata information required. For details on supported metadata, see the Metadata options section of this article. Yes
dataset The reference dataset whose metadata is to be retrieved by the Get Metadata activity. See the Supported capabilities section for information on supported connectors. Refer to the specific connector topics for dataset syntax details. Yes
formatSettings Applies when using a format-type dataset. No
storeSettings Applies when using a format-type dataset. No
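
To show how these properties fit together, here is a minimal sketch of an activity definition. The activity and dataset names are hypothetical, and the storeSettings and formatSettings types assume a delimited text dataset on Azure Blob storage. Other connectors and formats use their own settings types.

```json
{
    "name": "MyGetMetadataActivity",
    "description": "Hypothetical example: dataset name and settings types are assumptions.",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {
            "referenceName": "MyDelimitedTextDataset",
            "type": "DatasetReference"
        },
        "fieldList": [
            "exists",
            "size",
            "lastModified",
            "structure"
        ],
        "storeSettings": {
            "type": "AzureBlobStorageReadSettings"
        },
        "formatSettings": {
            "type": "DelimitedTextReadSettings"
        }
    }
}
```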

Sample output

The Get Metadata results are shown in the activity output. The two samples in this section show a wide range of metadata options. To use the results in a subsequent activity, use this pattern: @{activity('MyGetMetadataActivity').output.itemName}.
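
The same pattern extends to list-valued output such as childItems. The sketch below assumes a Get Metadata activity named MyGetMetadataActivity that requested childItems on a folder, and loops over the result with a ForEach activity. The inner Set Variable activity and the pipeline variable currentItemName are hypothetical and would need to be declared on the pipeline.

```json
{
    "name": "ForEachChildItem",
    "description": "Hypothetical example: iterates over childItems from Get Metadata.",
    "type": "ForEach",
    "dependsOn": [
        {
            "activity": "MyGetMetadataActivity",
            "dependencyConditions": [
                "Succeeded"
            ]
        }
    ],
    "typeProperties": {
        "isSequential": true,
        "items": {
            "value": "@activity('MyGetMetadataActivity').output.childItems",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "RememberItemName",
                "description": "Placeholder: replace with the real per-item activity.",
                "type": "SetVariable",
                "typeProperties": {
                    "variableName": "currentItemName",
                    "value": {
                        "value": "@item().name",
                        "type": "Expression"
                    }
                }
            }
        ]
    }
}
```

Inside the loop, @item().name and @item().type correspond to the name and type fields of each child item. The loop is marked sequential here because a single pipeline variable is being set on every iteration.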

Get a file's metadata

{
  "exists": true,
  "itemName": "test.csv",
  "itemType": "File",
  "size": 104857600,
  "lastModified": "2017-02-23T06:17:09Z",
  "created": "2017-02-23T06:17:09Z",
  "contentMD5": "cMauY+Kz5zDm3eWa9VpoyQ==",
  "structure": [
    {
        "name": "id",
        "type": "Int64"
    },
    {
        "name": "name",
        "type": "String"
    }
  ],
  "columnCount": 2
}

Get a folder's metadata

{
  "exists": true,
  "itemName": "testFolder",
  "itemType": "Folder",
  "lastModified": "2017-02-23T06:17:09Z",
  "created": "2017-02-23T06:17:09Z",
  "childItems": [
    {
      "name": "test.avro",
      "type": "File"
    },
    {
      "name": "folder hello",
      "type": "Folder"
    }
  ]
}

Next steps

Learn about other control flow activities supported by Data Factory: