externaldata operator

The externaldata operator returns a table whose schema is defined in the query itself and whose data is read from an external storage artifact, such as a blob in Azure Blob Storage.

Syntax

externaldata ( ColumnName : ColumnType [, ...] ) [ StorageConnectionString ] [with ( Prop1 = Value1 [, ...] )]

Arguments

  • ColumnName, ColumnType: These arguments define the schema of the table. The syntax is the same as the syntax used when defining a table in .create table.

  • StorageConnectionString: The storage connection string describes the storage artifact holding the data to return.

  • Prop1, Value1, ...: Additional properties that describe how to interpret the data retrieved from storage, as listed under ingestion properties.

    • Currently supported properties: format and ignoreFirstRecord.
    • Supported data formats: Any ingestion data format is supported, including CSV, TSV, JSON, Parquet, and Avro.
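
As a rough illustration of the two supported properties, the following sketch reads a CSV file that is assumed to begin with a header row. The storage account, container, file name, and column names are hypothetical, and the SAS token is left elided.

```kusto
// Hypothetical blob and schema; substitute your own values.
externaldata (ProductId:string, Price:real)
[
  h@"https://mycompanystorage.blob.core.chinacloudapi.cn/products/prices.csv?...SAS..."
]
with (format="csv", ignoreFirstRecord=true) // treat the first line as a header and skip it
| take 10
```

Prefixing the connection string with h marks it as an obfuscated string literal, which suits a value that carries a secret token.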

Note

This operator does not have a pipeline input.
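
Because the operator takes no pipeline input, it appears either at the start of a query or inside a subexpression. One possible pattern, sketched here under assumptions, binds its result to a name with a let statement and then joins against it. The name KnownIds, the Users table, and the blob URL are hypothetical.

```kusto
// KnownIds is a hypothetical name; the blob is assumed to hold one UserID per line.
let KnownIds = externaldata (UserID:string)
[
  h@"https://storageaccount.blob.core.chinacloudapi.cn/storagecontainer/users.txt?...SAS..."
];
Users
| join kind=inner KnownIds on UserID
```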

Returns

The externaldata operator returns a data table of the given schema whose data was parsed from the storage artifact indicated by the storage connection string.

Examples

The following example shows how to find all records in a table whose UserID column falls into a known set of IDs, held (one per line) in an external blob. Because the set is indirectly referenced by the query, it can be large.

Users
| where UserID in ((externaldata (UserID:string) [
    @"https://storageaccount.blob.core.chinacloudapi.cn/storagecontainer/users.txt"
      h@"?...SAS..." // Secret token needed to access the blob
    ]))
| ...

The following example queries multiple data files stored in external storage.

externaldata(Timestamp:datetime, ProductId:string, ProductDescription:string)
[
  h@"https://mycompanystorage.blob.core.chinacloudapi.cn/archivedproducts/2019/01/01/part-00000-7e967c99-cf2b-4dbb-8c53-ce388389470d.csv.gz?...SAS...",
  h@"https://mycompanystorage.blob.core.chinacloudapi.cn/archivedproducts/2019/01/02/part-00000-ba356fa4-f85f-430a-8b5a-afd64f128ca4.csv.gz?...SAS...",
  h@"https://mycompanystorage.blob.core.chinacloudapi.cn/archivedproducts/2019/01/03/part-00000-acb644dc-2fc6-467c-ab80-d1590b23fc31.csv.gz?...SAS..."
]
with(format="csv")
| summarize count() by ProductId

The above example can be thought of as a quick way to query multiple data files without defining an external table.

Note

The externaldata() operator does not recognize data partitioning.