The .ingest into command (pull data from storage)

The .ingest into command ingests data into a table by "pulling" the data from one or more cloud storage artifacts. For example, the command can retrieve 1,000 CSV-formatted blobs from Azure Blob Storage, parse them, and ingest them all into a single target table. Data is appended to the table without affecting existing records and without modifying the table's schema.

Syntax

.ingest [async] into table TableName SourceDataLocator [with ( IngestionPropertyName = IngestionPropertyValue [, ...] )]

Arguments

  • async: If specified, the command returns immediately and continues ingestion in the background. The results of the command include an OperationId value that can then be used with the .show operation command to retrieve the ingestion completion status and results.

  • TableName: The name of the table to ingest data into. The table name is always relative to the database in context, and its schema is the schema assumed for the data if no schema mapping object is provided.

  • SourceDataLocator: A literal of type string, or a comma-delimited list of such literals surrounded by ( and ) characters, indicating the storage artifacts containing the data to pull. See storage connection strings.

Note

It is strongly recommended to use obfuscated string literals for the SourceDataLocator when it includes actual credentials. The service scrubs credentials from its internal traces, error messages, and other locations.

  • IngestionPropertyName, IngestionPropertyValue: Any number of ingestion properties that affect the ingestion process.
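Putting the async flag and the OperationId result together, a long-running ingestion can be started and then polled for completion. A minimal sketch (the blob URI, its SAS query string, and the returned operation ID are placeholders, not real values):

.ingest async into table T (h'https://contoso.blob.core.chinacloudapi.cn/container/file1.csv?...')
// The command returns immediately with an OperationId value, for example
// 00000000-0000-0000-0000-000000000000. Check its completion status later with:
.show operations 00000000-0000-0000-0000-000000000000

The exact form of the status command may vary by service version; the text above refers to it as the .show operation command.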

Results

The result of the command is a table with as many records as there are data shards ("extents") generated by the command. If no data shards were generated, a single record is returned with an empty (zero-valued) extent ID.

Name         Type       Description
ExtentId     guid       The unique identifier for the data shard that was generated by the command.
ItemLoaded   string     One or more storage artifacts that are related to this record.
Duration     timespan   How long it took to perform the ingestion.
HasErrors    bool       Whether this record represents an ingestion failure.
OperationId  guid       A unique ID representing the operation. Can be used with the .show operation command.

Remarks

This command does not modify the schema of the table being ingested into. If necessary, the data is "coerced" into this schema during ingestion, not the other way around (extra columns are ignored, and missing columns are treated as null values).
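To illustrate this coercion, consider a hypothetical two-column table (the table schema and CSV records below are illustrative, not from the original):

.create table T (Name:string, Value:int)
// With the schema above, a CSV record "alice,7,extra" is ingested as
// Name="alice", Value=7 (the surplus third field is ignored), and a
// record "bob" is ingested as Name="bob", Value=null (the missing
// second column is treated as a null value).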

Examples

The next example instructs the engine to read two blobs from Azure Blob Storage as CSV files, and to ingest their contents into table T. The ... represents an Azure Storage shared access signature (SAS) that grants read access to each blob. Note also the use of obfuscated strings (the h in front of the string values) to ensure that the SAS is never recorded.

.ingest into table T (
    h'https://contoso.blob.core.chinacloudapi.cn/container/file1.csv?...',
    h'https://contoso.blob.core.chinacloudapi.cn/container/file2.csv?...'
)

The next example ingests data from Azure Data Lake Storage Gen 2 (ADLSv2). The credentials used here (...) are the storage account credentials (shared key), and string obfuscation is used only for the secret part of the connection string.

.ingest into table T (
  'abfss://myfilesystem@contoso.dfs.core.chinacloudapi.cn/path/to/file1.csv;...'
)

The next example ingests a single file from Azure Data Lake Storage (ADLS). It uses the user's credentials to access ADLS (so there's no need to treat the storage URI as containing a secret). It also shows how to specify ingestion properties.

.ingest into table T ('adl://contoso.azuredatalakestore.net/Path/To/File/file1.ext;impersonate')
  with (format='csv')
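
Multiple ingestion properties can be combined in the with clause. A hedged sketch, assuming the source CSV files carry a header row that should be skipped (the URI is a placeholder; ignoreFirstRecord is a standard Kusto ingestion property, shown here as an illustration rather than as part of the original example):

.ingest into table T ('adl://contoso.azuredatalakestore.net/Path/To/File/file1.ext;impersonate')
  with (format='csv', ignoreFirstRecord=true)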