Kusto ingest client library

The Kusto.Ingest library is a .NET 4.6.2 library for sending data to the Kusto service. It depends on the following libraries and SDKs:

  • ADAL for Azure AD authentication
  • Azure storage client

The ingestion methods are defined by the IKustoIngestClient interface. The methods handle data ingestion from Stream, IDataReader, local files, and Azure blobs in both synchronous and asynchronous modes.

Ingest client flavors

There are two basic flavors of the ingest client: Queued and Direct.

Queued ingestion

The Queued ingestion mode, defined by IKustoQueuedIngestClient, limits the client code's dependency on the Kusto service. Ingestion is done by posting a Kusto ingestion message to an Azure queue, which is, in turn, acquired from the Kusto Data Management (Ingestion) service. Any intermediate storage items are created by the ingest client using resources from the Kusto Data Management service.

Advantages of the Queued mode include:

  • Decouples the data ingestion process from the Kusto Engine service
  • Lets ingestion requests be persisted when the Kusto Engine (or Ingestion) service is unavailable
  • Improves performance through efficient and controllable aggregation of inbound data by the Ingestion service
  • Lets the Kusto Ingestion service manage the ingestion load on the Kusto Engine service
  • Retries the Kusto Ingestion service, as needed, on transient ingestion failures, such as XStore throttling
  • Provides a convenient mechanism to track the progress and outcome of every ingestion request

The following diagram outlines the Queued ingestion client interaction with Kusto:

queued-ingest

Direct ingestion

The Direct ingestion mode, defined by IKustoDirectIngestClient, forces direct interaction with the Kusto Engine service. In this mode, the Kusto Ingestion service doesn't moderate or manage the data. Every ingestion request is eventually translated into an .ingest command that is executed directly on the Kusto Engine service.

The following diagram outlines the Direct ingestion client interaction with Kusto:

direct-ingest

Note

The Direct mode isn't recommended for production-grade ingestion solutions.

Advantages of the Direct mode include:

  • Low latency and no aggregation. However, low latency can also be achieved with Queued ingestion
  • When synchronous methods are used, method completion indicates the end of the ingestion operation

Disadvantages of the Direct mode include:

  • The client code must implement any retry or error handling logic
  • Ingestion is impossible when the Kusto Engine service is unavailable
  • The client code might overwhelm the Kusto Engine service with ingestion requests, since it isn't aware of the Engine service capacity

Ingestion best practices

Ingestion best practices describes ingestion from the cost of goods sold (COGS) and throughput points of view.

  • Thread safety - Kusto ingest client implementations are thread-safe and intended to be reused. There's no need to create a new instance of the KustoQueuedIngestClient class for each ingest operation, or for every few operations. Only a single instance of KustoQueuedIngestClient is needed per target Kusto cluster per user process. Running multiple instances is counterproductive and may cause a denial of service (DoS) on the Data Management cluster.

  • Supported data formats - When using native ingestion, upload the data to one or more Azure storage blobs if it isn't already there. Currently supported blob formats are documented under Supported Data Formats.

  • Schema mapping - Schema mappings help with deterministically binding source data fields to destination table columns.

  • Ingestion permissions - Kusto Ingestion Permissions explains the permissions setup that is required for successful ingestion using the Kusto.Ingest package.

  • Usage - As described previously, the recommended basis for sustainable, high-scale ingestion solutions for Kusto is the KustoQueuedIngestClient. To minimize unnecessary load on your Kusto service, we recommend that you use a single instance of the Kusto ingest client (Queued or Direct) per process, per Kusto cluster. The Kusto ingest client implementation is thread-safe and fully reentrant.
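
The "single instance per process, per Kusto cluster" guidance can be enforced with a small process-wide registry. The sketch below is in Python for brevity rather than .NET; IngestClientRegistry and the factory callable are hypothetical names, not part of Kusto.Ingest, and normalizing the cluster URI by trimming and lowercasing is an assumption about how equivalent URIs should be matched.

```python
import threading
from typing import Callable, Dict, Generic, TypeVar

C = TypeVar("C")


class IngestClientRegistry(Generic[C]):
    """Hands out one shared client per cluster URI within this process."""

    def __init__(self, factory: Callable[[str], C]) -> None:
        # factory stands in for whatever builds a real ingest client
        # (an assumption; it is called at most once per cluster).
        self._factory = factory
        self._clients: Dict[str, C] = {}
        self._lock = threading.Lock()

    def get(self, cluster_uri: str) -> C:
        # Assumed normalization so trivially different spellings share a client.
        key = cluster_uri.strip().rstrip("/").lower()
        with self._lock:
            client = self._clients.get(key)
            if client is None:
                client = self._factory(key)
                self._clients[key] = client
            return client
```

Callers on any thread ask the registry for a client instead of constructing one, so repeated requests for the same cluster reuse the same instance. The lock only guards creation and lookup; it relies on the client itself being thread-safe, as stated above.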

Next steps