Sandboxes

Kusto's Data Engine service can run sandboxes for specific flows that need secure isolation. Examples of these flows are user-defined scripts that run using the Python plugin or the R plugin.

To run these sandboxes, Kusto uses an evolved version of Microsoft's Drawbridge project. Other Microsoft services use this solution to run user-defined objects in a multi-tenant environment.

Flows that run in sandboxes aren't only isolated; they're also local (close to the data). As a result, no extra latency is added for remote calls.

Prerequisites

  • The data engine mustn't have disk encryption enabled.
    • Support for running both features side by side is expected in the future.
  • The required packages (images) for running the sandboxes are deployed to each of the Data Engine's nodes, and require dedicated SSD space to run.
    • The estimated size is 20 GB, which is roughly 2.5% of the SSD capacity of a D14_v2 VM, for example, or 0.7% of the SSD capacity of an L16_v1 VM.
    • This affects the cluster's data capacity, and may affect the cost of the cluster.
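
The percentages above can be sanity-checked with a little arithmetic. The sketch below works backwards from the stated 20 GB image size and the stated shares to the SSD capacity each share implies. The function name is made up for illustration, and the resulting capacities are derived from this page's figures rather than taken from a VM specification.

```python
# Estimated size of the sandbox images on each node, as stated above.
IMAGE_SIZE_GB = 20


def implied_ssd_capacity_gb(share_of_ssd):
    """Return the SSD capacity implied by the image size and its share of the disk."""
    return IMAGE_SIZE_GB / share_of_ssd


# Shares quoted above: 2.5% on a D14_v2 VM, 0.7% on an L16_v1 VM.
for sku, share in [("D14_v2", 0.025), ("L16_v1", 0.007)]:
    print(f"{sku}: about {implied_ssd_capacity_gb(share):.0f} GB of SSD")
```

Because both shares are rounded, the implied capacities are approximate.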

Runtime

  • A sandboxed query operator may use one or more sandboxes for its execution.
    • A sandbox is only used for a single run, isn't shared across multiple runs, and is disposed of once that run completes.
    • Sandboxes are lazily initialized on a node, the first time a query requires a sandbox for its execution.
      • This means that the first execution of a plugin that uses sandboxes on a node will include a short warm-up period.
    • When a node is restarted, for example, as part of a service upgrade, all running sandboxes on it are disposed of.
  • Each node maintains a predefined number of sandboxes that are ready for running incoming requests.
    • Once a sandbox is used, a new one is automatically made available to replace it.
  • If there are no pre-allocated sandboxes available to serve a query operator, the operator is throttled until new sandboxes are available. For more information, see Errors. Allocating a new sandbox can take up to 10-15 seconds per sandbox, depending on the SKU and the available resources on the data node.
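
The lifecycle described in this list can be pictured with a toy model. The following is a minimal sketch, not the service's implementation: `SandboxPool`, `SandboxThrottled`, and every method name are hypothetical, and replenishment is triggered by hand to stand in for the allocation delay.

```python
from collections import deque


class SandboxThrottled(Exception):
    """Hypothetical signal that no pre-allocated sandbox is available."""


class SandboxPool:
    """Toy model of the ready-sandbox pool kept by a single node."""

    def __init__(self, target_count):
        self.target_count = target_count
        self._ready = deque()
        self._next_id = 0
        self._initialized = False

    def _allocate(self):
        # Stand-in for real allocation, which may take up to 10-15 seconds.
        self._next_id += 1
        return f"sandbox-{self._next_id}"

    def replenish(self):
        """Top the pool back up to the target count."""
        while len(self._ready) < self.target_count:
            self._ready.append(self._allocate())

    def acquire(self):
        """Hand out a ready sandbox, or signal throttling if none is left."""
        if not self._initialized:
            # Lazy initialization: the first request pays the warm-up cost.
            self.replenish()
            self._initialized = True
        if not self._ready:
            raise SandboxThrottled("no pre-allocated sandbox available")
        # A sandbox serves a single run and is never put back in the pool.
        return self._ready.popleft()


pool = SandboxPool(target_count=2)
first, second = pool.acquire(), pool.acquire()
try:
    pool.acquire()
except SandboxThrottled as err:
    print("throttled:", err)
pool.replenish()  # replacements become available after a delay
print(pool.acquire())
```

In this model a used sandbox is simply dropped, which mirrors the rule that a sandbox is disposed of after its run and replaced by a fresh one.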

Limitations

Some of the limitations can be controlled using a cluster-level sandbox policy, for each kind of sandbox.

  • Number of sandboxes per node: The number of sandboxes per node is limited.
    • Requests that are made when there's no available sandbox will be throttled.
  • Network: A sandbox can't interact with any resource on the virtual machine (VM) or outside of it.
    • A sandbox can't interact with another sandbox.
  • CPU: The maximum share of its host's processors that a sandbox can consume is limited (default is 50%).
    • When the limit is reached, the sandbox's CPU use is throttled, but execution continues.
  • Memory: The maximum amount of its host's RAM that a sandbox can consume is limited (default is 20 GB).
    • Reaching the limit results in termination of the sandbox, and a query execution error.
  • Disk: A sandbox has a unique and independent directory attached to it. It can't access the host's file system.
    • The unique folder provides access to the image/package that matches the sandbox's type. For example, the non-customizable Python or R package.
  • Child processes: The sandbox is blocked from spawning child processes.
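
As an illustration of how the policy-controlled limits might be expressed, the sketch below builds a sandbox policy entry as JSON. The property names (`SandboxKind`, `IsEnabled`, `TargetCountPerNode`, `MaxCpuRatePerSandbox`, `MaxMemoryMbPerSandbox`) and the `PythonExecution` kind are assumptions that don't appear on this page, so check them against the sandbox policy reference before use. The target count is an arbitrary example value, and the helper function is hypothetical.

```python
import json


def build_sandbox_policy(kind, target_count, max_cpu_rate=50, max_memory_mb=20 * 1024):
    """Build a one-entry sandbox policy (property names are assumptions)."""
    return [
        {
            "SandboxKind": kind,                     # assumed kind name
            "IsEnabled": True,
            "TargetCountPerNode": target_count,      # sandboxes kept ready per node
            "MaxCpuRatePerSandbox": max_cpu_rate,    # percent; 50% is the stated default
            "MaxMemoryMbPerSandbox": max_memory_mb,  # 20 GB, the stated default, in MB
        }
    ]


policy_json = json.dumps(build_sandbox_policy("PythonExecution", target_count=4), indent=2)
print(policy_json)
```

Only the CPU and memory defaults come from the list above; everything else in the example is illustrative.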

Note

The resources a sandbox uses depend not only on the size of the data being processed as part of the request, but also on the logic that runs in the sandbox and on the implementation of the libraries it uses. For example, for the python and r plugins, the latter means the user-provided script and the Python or R libraries it consumes at runtime.

Errors

| ErrorCode | Status | Message | Potential reason |
| E_SB_QUERY_THROTTLED_ERROR | TooManyRequests (429) | The sandboxed query was aborted because of throttling. Retrying after some backoff might succeed. | There are no available sandboxes on the target node. New sandboxes should become available in a few seconds. |
| E_SB_QUERY_THROTTLED_ERROR | TooManyRequests (429) | Sandboxes of kind '{kind}' haven't yet been initialized. | The sandbox policy has recently changed. New sandboxes obeying the new policy will become available in a few seconds. |
| | InternalServiceError (520) | The sandboxed query was aborted due to a failure in initializing sandboxes. | An unexpected infrastructure failure. If the issue persists, open a support request. |
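
Because throttling errors are expected to clear within seconds, a client can retry with backoff. The following is a minimal sketch of that pattern, not part of any Kusto SDK: `ThrottledError` stands in for however your client surfaces a 429 / E_SB_QUERY_THROTTLED_ERROR response, and `run_query` is any callable you supply.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a TooManyRequests (429) response."""


def run_with_backoff(run_query, max_attempts=5, base_delay=1.0,
                     max_delay=15.0, sleep=time.sleep):
    """Call run_query, retrying with exponential backoff while it is throttled."""
    for attempt in range(max_attempts):
        try:
            return run_query()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttling error
            # Double the wait each time, capped at max_delay, plus a little jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))
```

The 15-second cap mirrors the upper end of the sandbox allocation time mentioned under Runtime. An InternalServiceError (520) is deliberately not retried here, since it points to an infrastructure failure rather than a temporary shortage of sandboxes.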