Cross-cluster join

For a general discussion of cross-cluster queries, see cross-cluster or cross-database queries.

A join operation can be performed on datasets that reside on different clusters. For example:

T | ... | join (cluster("SomeCluster").database("SomeDB").T2 | ...) on Col1 // (1)

cluster("SomeCluster").database("SomeDB").T | ... | join (cluster("SomeCluster2").database("SomeDB2").T2 | ...) on Col1 // (2)

In the examples above, the join operation is a cross-cluster join, assuming that the current cluster isn't "SomeCluster" or "SomeCluster2".

In the following example:

cluster("SomeCluster").database("SomeDB").T | ... | join (cluster("SomeCluster").database("SomeDB2").T2 | ...) on Col1 

the join operation isn't a cross-cluster join, because both of its operands originate on the same cluster.

When Kusto encounters a cross-cluster join, it automatically decides where to execute the join operation. This decision has one of three possible outcomes:

  • Execute the join on the cluster of the left operand; that cluster first fetches the right operand. (The join in example (1) is executed on the local cluster.)
  • Execute the join on the cluster of the right operand; that cluster first fetches the left operand. (The join in example (2) is executed on "SomeCluster2".)
  • Execute the join locally, meaning on the cluster that received the query; the local cluster first fetches both operands.

The actual decision depends on the specific query. The automatic join remoting strategy is (simplified version): "If one of the operands is local, the join is executed locally. If both operands are remote, the join is executed on the cluster of the right operand."

Sometimes query performance can be improved by not following the automatic remoting strategy. In that case, execute the join on the cluster of the largest operand, so that only the smaller dataset has to be moved between clusters.

If, in example (1), the dataset produced by T | ... is much smaller than the one produced by cluster("SomeCluster").database("SomeDB").T2 | ..., it is more efficient to execute the join on "SomeCluster".

To do this, give Kusto a join remoting hint. The syntax is:

T | ... | join hint.remote=<strategy> (cluster("SomeCluster").database("SomeDB").T2 | ...) on Col1

The legal values for strategy are:

  • left - execute the join on the cluster of the left operand
  • right - execute the join on the cluster of the right operand
  • local - execute the join on the current cluster
  • auto - (default) let Kusto make the automatic remoting decision
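
As an illustration of the hint, here is a minimal sketch of the scenario described for example (1): a small local dataset joined against a much larger remote one. The table names SmallLookup and Events and the columns IsActive, Label, Timestamp, and Message are hypothetical, chosen only for this example. "SomeCluster", "SomeDB", and Col1 are the placeholders used throughout this article.

```kusto
// Hypothetical tables and columns, for illustration only.
// SmallLookup is assumed to be a small table in the current (local) database.
SmallLookup
| where IsActive == true
| project Col1, Label
// hint.remote=right asks Kusto to run the join on the right operand's cluster
// ("SomeCluster") rather than following the automatic decision.
| join hint.remote=right (
    cluster("SomeCluster").database("SomeDB").Events
    | where Timestamp > ago(1d)
    | project Col1, Timestamp, Message
  ) on Col1
```

Under the simplified automatic strategy, this join would run locally because the left operand is local. With the hint, the smaller left dataset is the one sent to "SomeCluster", provided Kusto considers the hinted strategy applicable.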

Note

Kusto ignores the join remoting hint if the hinted strategy isn't applicable to the join operation.

This capability isn't supported in Azure Monitor.