Broadcast join

Today, regular joins are executed on a single cluster node. Broadcast join is an execution strategy that distributes the join across the cluster nodes. This strategy is useful when the left side of the join is small (up to 100,000 records). In that case, a broadcast join performs better than a regular join.

If the left side of the join is a small dataset, you can run the join in broadcast mode by adding the hint.strategy = broadcast hint:

lookupTable
| join hint.strategy = broadcast (factTable) on Key

The performance improvement is more noticeable when the join is followed by other operators such as summarize, as in this query:

lookupTable 
| join hint.strategy = broadcast (factTable) on Key
| summarize dcount(Messages) by Timestamp, Key
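
To try the query shape without existing tables, here is a minimal self-contained sketch. The data, the Region column, and the column types are assumptions made up for illustration; only the table names, Key, Timestamp, Messages, and the hint come from the text above. On inline data this small the hint is not expected to make a measurable difference; the sketch only shows where the hint goes and what the query looks like end to end.

```kusto
// Hypothetical sample data: schemas and values are illustrative assumptions.
// Small left side (the lookup table), well under the 100,000-record guideline.
let lookupTable = datatable(Key:string, Region:string)
[
    "a", "West",
    "b", "East"
];
// Right side (the fact table); in practice this would be the large table.
let factTable = datatable(Key:string, Timestamp:datetime, Messages:string)
[
    "a", datetime(2024-01-01), "m1",
    "a", datetime(2024-01-01), "m2",
    "b", datetime(2024-01-02), "m3"
];
lookupTable
| join hint.strategy = broadcast (factTable) on Key   // request the broadcast strategy
| summarize dcount(Messages) by Timestamp, Key        // operator following the join
```

Keeping the small table on the left matters here, because the guidance above ties the benefit of the broadcast strategy to the size of the left side of the join.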