lookup 运算符lookup operator

lookup 运算符使用在维度表中查找的值扩展事实数据表的列。The lookup operator extends the columns of a fact table with values looked-up in a dimension table.

FactTable | lookup kind=leftouter (DimensionTable) on CommonColumn, $left.Col1 == $right.Col2

此处,结果是一个表,它根据后一个表中的每对 (CommonColumn1,Col2) 查找前一个表中的每对 (CommonColumn,Col),使用来自 DimensionTable(通过 $right 进行引用)的数据来扩展 FactTable ($left)。Here, the result is a table that extends the FactTable ($left) with data from DimensionTable (referenced by $right) by performing a lookup of each pair (CommonColumn,Col) from the former table with each pair (CommonColumn1,Col2) in the latter table. 有关事实数据表与维度表之间的差异,请参阅事实数据表与维度表For the differences between fact and dimension tables, see fact and dimension tables.

lookup 运算符执行的操作类似于 join 运算符,但存在以下差异:The lookup operator performs an operation similar to the join operator with the following differences:

  • 结果不会重复 $right 表中作为联接操作基础的列。The result does not repeat columns from the $right table that are the basis for the join operation.
  • 仅支持两种查找:leftouterinner,并且 leftouter 为默认值。Only two kinds of lookup are supported, leftouter and inner, with leftouter being the default.
  • 从性能方面考虑,系统默认情况下假定 $left 表是较大的(事实数据)表,而 $right 表是较小的(维度)表。In terms of performance, the system by default assumes that the $left table is the larger (facts) table, and the $right table is the smaller (dimensions) table. 这与 join 运算符使用的假设完全相反。This is exactly opposite to the assumption used by the join operator.
  • lookup 运算符自动将 $right 表广播到 $left 表(实质上,其行为与指定了 hint.broadcast 时的行为相同)。The lookup operator automatically broadcasts the $right table to the $left table (essentially, behaves as if hint.broadcast was specified). 请注意,这会限制 $right 表的大小。Note that this limits the size of the $right table.

语法Syntax

LeftTable | lookup [kind = (leftouter|inner)] ( RightTable ) on AttributesLeftTable | lookup [kind = (leftouter|inner)] ( RightTable ) on Attributes

参数Arguments

  • LeftTable:用作查找基础的表或表格表达式。LeftTable: The table or tabular expression that is the basis for the lookup. 表示为 $leftDenoted as $left.

  • RightTable:用于“填充”事实数据表中的新列的表或表格表达式。RightTable: The table or tabular expression that is used to "populate" new columns in the fact table. 表示为 $rightDenoted as $right.

  • Attributes:一个或多个规则的逗号分隔列表,这些规则描述 LeftTable 中的行如何与 RightTable 中的行进行匹配。Attributes: A comma-delimited list of one or more rules that describe how rows from LeftTable are matched to rows from RightTable. 将使用 and 逻辑运算符评估多个规则。Multiple rules are evaluated using the and logical operator. 规则可以是下列项之一:A rule can be one of:

    规则类型Rule kind 语法Syntax PredicatePredicate
    基于名称的等式Equality by name ColumnNameColumnName where LeftTable.ColumnName == RightTable.ColumnNamewhere LeftTable.ColumnName == RightTable.ColumnName
    基于值的等式Equality by value $left.LeftColumn == $right.RightColumn$left.LeftColumn == $right.RightColumn where $left.LeftColumn == $right.*RightColumnwhere $left.LeftColumn == $right.*RightColumn

    备注

    如果使用“基于值的等式”,则列名称必须通过由 $left$right 表示法表示的相应所有者表进行限定。In case of 'equality by value', the column names must be qualified with the applicable owner table denoted by $left and $right notations.

  • kind:一个可选说明,指示如何处理 LeftTable 中在 RightTable 中没有匹配项的行。kind: An optional instruction on how to treat rows in LeftTable that have no match in RightTable. 默认情况下将使用 leftouter,这意味着所有这些行都将出现在输出中,对于此运算符添加的 RightTable 列的缺失值,将使用 null 值。By default, leftouter is used, which means all those rows will appear in the output with null values used for the missing values of RightTable columns added by the operator. 如果使用 inner,则输出中将省略此类行。If inner is used, such rows are omitted from the output. lookup 运算符不支持其他种类的联接。)(Other kinds of join are not supported by the lookup operator.)

返回Returns

具有以下内容的表:A table with:

  • 一列,用于每个表中的每一列,包括匹配键。A column for every column in each of the two tables, including the matching keys. 如果存在名称冲突,会自动重命名右侧的列。The columns of the right side will be automatically renamed if there are name conflicts.

  • 一行,用于输入表之间的所有匹配项。A row for every match between the input tables. 匹配项是从表中选出的一行,该表中的所有 on 字段值与其他表中的值相同。A match is a row selected from one table that has the same value for all the on fields as a row in the other table.

  • 属性(查找键)在输出表中将只出现一次。The Attributes (lookup keys) will appear only once in the output table.

  • kind 未指定,kind=leftouterkind unspecified, kind=leftouter

    除了内部匹配,还有一行用于左侧(和/或右侧)的每一行(即使没有匹配项)。In addition to the inner matches, there's a row for every row on the left (and/or right), even if it has no match. 在这种情况下,不匹配的输出单元格包含 null。In that case, the unmatched output cells contain nulls.

  • kind=inner

    输出中存在一行,用于左侧和右侧匹配行的每个组合。There's a row in the output for every combination of matching rows from left and right.

示例Examples

let FactTable=datatable(Row:string,Personal:string,Family:string) [
  "1", "Bill",   "Gates",
  "2", "Bill",   "Clinton",
  "3", "Bill",   "Clinton",
  "4", "Steve",  "Ballmer",
  "5", "Tim",    "Cook"
];
let DimTable=datatable(Personal:string,Family:string,Alias:string) [
  "Bill",  "Gates",   "billg",
  "Bill",  "Clinton", "billc",
  "Steve", "Ballmer", "steveb",
  "Tim",   "Cook",    "timc"
];
FactTable
| lookup kind=leftouter DimTable on Personal, Family
Row 个人Personal 系列Family AliasAlias
11 BillBill GatesGates billgbillg
22 BillBill ClintonClinton billcbillc
33 BillBill ClintonClinton billcbillc
44 SteveSteve BallmerBallmer stevebsteveb
55 TimTim CookCook timctimc