映射数据流中的窗口转换Window transformation in mapping data flow

适用于: Azure 数据工厂 Azure Synapse Analytics

在窗口转换中,将定义数据流中基于窗口的列的聚合。The Window transformation is where you will define window-based aggregations of columns in your data streams. 在表达式生成器中,可以定义基于数据或时间窗口(SQL OVER 子句)的不同类型的聚合,如 LEAD、LAG、NTILE、CUMEDIST、RANK 等。In the Expression Builder, you can define different types of aggregations that are based on data or time windows (SQL OVER clause) such as LEAD, LAG, NTILE, CUMEDIST, RANK, etc.). 输出中将生成包含这些聚合的新字段。A new field will be generated in your output that includes these aggregations. 还可以包含可选的分组字段。You can also include optional group-by fields.

屏幕截图显示从菜单选择了“窗口化”。Screenshot shows Windowing selected from the menu.

OverOver

为窗口转换设置列数据的分区。Set the partitioning of column data for your window transformation. SQL 等效项是 SQL 中 Over 子句中的 Partition ByThe SQL equivalent is the Partition By in the Over clause in SQL. 如果希望创建用于分区的计算或表达式,可以将鼠标悬停在列名上并选择“计算列”。If you wish to create a calculation or create an expression to use for the partitioning, you can do that by hovering over the column name and select "computed column".

屏幕截图显示了“窗口设置”,其中已选中“Over”选项卡。Screenshot shows Windowing Settings with the Over tab selected.

排序Sort

Over 子句的另一部分是设置 Order ByAnother part of the Over clause is setting the Order By. 该操作将设置数据排序顺序。This will set the data sort ordering. 还可以在此列字段中为计算值创建表达式以进行排序。You can also create an expression for a calculate value in this column field for sorting.

屏幕截图显示了“窗口设置”,其中已选中“Sort”选项卡。Screenshot shows Windowing Settings with the Sort tab selected.

范围Range By

接下来,将窗口框架设置为“无界”或“有界”。Next, set the window frame as Unbounded or Bounded. 若要设置无界窗框,请将滑块两端设置为“无界”。To set an unbounded window frame, set the slider to Unbounded on both ends. 如果在无界和当前行之间选择一个设置,那么必须设置偏移量的开始值和结束值。If you choose a setting between Unbounded and Current Row, then you must set the Offset start and end values. 这两个值都是正整数。Both values will be positive integers. 可以使用数据中的相对数字或值。You can use either relative numbers or values from your data.

窗口滑块有两个要设置的值:当前行之前的值和当前行之后的值。The window slider has two values to set: the values before the current row and the values after the current row. “开始”和“结束”偏移与滑块上的两个选择器相匹配。The Start and End offset matches the two selectors on the slider.

屏幕截图显示了“窗口设置”,其中已选中“Range by”选项卡。Screenshot shows Windowing Settings with the Range by tab selected.

窗口列Window columns

最后,使用表达式生成器定义你希望与数据窗口一起使用的聚合,例如 RANK、COUNT、MIN、MAX、DENSE RANK、LEAD 和 LAG 等。Lastly, use the Expression Builder to define the aggregations you wish to use with the data windows such as RANK, COUNT, MIN, MAX, DENSE RANK, LEAD, LAG, etc.

屏幕截图显示了窗口操作的结果。Screenshot shows the result of the windowing action.

此处列出了可通过表达式生成器采用 ADF 数据流表达式语言使用的聚合和分析函数的完整列表:https://docs.azure.cn/data-factory/data-flow-expression-functions。The full list of aggregation and analytical functions available for you to use in the ADF Data Flow Expression Language via the Expression Builder are listed here: https://docs.azure.cn/data-factory/data-flow-expression-functions.

后续步骤Next steps

如果正在查找简单的分组依据聚合,请使用聚合转换If you are looking for a simple group-by aggregation, use the Aggregate transformation