Window functions overview

Window functions operate on multiple rows (records) in a row set at a time. Unlike aggregation functions, window functions require that the rows in the row set be serialized (have a specific order). A window function may depend on that order to determine its result.

Window functions can only be used on serialized sets. The easiest way to serialize a row set is to use the serialize operator. This operator "freezes" the order of rows in an arbitrary manner. If the order of serialized rows is semantically important, use the sort operator to force a particular order.
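As a minimal sketch of the two approaches described above, the query below numbers rows first in an arbitrary frozen order (serialize) and then in an enforced order (sort). The inline table and its column names (Name, Score) are made up for illustration; row_number() is used here as a representative window function.

```kusto
// Illustrative data only; Name and Score are assumed column names.
let Scores = datatable(Name:string, Score:int)
[
    "a", 10,
    "b", 30,
    "c", 20
];
// Arbitrary but fixed order: serialize freezes whatever order the rows arrive in.
Scores
| serialize ArbitraryPosition = row_number()
// Semantically meaningful order: sort forces the order and also serializes the row set.
| sort by Score desc
| extend Rank = row_number()
```

With sort, the Rank column follows the descending Score order; with serialize alone, the numbering reflects whatever order the engine happened to produce.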

Serialization has a non-trivial cost. For example, it might prevent query parallelism in many scenarios. Therefore, don't apply serialization unnecessarily. If necessary, rearrange the query to perform serialization on the smallest row set possible.
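The sketch below illustrates one way to follow this advice: filter the row set first, then serialize only what remains. The generated data, the Level column, and the filter condition are assumptions chosen to keep the example self-contained; prev() stands in for any window function.

```kusto
// Generated sample data; Id and Level are illustrative column names.
range Id from 1 to 100000 step 1
| extend Level = iff(Id % 1000 == 0, "Error", "Info")
| where Level == "Error"              // reduce the row set before serializing
| serialize                           // only the filtered rows are serialized
| extend PreviousErrorId = prev(Id)   // window function over the small, serialized set
```

Placing serialize before the where filter would typically produce the same result, but the serialization step would then cover every generated row instead of only the matching ones.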

Serialized row set

An arbitrary row set (such as a table, or the output of a tabular operator) can be serialized in one of the following ways:

  1. By sorting the row set. See below for a list of operators that emit sorted row sets.
  2. By using the serialize operator.

Many tabular operators produce serialized output whenever their input is already serialized, even though the operator doesn't itself guarantee that the result is serialized. For example, this property is guaranteed for the extend operator, the project operator, and the where operator.
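To illustrate this property, the following sketch sorts once and then passes the rows through where, extend, and project before applying a window function, without serializing again. The inline table and its columns (Id, Value) are hypothetical; row_cumsum() is used as the example window function.

```kusto
// Illustrative data only; Id and Value are assumed column names.
datatable(Id:int, Value:int)
[
    3, 30,
    1, 10,
    2, -5,
    4, 40
]
| sort by Id asc                              // serializes the row set by sorting
| where Value > 0                             // preserves the serialized property
| extend Doubled = Value * 2                  // preserves the serialized property
| project Id, Doubled                         // preserves the serialized property
| extend RunningTotal = row_cumsum(Doubled)   // window function; no extra serialize needed
```

Because every operator between sort and row_cumsum() preserves serialization, the running total follows the Id order established by the sort.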

Operators that emit serialized row sets by sorting

Operators that preserve the serialized row set property