session_window grouping expression

Applies to: Databricks SQL, Databricks Runtime 10.0 and above

Creates a session window over a timestamp expression.

Syntax

session_window(expr, gapDuration)

Arguments

  • expr: A TIMESTAMP expression specifying the subject of the window.
  • gapDuration: A STRING expression representing the width of the window as an INTERVAL DAY TO SECOND literal.

Returns

Returns a set of groupings that can be operated on with aggregate functions. The GROUP BY column is named session_window and has type STRUCT<start:TIMESTAMP, end:TIMESTAMP>.

Examples

> SELECT a, session_window.start, session_window.end, count(*) as cnt
    FROM VALUES ('A1', '2021-01-01 00:00:00'),
                ('A1', '2021-01-01 00:04:30'),
                ('A1', '2021-01-01 00:10:00'),
                ('A2', '2021-01-01 00:01:00') AS tab(a, b)
    GROUP BY a, session_window(b, '5 minutes')
    ORDER BY a, start;
  A1  2021-01-01 00:00:00  2021-01-01 00:09:30  2
  A1  2021-01-01 00:10:00  2021-01-01 00:15:00  1
  A2  2021-01-01 00:01:00  2021-01-01 00:06:00  1

> SELECT a, session_window.start, session_window.end, count(*) as cnt
    FROM VALUES ('A1', '2021-01-01 00:00:00'),
                ('A1', '2021-01-01 00:04:30'),
                ('A1', '2021-01-01 00:10:00'),
                ('A2', '2021-01-01 00:01:00'),
                ('A2', '2021-01-01 00:04:30') AS tab(a, b)
    GROUP BY a, session_window(b, CASE WHEN a = 'A1' THEN '5 minutes'
                                       WHEN a = 'A2' THEN '1 minute'
                                       ELSE '10 minutes' END)
    ORDER BY a, start;
  A1  2021-01-01 00:00:00  2021-01-01 00:09:30  2
  A1  2021-01-01 00:10:00  2021-01-01 00:15:00  1
  A2  2021-01-01 00:01:00  2021-01-01 00:02:00  1
  A2  2021-01-01 00:04:30  2021-01-01 00:05:30  1
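
The grouping behavior shown in the results above can be modeled outside SQL. The following Python sketch is an illustration only, not the engine's implementation. It assumes that each row opens a window from its timestamp to its timestamp plus the gap, and that a row joins the current session when its timestamp falls before that session's end. The names `session_windows`, `gap_for`, and `ts` are hypothetical helpers introduced for this sketch. How a row landing exactly on a session's end is treated is also an assumption here (it starts a new session).

```python
from datetime import datetime, timedelta
from itertools import groupby


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string into a datetime."""
    return datetime.fromisoformat(text)


def session_windows(rows, gap_for):
    """Group (key, timestamp) rows into sessions per key.

    gap_for(key) returns the gap as a timedelta, which mirrors a
    gapDuration that may depend on the row (as in the CASE example).
    Returns (key, start, end, count) tuples ordered by key and start.
    """
    result = []
    for key, group in groupby(sorted(rows), key=lambda row: row[0]):
        gap = gap_for(key)
        start = end = None
        count = 0
        for _, stamp in group:
            if start is not None and stamp < end:
                # Row falls inside the open session: extend its end.
                end = max(end, stamp + gap)
                count += 1
            else:
                # Close the previous session, if any, and open a new one.
                if start is not None:
                    result.append((key, start, end, count))
                start, end, count = stamp, stamp + gap, 1
        result.append((key, start, end, count))
    return result


rows = [
    ("A1", ts("2021-01-01 00:00:00")),
    ("A1", ts("2021-01-01 00:04:30")),
    ("A1", ts("2021-01-01 00:10:00")),
    ("A2", ts("2021-01-01 00:01:00")),
    ("A2", ts("2021-01-01 00:04:30")),
]

# Per-key gaps, matching the CASE expression in the second example.
gaps = {"A1": timedelta(minutes=5), "A2": timedelta(minutes=1)}

for key, start, end, count in session_windows(
    rows, lambda k: gaps.get(k, timedelta(minutes=10))
):
    print(key, start, end, count)
```

With the per-key gaps above, the sketch yields the same four sessions as the second query. Passing a constant gap of five minutes together with the first query's four rows yields the three sessions of the first query.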