collect_set
aggregate function
Applies to: Databricks SQL, Databricks Runtime
Returns an array consisting of all unique values in expr
within the group.
Syntax
collect_set(expr) [FILTER ( WHERE cond ) ]
This function can also be invoked as a window function using the OVER
clause.
Arguments
expr
: An expression of any type except MAP.
cond
: An optional boolean expression filtering the rows used for aggregation.
Returns
An ARRAY of the argument type.
The order of elements in the array is non-deterministic. NULL values are excluded.
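Because the element order is not guaranteed, a query that needs stable output can sort the collected array. A minimal sketch, assuming the built-in sort_array function is available and the element type is orderable; the alias vals is only illustrative:

```sql
-- Sort the collected set so the result is the same on every run.
-- NULL is dropped by collect_set before sorting.
SELECT sort_array(collect_set(col)) AS vals
  FROM VALUES (2), (1), (NULL), (1) AS tab(col);
-- [1,2]
```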
Examples
> SELECT collect_set(col) FROM VALUES (1), (2), (NULL), (1) AS tab(col);
[1,2]
> SELECT collect_set(col1) FILTER(WHERE col2 = 10)
FROM VALUES (1, 10), (2, 10), (NULL, 10), (1, 10), (3, 12) AS tab(col1, col2);
[1,2]
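The window-function form mentioned under Syntax can be sketched as follows. The table, column names, and the alias vals are hypothetical, chosen only for illustration:

```sql
-- Attach the set of distinct col2 values in each col1 partition to every row.
SELECT col1,
       col2,
       collect_set(col2) OVER (PARTITION BY col1) AS vals
  FROM VALUES ('a', 1), ('a', 1), ('a', 2), ('b', 3) AS tab(col1, col2);
```

Unlike the aggregate form, this keeps all four input rows. Each row with col1 = 'a' carries an array containing 1 and 2 (in no guaranteed order), and the row with col1 = 'b' carries an array containing only 3.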