median
aggregate function
Applies to: Databricks SQL Databricks Runtime 11.3 LTS and above
Returns the median calculated from values of a group.
Syntax
median ( [ALL | DISTINCT] expr ) [FILTER ( WHERE cond ) ]
This function can also be invoked as a window function using the OVER
clause.
Arguments
expr
: An expression that evaluates to a numeric or an interval.cond
: An optional boolean expression filtering the rows used for aggregation.
Returns
The following explains how the result types are computed:
- year-month interval: The result is an
INTERVAL YEAR TO MONTH
. - day-time interval: The result is an
INTERVAL DAY TO SECOND
. - In all other cases the result is a DOUBLE.
Nulls within the group are ignored. If a group is empty or consists only of nulls, the result is NULL
.
If DISTINCT
is specified, duplicates are removed and the median is computed.
This function is a synonym for percentile_cont(0.5) WITHIN GROUP (ORDER BY expr).
Examples
> SELECT median(col) FROM VALUES (1), (2), (2), (3), (4), (NULL) AS tab(col);
2.0
> SELECT median(DISTINCT col) FROM VALUES (1), (2), (2), (3), (4), (NULL) AS tab(col);
2.5
> SELECT median(col) FROM VALUES (INTERVAL '1' YEAR), (INTERVAL '2' YEAR) AS tab(col);
1-6