series_outliers()
Scores anomaly points in a series.
The function takes an expression containing a dynamic numerical array as input and generates a dynamic numeric array of the same length. Each value of the output array is the anomaly score of the corresponding input element, calculated using Tukey's test. A value greater than 1.5 indicates a rise anomaly in that element of the input. A value less than -1.5 indicates a decline anomaly.
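As a minimal sketch of how the scores can be read, the query below scores a small literal array and labels each element against the 1.5 and -1.5 thresholds described above. The sample values and the column names `scores` and `anomaly` are made up for illustration.

```kusto
// Hypothetical sample data: a mostly flat series with one large value (50).
print series = dynamic([10, 11, 9, 10, 12, 10, 11, 50, 10, 9])
| extend scores = series_outliers(series)   // one score per input element
| mv-expand series to typeof(double), scores to typeof(double)
// Label each element using the thresholds from the description above.
| extend anomaly = case(scores > 1.5, "rise", scores < -1.5, "decline", "none")
```

Expanding the arrays with mv-expand puts each value next to its score, which makes it easier to inspect which elements cross a threshold.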
Syntax
series_outliers(
x,
kind,
ignore_val,
min_percentile,
max_percentile)
Arguments
- x: Dynamic array cell that is an array of numeric values.
- kind: Outlier detection algorithm. Currently supports "tukey" (traditional Tukey) and "ctukey" (custom Tukey). Default is "ctukey".
- ignore_val: Numeric value that marks missing values in the series. Default is double(null). Nulls and ignore values receive a score of 0.
- min_percentile: Lower percentile used to calculate the normal inter-quantile range. Default is 10. Custom values in the range [2.0, 98.0] are supported ("ctukey" only).
- max_percentile: Upper percentile used to calculate the normal inter-quantile range. Default is 90. Custom values in the range [2.0, 98.0] are supported ("ctukey" only).
The following table describes differences between "tukey" and "ctukey":
Algorithm | Default quantile range | Supports custom quantile range |
---|---|---|
"tukey" | 25% / 75% | No |
"ctukey" | 10% / 90% | Yes |
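The two algorithms and the optional arguments can be compared side by side. The following is a sketch only: the sample array, the choice of -1 as the ignore value, and the 5 / 95 percentile pair are assumptions picked for illustration, not recommended settings.

```kusto
// Hypothetical sample data; -1 stands in for a "missing" reading.
print series = dynamic([10, 11, -1, 10, 12, 10, 11, 50, 10, 9])
| extend
    // Traditional Tukey: fixed 25% / 75% range; -1 is treated as a real value here.
    tukey_scores = series_outliers(series, "tukey"),
    // Custom Tukey: -1 is ignored (scored 0) and a 5% / 95% range is used.
    ctukey_scores = series_outliers(series, "ctukey", -1, 5, 95)
```

Passing the arguments positionally follows the order shown in the syntax section, so ignore_val must be supplied before the percentiles can be set.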
Tip
The best way to use this function is to apply it to the results of the make-series operator.
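A sketch of that pattern follows, using generated data in place of a real table. The column names `Timestamp`, `Value`, `AvgValue`, and `Scores`, the dates, and the injected spike are all hypothetical.

```kusto
// Generate hypothetical hourly readings with one injected spike.
range Timestamp from datetime(2024-01-01) to datetime(2024-01-07) step 1h
| extend Value = iff(Timestamp == datetime(2024-01-04 12:00), 100.0, 10.0 + rand())
// Build a regular series; empty bins become null, which series_outliers() scores as 0.
| make-series AvgValue = avg(Value) default = double(null) on Timestamp step 1h
| extend Scores = series_outliers(AvgValue)
// Expand back to one row per bin and keep only the flagged points.
| mv-expand Timestamp to typeof(datetime), AvgValue to typeof(double), Scores to typeof(double)
| where Scores > 1.5 or Scores < -1.5
```

Using double(null) as the make-series default matches the default ignore_val, so gaps in the data are not themselves reported as anomalies.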
Example
A time series with some noise creates outliers. To replace those outliers (noise) with the average value, use series_outliers() to detect the outliers, and then replace them.
range x from 1 to 100 step 1
// Generate a sample series with outliers at x=20 and x=80
| extend y = iff(x == 20 or x == 80, 10 * rand() + 10 + (50 - x) / 2, 10 * rand() + 10)
| summarize x = make_list(x), series = make_list(y)
| extend series_stats(series), outliers = series_outliers(series)
| mv-expand x to typeof(long), series to typeof(double), outliers to typeof(double)
// Replace outliers with the series average
| project x, series, outliers_removed = iff(outliers > 1.5 or outliers < -1.5, series_stats_series_avg, series)
| render linechart