series_outliers()

Scores anomaly points in a series.

The function takes an expression containing a dynamic numerical array as input and generates a dynamic numeric array of the same length. Each value of the output array is an anomaly score for the corresponding input element, computed using Tukey's test. A score greater than 1.5 indicates a rise anomaly, and a score less than -1.5 indicates a decline anomaly.
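The thresholds above can be turned into labels per point. The following is a minimal sketch, assuming a small hand-made array in which 95 and -60 are planted as likely outliers; the sample values and the column names scores and anomaly are illustrative only.

```kusto
// Sample values are arbitrary; 95 and -60 are planted as likely outliers.
print series = dynamic([10, 12, 11, 9, 10, 95, 11, 10, 12, -60, 9, 11])
| extend scores = series_outliers(series)                        // one score per input element
| mv-expand series to typeof(double), scores to typeof(double)   // one row per (value, score) pair
| extend anomaly = case(scores > 1.5, "rise",                    // score above 1.5: rise anomaly
                        scores < -1.5, "decline",                // score below -1.5: decline anomaly
                        "none")
```

Expanding both arrays in the same mv-expand keeps each value aligned with its score.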

Syntax

series_outliers(x, kind, ignore_val, min_percentile, max_percentile)

Arguments

  • x: Dynamic array cell that is an array of numeric values.
  • kind: Outlier detection algorithm. Currently supports "tukey" (traditional Tukey) and "ctukey" (custom Tukey). Default is "ctukey".
  • ignore_val: Numeric value indicating missing values in the series. Default is double(null). The score of nulls and ignored values is set to 0.
  • min_percentile: Lower percentile used to calculate the normal inter-quantile range. Default is 10; custom values in the range [2.0, 98.0] are supported (ctukey only).
  • max_percentile: Upper percentile used to calculate the normal inter-quantile range. Default is 90; custom values in the range [2.0, 98.0] are supported (ctukey only).
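As a rough illustration of how the arguments fit together, the sketch below calls series_outliers() three ways on the same array. The sample values, the use of -1 as a missing-value placeholder, and the 5 / 95 percentiles are assumptions chosen for illustration, not recommended settings.

```kusto
// Sample array is made up for illustration; -1 stands in for missing readings (an assumption).
print series = dynamic([10, 11, 9, -1, 12, 10, 50, 11, -1, 9, -30, 10])
| extend tukey_scores  = series_outliers(series, "tukey")              // traditional Tukey
| extend ctukey_scores = series_outliers(series, "ctukey")             // custom Tukey, default percentiles
| extend custom_scores = series_outliers(series, "ctukey", -1, 5, 95)  // treat -1 as missing, use 5 / 95 percentiles
```

Only the third call passes ignore_val, so the first two treat the -1 entries as ordinary data points.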

The following table describes the differences between "tukey" and "ctukey":

Algorithm   Default quantile range   Supports custom quantile range
"tukey"     25% / 75%                No
"ctukey"    10% / 90%                Yes

Tip

The best way to use this function is to apply it to the results of the make-series operator.
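A minimal sketch of that pattern follows. It builds a synthetic hourly signal instead of querying a real table, so the column names t and value, the dates, and the planted spike are all assumptions for illustration.

```kusto
// Synthetic data: one reading per hour, with a single planted spike (illustrative values only).
range t from datetime(2024-01-01) to datetime(2024-01-07) step 1h
| extend value = iff(t == datetime(2024-01-04 12:00), 100.0, 10.0 + 5.0 * rand())
| make-series avg_value = avg(value) on t step 1h      // build one regular series from the rows
| extend outliers = series_outliers(avg_value)         // score every point of the series
| mv-expand t to typeof(datetime), avg_value to typeof(double), outliers to typeof(double)
```

Because make-series produces an evenly spaced array, each score lines up with exactly one time bin.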

Example

A time series with some noise creates outliers. To replace those outliers (noise) with the average value, use series_outliers() to detect them, and then replace them.

range x from 1 to 100 step 1 
| extend y=iff(x==20 or x==80, 10*rand()+10+(50-x)/2, 10*rand()+10) // generate a sample series with outliers at x=20 and x=80
| summarize x=make_list(x),series=make_list(y)
| extend series_stats(series), outliers=series_outliers(series)
| mv-expand x to typeof(long), series to typeof(double), outliers to typeof(double)
| project x, series, outliers_removed=iff(outliers > 1.5 or outliers < -1.5, series_stats_series_avg, series) // replace outliers with the average
| render linechart

Series outliers