Metrics Advisor glossary of common vocabulary and concepts

This document explains the technical terms used in Metrics Advisor. Use this article to learn about common concepts and objects you might encounter when using the service.

Data feed

Note

Multiple metrics can share the same data source, and even the same data feed.

A data feed is what Metrics Advisor ingests from your data source, such as Cosmos DB or a SQL server. A data feed contains rows of:

  • timestamps
  • zero or more dimensions
  • one or more measures.
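
As a rough illustration of that row shape, the following Python sketch models a single data feed row. The class name `DataFeedRow` and its field names are hypothetical, chosen only for this example; they are not part of any Metrics Advisor API.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict


@dataclass(frozen=True)
class DataFeedRow:
    """One row of a data feed (hypothetical model, not a Metrics Advisor type)."""

    timestamp: datetime
    measures: Dict[str, float]  # one or more measures
    dimensions: Dict[str, str] = field(default_factory=dict)  # zero or more

    def __post_init__(self) -> None:
        # A row without any measure carries nothing to monitor.
        if not self.measures:
            raise ValueError("a data feed row needs at least one measure")


row = DataFeedRow(
    timestamp=datetime(2020, 6, 1),
    measures={"Revenue": 1000.0},
    dimensions={"Category": "Food", "Market": "US"},
)
```

A row with no dimensions is still valid in this sketch, which mirrors the "zero or more dimensions" wording above.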

Metric

A metric is a quantifiable measure that is used to monitor and assess the status of a specific business process. It can be a combination of multiple time series values divided into dimensions. For example, a web health metric might contain dimensions for user count and the en-us market.

Dimension

A dimension is one or more categorical values. The combination of those values identifies a particular univariate time series, for example: country, language, tenant, and so on.

Multi-dimensional metric

What is a multi-dimensional metric? Let's use two examples.

Revenue of a business

Suppose you have data for the revenue of your business. Your time series data might look something like this:

Timestamp   Category   Market   Revenue
2020-6-1    Food       US       1000
2020-6-1    Apparel    US       2000
2020-6-2    Food       UK       800
...         ...        ...      ...

In this example, Category and Market are dimensions. Revenue is the Key Performance Indicator (KPI), which can be sliced into different categories and/or markets, and can also be aggregated. For example, the revenue of food for all markets.
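
To make the slicing and aggregation concrete, here is a minimal Python sketch that sums the revenue of one category across all markets, using only the three sample rows shown in the table. The function name `revenue_by_timestamp` and the tuple layout are assumptions made for this illustration; Metrics Advisor performs this kind of aggregation itself.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Rows copied from the sample table: (timestamp, category, market, revenue).
ROWS: List[Tuple[str, str, str, int]] = [
    ("2020-6-1", "Food", "US", 1000),
    ("2020-6-1", "Apparel", "US", 2000),
    ("2020-6-2", "Food", "UK", 800),
]


def revenue_by_timestamp(
    rows: List[Tuple[str, str, str, int]], category: str
) -> Dict[str, int]:
    """Sum revenue over all markets for one category, per timestamp."""
    totals: Dict[str, int] = defaultdict(int)
    for timestamp, row_category, _market, revenue in rows:
        if row_category == category:
            totals[timestamp] += revenue
    return dict(totals)


# Revenue of food for all markets, keyed by timestamp.
food_all_markets = revenue_by_timestamp(ROWS, "Food")
```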

Error counts for a complex application

Suppose you have data for the number of errors logged in an application. Your time series data might look something like this:

Timestamp   Application component   Region        Error count
2020-6-1    Employee database       CHINA EAST    9000
2020-6-1    Message queue           CHINA NORTH   1000
2020-6-2    Message queue           CHINA NORTH   8000
...         ...                     ...           ...

In this example, Application component and Region are dimensions. Error count is the KPI, which can be sliced by application component and/or region, and can also be aggregated. For example, the error count of Message queue in all regions.

Measure

A measure is a fundamental or unit-specific term and a quantifiable value of the metric.

Time series

A time series is a series of data points indexed (or listed or graphed) in chronological order. Most commonly, a time series is a sequence taken at successive, equally spaced points in time. It is a sequence of discrete-time data.

In Metrics Advisor, the values of one metric on a specific dimension combination are called one series.
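
The idea that each dimension combination yields its own series can be sketched as a simple grouping step. The example below reuses the three error count rows from the earlier sample table; the function name `split_into_series` is hypothetical and only illustrates the concept.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (timestamp, application component, region, error count)
ERROR_ROWS: List[Tuple[str, str, str, int]] = [
    ("2020-6-1", "Employee database", "CHINA EAST", 9000),
    ("2020-6-1", "Message queue", "CHINA NORTH", 1000),
    ("2020-6-2", "Message queue", "CHINA NORTH", 8000),
]


def split_into_series(
    rows: List[Tuple[str, str, str, int]]
) -> Dict[Tuple[str, str], List[Tuple[str, int]]]:
    """Group metric values by dimension combination; each group is one series."""
    series: Dict[Tuple[str, str], List[Tuple[str, int]]] = defaultdict(list)
    for timestamp, component, region, error_count in rows:
        series[(component, region)].append((timestamp, error_count))
    return dict(series)


all_series = split_into_series(ERROR_ROWS)
```

With these rows, the metric splits into two series: one for (Employee database, CHINA EAST) and one for (Message queue, CHINA NORTH).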

Granularity

Granularity indicates how frequently data points are generated at the data source. For example, daily or hourly.

Start time

Start time is the time that you want Metrics Advisor to begin ingesting data from your data source. Your data source must have data at the specified start time.

Confidence boundaries

Note

Confidence boundaries are not the only measurement used to find anomalies. It's possible for data points outside of this boundary to be flagged as normal by the detection model.

In Metrics Advisor, confidence boundaries represent the sensitivity of the algorithm used, and are used to filter out overly sensitive anomalies. On the web portal, confidence boundaries appear as a transparent blue band. All the points within the band are treated as normal points.

Metrics Advisor provides tools to adjust the sensitivity of the algorithms used. See How to: Configure metrics and fine tune detecting configuration for more information.


Hook

Metrics Advisor lets you create and subscribe to real-time alerts. These alerts are sent over the internet, using a hook.

Anomaly incident

After a detection configuration is applied to metrics, incidents are generated whenever any series within it has an anomaly. In large data sets, this can be overwhelming, so Metrics Advisor groups a series of anomalies within a metric into an incident. The service will also evaluate the severity and provide tools for diagnosing the incident.

Incident tree

In Metrics Advisor, you can apply anomaly detection on metrics, and Metrics Advisor then automatically monitors all time series of all dimension combinations. Whenever an anomaly is detected, Metrics Advisor aggregates anomalies into incidents. After an incident occurs, Metrics Advisor provides an incident tree with a hierarchy of contributing anomalies, and identifies the ones with the biggest impact. Each incident has a root cause anomaly, which is the top node of the tree.

Anomaly grouping

Metrics Advisor provides the capability to find related time series with similar patterns. It can also provide deeper insights into the impact on other dimensions, and correlate the anomalies.

Time series comparison

You can pick multiple time series to compare trends in a single visualization. This provides a clear and insightful way to view and compare related series.

Detection configuration

Note

Detection configurations are only applied within an individual metric.

On the Metrics Advisor web portal, a detection configuration (such as sensitivity, auto snooze, and direction) is listed on the left panel when viewing a metric. Parameters can be tuned and applied to all series within this metric.

A detection configuration is required for every time series, and determines whether a point in the time series is an anomaly. Metrics Advisor will set up a default configuration for the whole metric when you first onboard data.

You can additionally refine the configuration by applying tuning parameters on a group of series, or a specific one. Only one configuration will be applied to a time series:

  • Configurations applied to a specific series will overwrite configurations for a group
  • Configurations for a group will overwrite configurations applied to the whole metric.
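
The precedence described above (specific series, then group, then whole metric) can be sketched as a small lookup. Everything in this example is hypothetical: the function `resolve_config`, the representation of a series as a dictionary of dimension values, and the choice that the first matching group wins when several groups match, which the text above does not specify.

```python
from typing import Dict, List, Tuple

Dims = Dict[str, str]


def resolve_config(
    series: Dims,
    metric_config: str,
    group_configs: List[Tuple[Dims, str]],
    series_configs: List[Tuple[Dims, str]],
) -> str:
    """Return the single configuration that applies to one time series."""
    # 1. A configuration for this exact series overrides everything else.
    for dims, config in series_configs:
        if dims == series:
            return config
    # 2. A group matches when all of its dimension values appear in the series.
    #    Assumption: the first matching group wins.
    for group_filter, config in group_configs:
        if group_filter.items() <= series.items():
            return config
    # 3. Otherwise fall back to the configuration for the whole metric.
    return metric_config


series_level = [({"Category": "Food", "Market": "US"}, "series-level")]
group_level = [({"Market": "US"}, "group-level")]

chosen = resolve_config(
    {"Category": "Apparel", "Market": "US"}, "metric-level", group_level, series_level
)
```

Here `chosen` is the group configuration, because no series-level entry exists for Apparel in the US market.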

Metrics Advisor provides several detection methods, and you can combine them using logical operators.

Smart detection

Anomaly detection using multiple machine learning algorithms.

Sensitivity: A numerical value to adjust the tolerance of the anomaly detection. Visually, the higher the value, the narrower the upper and lower boundaries around the time series.

Hard threshold

Values outside of upper or lower bounds are anomalies.

Min: The lower bound

Max: The upper bound
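
A hard threshold check can be sketched in a few lines of Python. The function name and parameters below are hypothetical, and treating a value that sits exactly on a bound as normal is an assumption; the definition above only says that values outside the bounds are anomalies.

```python
from typing import Optional


def is_hard_threshold_anomaly(
    value: float,
    min_bound: Optional[float] = None,
    max_bound: Optional[float] = None,
) -> bool:
    """Flag a value that falls below Min or above Max."""
    # Assumption: values exactly equal to a bound are not anomalies.
    if min_bound is not None and value < min_bound:
        return True
    if max_bound is not None and value > max_bound:
        return True
    return False
```

For example, `is_hard_threshold_anomaly(5, min_bound=10, max_bound=100)` flags the point, while a value of 50 with the same bounds does not. Leaving one bound as `None` checks only the other side.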

Change threshold

Use the previous point value to determine if this point is an anomaly.

Change percentage: Compared to the previous point, the current point is an anomaly if the percentage of change is more than this parameter.

Change over points: How many points to look back.
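
The following sketch shows one way a change threshold could be evaluated. It is an illustration under stated assumptions, not the service's algorithm: "change over points" is interpreted here as comparing against the point that many steps back, the change is measured in either direction, and a zero baseline is skipped because the percentage is undefined. The function name is hypothetical.

```python
from typing import Sequence


def is_change_anomaly(
    values: Sequence[float],
    index: int,
    change_percentage: float,
    change_over_points: int = 1,
) -> bool:
    """Flag values[index] if it changed by more than change_percentage."""
    if index < change_over_points:
        return False  # not enough history to look back
    baseline = values[index - change_over_points]
    if baseline == 0:
        return False  # assumption: percentage change is undefined, so skip
    change = abs(values[index] - baseline) / abs(baseline) * 100
    return change > change_percentage


readings = [100, 105, 160]
flagged = is_change_anomaly(readings, index=2, change_percentage=50)
```

In this example the last reading rose from 105 to 160, a change of roughly 52 percent, so it exceeds the 50 percent threshold.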

Common parameters

Direction: A point is an anomaly only when the deviation occurs in the specified direction: up, down, or both.

Not valid anomaly until: A data point is only an anomaly if a specified percentage of previous points are also anomalies.

Alert settings

Alert settings determine which anomalies should trigger an alert. You can set multiple alerts with different settings. For example, you could create one alert for anomalies with lower business impact, and another for more important ones.

You can also create an alert across metrics. For example, an alert that only gets triggered if two specified metrics have anomalies.

Alert scope

Alert scope refers to the scope that the alert applies to. There are four options:

Anomalies of all series: Alerts will be triggered for anomalies in all series within the metric.

Anomalies in series group: Alerts will only be triggered for anomalies in specific dimensions of the series group. The number of specified dimensions should be smaller than the total number of dimensions.

Anomalies in favorite series: Alerts will only be triggered for anomalies that are added as favorites. You can choose a group of series as a favorite for each detection configuration.

Anomalies in top N of all series: Alerts will only be triggered for anomalies in the top N series. You can set parameters to specify the number of timestamps to take into account, and how many anomalies must be in them to send the alert.

Severity

Severity is a grade that Metrics Advisor uses to describe the severity of an incident: High, Medium, or Low.

Currently, Metrics Advisor uses the following factors to measure the alert severity:

  1. The value proportion and the quantity proportion of anomalies in the metric.
  2. Confidence of anomalies.
  3. Your favorite settings also contribute to the severity.

Auto snooze

Some anomalies are transient issues, especially for small granularity metrics. You can snooze an alert for a specific number of time points. If anomalies are found within that specified number of points, no alert will be triggered. The behavior of auto snooze can be set on either metric level or series level.
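
The snooze behavior can be pictured with a short simulation. This is a sketch under an assumption the text does not spell out: the snooze window starts right after an alert fires and covers the next `snooze_points` time points. The function name `alerts_with_snooze` is hypothetical.

```python
from typing import List


def alerts_with_snooze(anomaly_flags: List[bool], snooze_points: int) -> List[bool]:
    """Return, per time point, whether an alert would fire."""
    alerts: List[bool] = []
    remaining = 0  # time points left in the current snooze window
    for is_anomaly in anomaly_flags:
        if remaining > 0:
            alerts.append(False)  # snoozed: anomalies here raise no alert
            remaining -= 1
        elif is_anomaly:
            alerts.append(True)
            remaining = snooze_points
        else:
            alerts.append(False)
    return alerts


fired = alerts_with_snooze([True, True, False, True, True], snooze_points=2)
```

With a snooze of two points, the anomaly at the second time point is suppressed, and the next alert fires at the fourth time point once the window has passed.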


Data feed settings

Ingestion Time Offset

By default, data is ingested according to the granularity (such as daily). By using a positive integer, you can delay ingestion of the data by the specified value. By using a negative number, you can advance the ingestion by the specified value.

Max Ingestion per Minute

Set this parameter if your data source supports limited concurrency. Otherwise, leave the default settings.

Stop retry after

If data ingestion has failed, Metrics Advisor will retry automatically after a period of time. The beginning of the period is the time when the first data ingestion occurred. The length of the retry period is defined according to the granularity. If you use the default value (-1), the retry period will be determined according to the granularity:

Granularity                                          Stop Retry After
Daily, Custom (>= 1 Day), Weekly, Monthly, Yearly    7 days
Hourly, Custom (< 1 Day)                             72 hours

Min retry interval

You can specify the minimum interval when retrying to pull data from the source. If you use the default value (-1), the retry interval will be determined according to the granularity:

Granularity                                  Minimum Retry Interval
Daily, Custom (>= 1 Day), Weekly, Monthly    30 minutes
Hourly, Custom (< 1 Day)                     10 minutes
Yearly                                       1 day
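
The two default tables (for Stop retry after and Min retry interval) can be expressed as lookup functions. This sketch only encodes the values listed in those tables; the function names, the granularity strings, and the `custom_period` parameter used to classify a Custom granularity are hypothetical and do not correspond to a Metrics Advisor API.

```python
from datetime import timedelta
from typing import Optional

ONE_DAY = timedelta(days=1)


def _is_sub_daily(granularity: str, custom_period: Optional[timedelta]) -> bool:
    """True for Hourly and for Custom granularities shorter than one day."""
    if granularity == "Hourly":
        return True
    if granularity == "Custom":
        if custom_period is None:
            raise ValueError("Custom granularity needs a custom_period")
        return custom_period < ONE_DAY
    if granularity in {"Daily", "Weekly", "Monthly", "Yearly"}:
        return False
    raise ValueError(f"unknown granularity: {granularity}")


def default_stop_retry_after(
    granularity: str, custom_period: Optional[timedelta] = None
) -> timedelta:
    """Default retry period used when the setting is -1."""
    if _is_sub_daily(granularity, custom_period):
        return timedelta(hours=72)
    return timedelta(days=7)


def default_min_retry_interval(
    granularity: str, custom_period: Optional[timedelta] = None
) -> timedelta:
    """Default minimum retry interval used when the setting is -1."""
    if granularity == "Yearly":
        return timedelta(days=1)
    if _is_sub_daily(granularity, custom_period):
        return timedelta(minutes=10)
    return timedelta(minutes=30)
```

For instance, `default_stop_retry_after("Hourly")` gives 72 hours, and `default_min_retry_interval("Custom", timedelta(days=2))` gives 30 minutes.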

Grace period

Note

The grace period begins at the regular ingestion time, plus the specified ingestion time offset.

A grace period is a period of time where Metrics Advisor will continue fetching data from the data source, but won't fire any alerts. If no data is ingested after the grace period, a Data feed not available alert will be triggered.

Snooze alerts in

When this option is set to zero, each timestamp with Not Available will trigger an alert. When set to a value other than zero, the specified number of Not Available alerts will be snoozed if no data was fetched.

Data feed permissions

There are two roles to manage data feed permissions: Administrator and Viewer.

  • An Administrator has full control of the data feed and metrics within it. They can activate, pause, or delete the data feed, and make updates to feeds and configurations. An Administrator is typically the owner of the metrics.

  • A Viewer is able to view the data feed or metrics, but is not able to make changes.

Next steps