How-to: Diagnose an incident using Metrics Advisor

Metrics Advisor provides several diagnostic features, an in-depth view of detected incidents, and root cause analysis. When a group of anomalies is detected on a metric, Metrics Advisor groups the anomalies into a hierarchy and bases its analysis on that hierarchy.

Note

Currently, Metrics Advisor supports incident diagnostics for metrics that have at least one dimension and a measure of numeric type. Your metric needs an aggregated dimension value, such as SUM, for each dimension; this value is used to build the diagnostics hierarchy. Metrics Advisor offers Automatic roll up settings to help generate aggregated values.
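
The aggregated rows described in the note can be pictured with a short Python sketch. It does not show how the Automatic roll up settings are implemented; the `roll_up` helper, the `__SUM__` placeholder, and the sample column names are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

ALL_UP = "__SUM__"  # hypothetical placeholder for the aggregated dimension value


def roll_up(rows, dimensions, measure):
    """Return the input rows plus SUM-aggregated rows for every dimension combination."""
    totals = defaultdict(float)
    for row in rows:
        # For each dimension, either keep the row's value or roll it up to ALL_UP.
        choices = [(row[d], ALL_UP) for d in dimensions]
        for key in product(*choices):
            totals[key] += row[measure]
    return [
        dict(zip(dimensions, key), **{measure: total})
        for key, total in sorted(totals.items())
    ]


# Made-up sample data with two dimensions and one numeric measure.
sample_rows = [
    {"region": "US", "category": "A", "revenue": 10.0},
    {"region": "US", "category": "B", "revenue": 5.0},
    {"region": "EU", "category": "A", "revenue": 3.0},
]
rolled = roll_up(sample_rows, ["region", "category"], "revenue")
```

Each dimension then has an aggregated value for every combination of the others, which is the shape a diagnostics hierarchy can be built from.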

Click Incident hub in the left navigation window to see all incidents under a given metric. At the top of the page, you can select different metrics to see their detection configurations and detection results, and you can change the time range.

Tip

You can also get to the Incident hub by:

  • Clicking a data point in the visualization for your metric, and using the links at the bottom of the Add feedback window that appears.
  • Clicking one of the anomalies in the incidents tab for your metric.

The overview section contains detection results, including counts of the anomalies and alerts within the selected time range.

[Figure: Incident hub]

Incidents detected for the selected metric and time range are listed in the Incident list. You can filter and order the incidents, for example by severity. Click one of the incidents to go to the Incident page for further diagnostics.

[Figure: Incident list]

The Diagnostic section lets you perform in-depth analysis on an incident and provides tools to identify root causes.

[Figure: Diagnosing an incident]

Root cause advice

When a group of anomalies is detected in a metric and causes an incident, Metrics Advisor tries to analyze the root cause of the incident. Root cause advice provides automatic suggestions for likely causes of an incident. This feature is only available if there is an aggregated value within the dimension. If the metric has no dimensions, the metric itself is the root cause. Root causes are listed in the right side panel, and several may be listed. If there is no data in the table, your dimension doesn't satisfy the requirements for performing the analysis.

[Figure: Root cause advice]

When the root cause metric is provided with specific dimensions, you can click go to metric to view more details of the metric.

Incident tree

Along with automated analysis of potential root causes, Metrics Advisor supports manual root cause analysis using the Incident Tree. The incident page offers two kinds of incident tree: the quick diagnosis tree and the interactive tree.

The quick diagnosis tree is for diagnosing the current incident, and its root node is limited to the root node of the current incident. You can expand and collapse a tree node by clicking it, and its series will be shown together with the current incident series in the chart above the tree.

The interactive tree lets you diagnose the current incident as well as older incidents and related ones. When using the interactive tree, right-click a node to open an action menu, where you can choose a dimension to drill up through the root nodes and a dimension to drill down for each node. To remove the drill up or drill down for a dimension, click the cancel button in the dimension list at the top. Left-click a node to select it and show its series together with the current incident series in the chart.

[Figure: Incident tree]

Anomaly drill down

When you're viewing incident information, you may need more detail, for example for different dimensions and timestamps. If your data has one or more dimensions, you can use the drill down function to get a more detailed view.

To use the drill down function, click the Metric drilling tab in the Incident hub.

[Figure: Metric drilling]

The Dimensions setting is a list of dimensions for an incident; you can select other available dimension values for each one. After you change the dimension values, the Timestamp setting lets you view the current incident at different moments in time.

Select drilling options and choose a dimension

There are two types of drill down options: Drill down and Horizontal comparison.

Note

  1. For drill down, you can explore the data from different dimension values, except the currently selected dimensions.
  2. For horizontal comparison, you can explore the data from different dimension values, except the all-up dimensions.

[Figure: Drill down dimension]
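
As a rough mental model, drilling down keeps the currently selected dimension values fixed and splits the data by one of the remaining dimensions. The Python sketch below illustrates that idea only; the `drill_down` helper and the sample rows are hypothetical and are not part of Metrics Advisor.

```python
def drill_down(rows, fixed, drill_dimension, measure):
    """Return {dimension value: measure total} for rows matching `fixed`."""
    result = {}
    for row in rows:
        # Keep only rows that match every currently selected dimension value.
        if all(row[name] == value for name, value in fixed.items()):
            key = row[drill_dimension]
            result[key] = result.get(key, 0.0) + row[measure]
    return result


# Made-up sample data for illustration.
drill_rows = [
    {"region": "US", "category": "A", "value": 10.0},
    {"region": "US", "category": "B", "value": 4.0},
    {"region": "EU", "category": "A", "value": 7.0},
]
by_category = drill_down(drill_rows, {"region": "US"}, "category", "value")
```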

Value comparison for different dimension values

The second section of the drill down tab is a table that compares different dimension values. It includes the value, baseline value, difference value, and delta value, and shows whether the point is an anomaly.

[Figure: Drill down comparison]

Value and expected value comparisons for different dimension values

The third section of the drill down tab is a histogram of the values and expected values for different dimension values. The histogram is sorted by the difference between value and expected value, so you can easily find the unexpected value with the biggest impact. For example, in the accompanying picture, apart from the all-up value, US7 contributes the most to the anomaly.

[Figure: Drill down table]
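
The ordering used by the histogram can be sketched in a few lines of Python. This sketch assumes the sort is by absolute difference, largest first, which the text does not state explicitly. The `rank_by_impact` helper is hypothetical, and the numbers are made up; only the name US7 is borrowed from the example.

```python
def rank_by_impact(points):
    """Sort (dimension value, value, expected value) tuples by absolute deviation."""
    return sorted(points, key=lambda p: abs(p[1] - p[2]), reverse=True)


# Made-up values and expected values for three dimension values.
observations = [
    ("US3", 95.0, 90.0),
    ("US7", 120.0, 80.0),
    ("EU1", 40.0, 60.0),
]
ranked = rank_by_impact(observations)
```

The first entry of `ranked` is the dimension value whose observed value deviates most from its expected value.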

Raw value visualization

The last part of the drill down tab is a line chart of the raw values. With this chart, you don't need to navigate to the metric page to view details.

[Figure: Drill down line chart]

View similar anomalies using Time Series Clustering

When viewing an incident, you can use the Similar time-series-clustering tab to see the various series associated with it. Series in one group are summarized together. In the accompanying picture, there are at least two series groups. This feature is only available if the following requirements are met:

  1. Metrics must have one or more dimensions or dimension values.
  2. The series within one metric must have a similar trend.

Available dimensions are listed at the top of the tab, and you can make a selection to specify the series.

[Figure: Series group]

Compare time series

When an anomaly is detected on a specific time series, it's sometimes helpful to compare that series with multiple other series in a single visualization. Click the Compare tools tab, and then click the blue + Add button.

[Figure: Add series to compare]

Select a series from your data feed. You can choose the same granularity or a different one. Select the target dimensions and load the series trend, then click Ok to compare it with a previous series. The series will be put together in one visualization. You can continue to add more series for comparison and get further insights. Click the drop-down menu at the top of the Compare tools tab to compare the time series data over a time-shifted period.

Warning

A time-shifted comparison may need to shift data points, so the granularity of your data must support the shift. For example, if your data is weekly and you use the Day over day comparison, you will get no results. In this example, you would use the Month over month comparison instead.
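
The warning can be illustrated with a small Python sketch that pairs each data point with the point one shift earlier. It is only a model of the idea: the `shifted_pairs` helper is hypothetical, the weekly values are made up, and a 28-day shift stands in for a monthly comparison because how the service aligns calendar months is not described here.

```python
from datetime import date, timedelta


def shifted_pairs(series, shift):
    """Pair each point with the point `shift` earlier, skipping points with no match."""
    return {
        ts: (value, series[ts - shift])
        for ts, value in series.items()
        if ts - shift in series
    }


# Made-up weekly series: one point every 7 days.
weekly = {date(2024, 1, 1) + timedelta(weeks=i): float(10 + i) for i in range(8)}

# A one-day shift never lands on another weekly timestamp, so nothing can be compared.
day_over_day = shifted_pairs(weekly, timedelta(days=1))

# A shift that is a whole number of weeks does line up with existing points.
four_weeks = shifted_pairs(weekly, timedelta(weeks=4))
```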

After selecting a time-shifted comparison, you can select whether you want to compare the data values, the delta values, or the percentage delta.

Note

  • Data value is the raw data value.
  • Delta value is the difference between the raw value and the compared value.
  • Percentage delta value is the difference between the raw value and the compared value, divided by the compared value.
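
The definitions above translate directly into arithmetic. The following Python sketch uses hypothetical helper names. It returns the percentage delta as a fraction, and it returns `None` when the compared value is zero; both choices are assumptions, because the text does not say how the service scales the result or handles that case.

```python
def delta(raw, compared):
    """Difference between the raw value and the compared value."""
    return raw - compared


def percentage_delta(raw, compared):
    """Delta divided by the compared value, as a fraction (0.2 means 20%)."""
    if compared == 0:
        return None  # undefined; the service's behavior here is not documented in this article
    return (raw - compared) / compared


# Made-up example: raw value 120 compared against a time-shifted value of 100.
example_delta = delta(120.0, 100.0)
example_percentage = percentage_delta(120.0, 100.0)
```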

Sometimes you may need to check the incidents of different metrics at the same time, or related incidents in other metrics. You can find a list of related incidents in the Cross Metrics Analysis section.

[Figure: Related incidents across metrics]

Before you can see related incidents for the current metric, you need to add a relationship between metrics. Click Metrics Graph Settings to add a relationship. Only metrics with the same dimension names can be related. Use the following parameters.

  • Current Data feed & Metric: the data feed and metric of the current incident.
  • Direction: the direction of the relationship between the two metrics. (This currently has no effect on the related incidents list.)
  • Another Data feed & Metric: the data feed and metric to connect with the current metric.
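
The rule that only metrics with the same dimension names can be related can be expressed as a simple check. The Python sketch below is illustrative only: `MetricRef`, `can_relate`, and the sample feed and metric names are hypothetical and do not correspond to any Metrics Advisor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricRef:
    """A metric identified by its data feed, its name, and its dimension names."""
    data_feed: str
    metric: str
    dimension_names: frozenset


def can_relate(current, other):
    """Two metrics can be related only if their dimension names match."""
    return current.dimension_names == other.dimension_names


# Made-up metrics for illustration.
revenue = MetricRef("sales_feed", "revenue", frozenset({"region", "category"}))
cost = MetricRef("finance_feed", "cost", frozenset({"category", "region"}))
latency = MetricRef("ops_feed", "latency", frozenset({"datacenter"}))
```

Here `can_relate(revenue, cost)` is true because both metrics use the same dimension names, while `can_relate(revenue, latency)` is false.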

Next steps