Detect data drift (preview) on datasets

APPLIES TO: Basic edition (yes), Enterprise edition (yes) (Upgrade to Enterprise edition)

Important

Detecting data drift on datasets is currently in public preview. The preview version is provided without a service level agreement and is not recommended for production workloads. Certain features might not be supported or might have constrained capabilities.

Learn how to monitor data drift and set alerts when drift is high.

With Azure Machine Learning dataset monitors (preview), you can:

  • Analyze drift in your data to understand how it changes over time.
  • Monitor model data for differences between training and serving datasets. Start by collecting model data from deployed models.
  • Monitor new data for differences between any baseline and target dataset.
  • Profile features in data to track how statistical properties change over time.
  • Set up alerts on data drift for early warning of potential issues.

The monitor is created from an Azure Machine Learning dataset. The dataset must include a timestamp column.

You can view data drift metrics with the Python SDK or in Azure Machine Learning studio. Other metrics and insights are available through the Azure Application Insights resource associated with the Azure Machine Learning workspace.

Important

Monitoring data drift with the SDK is available in all editions. However, monitoring data drift through the studio on the web is available in the Enterprise edition only.

Prerequisites

To create and work with dataset monitors, you need:

What is data drift?

Data drift is one of the top reasons model accuracy degrades over time. For machine learning models, data drift is a change in model input data that leads to degraded model performance. Monitoring data drift helps detect these performance issues.

Causes of data drift include:

  • Upstream process changes, such as a sensor being replaced that changes the units of measurement from inches to centimeters.
  • Data quality issues, such as a broken sensor always reading 0.
  • Natural drift in the data, such as mean temperature changing with the seasons.
  • Change in the relation between features, or covariate shift.

Azure Machine Learning simplifies drift detection by computing a single metric that abstracts the complexity of the datasets being compared. These datasets may have hundreds of features and tens of thousands of rows. Once drift is detected, you drill down into which features are causing it. You then inspect feature-level metrics to debug and isolate the root cause of the drift.

This top-down approach makes it easier to monitor data than traditional rules-based techniques. Rules-based techniques, such as allowed data ranges or allowed unique values, can be time-consuming and error-prone.

In Azure Machine Learning, you use dataset monitors to detect and alert on data drift.

Dataset monitors

With a dataset monitor you can:

  • Detect and alert on data drift in new data in a dataset.
  • Analyze historical data for drift.
  • Profile new data over time.

The data drift algorithm provides an overall measure of change in the data and indicates which features warrant further investigation. Dataset monitors produce a number of other metrics by profiling new data in the timeseries dataset.

Custom alerting can be set up on all metrics generated by the monitor through Azure Application Insights. Dataset monitors can be used to catch data issues quickly and to reduce debugging time by identifying likely causes.

Conceptually, there are three primary scenarios for setting up dataset monitors in Azure Machine Learning.

| Scenario | Description |
| ------ | ----------- |
| Monitor a model's serving data for drift from the training data. | Results from this scenario can be interpreted as monitoring a proxy for the model's accuracy, since model accuracy degrades when the serving data drifts from the training data. |
| Monitor a time series dataset for drift from a previous time period. | This scenario is more general, and can be used to monitor datasets involved upstream or downstream of model building. The target dataset must have a timestamp column. The baseline dataset can be any tabular dataset that has features in common with the target dataset. |
| Perform analysis on past data. | This scenario can be used to understand historical data and inform decisions about settings for dataset monitors. |

Dataset monitors depend on the following Azure services.

| Azure service | Description |
| ------ | ----------- |
| Dataset | Drift uses Machine Learning datasets to retrieve training data and compare data for model training. A profile of the data is generated to produce some of the reported metrics, such as min, max, distinct values, and distinct values count. |
| Azure Machine Learning pipeline and compute | The drift calculation job is hosted in an Azure Machine Learning pipeline. The job is triggered on demand or by schedule and runs on a compute target configured when the drift monitor is created. |
| Application Insights | Drift emits metrics to the Application Insights resource belonging to the machine learning workspace. |
| Azure Blob storage | Drift emits metrics in JSON format to Azure Blob storage. |

How dataset monitors data

Use Machine Learning datasets to monitor for data drift. Specify a baseline dataset - usually the training dataset for a model. A target dataset - usually model input data - is compared over time to your baseline dataset. This comparison means that your target dataset must have a timestamp column specified.

Create target dataset

The target dataset needs the timeseries trait set on it. Set the trait by specifying a timestamp column, either from a column in the data or from a virtual column derived from the path pattern of the files. Create the dataset with a timestamp through the Python SDK or Azure Machine Learning studio. If your data is partitioned into a folder structure that carries time information, such as '{yyyy/MM/dd}', create a virtual column through the path pattern setting and set it as the "partition timestamp" so that time series operations run more efficiently.

Python SDK

The Dataset class with_timestamp_columns() method defines the timestamp column for the dataset.

from azureml.core import Workspace, Dataset, Datastore

# get workspace object
ws = Workspace.from_config()

# get datastore object 
dstore = Datastore.get(ws, 'your datastore name')

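# specify datastore paths; the wildcards cover the state/yyyy/MM/dd folder levels
# named in the partition format below
dstore_paths = [(dstore, 'weather/*/*/*/*/data.parquet')]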

# specify partition format
partition_format = 'weather/{state}/{date:yyyy/MM/dd}/data.parquet'

# create the Tabular dataset with 'state' and 'date' as virtual columns 
dset = Dataset.Tabular.from_parquet_files(path=dstore_paths, partition_format=partition_format)

# assign the timestamp attribute to a real or virtual column in the dataset
dset = dset.with_timestamp_columns('date')

# register the dataset as the target dataset
dset = dset.register(ws, 'target')

For a full example of using the timeseries trait of datasets, see the example notebook or the datasets SDK documentation.
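To make the idea of virtual columns concrete, the following is a minimal standard-library sketch of how a partition format such as the one above could be mapped onto a file path. It is an illustration only, not the SDK's implementation: the helper name extract_virtual_columns is hypothetical, and it handles only the {name} and {name:yyyy/MM/dd} placeholder forms used in this article.

```python
import re
from datetime import datetime


def extract_virtual_columns(path, partition_format):
    """Hypothetical helper: derive virtual column values from a file path.

    Returns a dict of column values, or None if the path does not match.
    """
    regex = ""
    date_formats = {}
    pos = 0
    for m in re.finditer(r"\{(\w+)(?::([^}]+))?\}", partition_format):
        # Literal text between placeholders must match exactly
        regex += re.escape(partition_format[pos:m.start()])
        name, date_format = m.group(1), m.group(2)
        if date_format is None:
            # Plain placeholder: one folder name
            regex += f"(?P<{name}>[^/]+)"
        else:
            # Date placeholder: translate yyyy/MM/dd tokens to digit groups
            digits = (date_format.replace("yyyy", r"\d{4}")
                                 .replace("MM", r"\d{2}")
                                 .replace("dd", r"\d{2}"))
            regex += f"(?P<{name}>{digits})"
            date_formats[name] = (date_format.replace("yyyy", "%Y")
                                             .replace("MM", "%m")
                                             .replace("dd", "%d"))
        pos = m.end()
    regex += re.escape(partition_format[pos:])

    match = re.fullmatch(regex, path)
    if match is None:
        return None
    columns = match.groupdict()
    for name, fmt in date_formats.items():
        columns[name] = datetime.strptime(columns[name], fmt)
    return columns


columns = extract_virtual_columns(
    "weather/FL/2019/05/14/data.parquet",
    "weather/{state}/{date:yyyy/MM/dd}/data.parquet",
)
print(columns)
```

In this sketch the 'state' and 'date' keys play the role of the virtual columns, and 'date' is the one that would be passed to with_timestamp_columns().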

Azure Machine Learning studio

Important

The functionality in this studio, https://ml.azure.com, is accessible from Enterprise workspaces only. Learn more about editions and upgrading.

If you create your dataset using Azure Machine Learning studio, ensure that the path to your data contains timestamp information, include all subfolders that contain data, and set the partition format.

In the following example, all data under the subfolder NoaaIsdFlorida/2019 is taken, and the partition format specifies the timestamp's year, month, and day.

[Screenshot: Partition format]

In the Schema settings, specify the timestamp column from a virtual or real column in the specified dataset:

[Screenshot: Setting the timestamp]

If your data is partitioned by date, as is the case here, you can also specify the partition_timestamp. This allows more efficient processing of dates.

[Screenshot: Partition timestamp]

Create dataset monitors

Create dataset monitors to detect and alert on data drift in a new dataset. Use either the Python SDK or Azure Machine Learning studio.

Python SDK

See the Python SDK reference documentation on data drift for full details.

The following example shows how to create a dataset monitor using the Python SDK:

from azureml.core import Workspace, Dataset
from azureml.datadrift import DataDriftDetector
from datetime import datetime

# get the workspace object
ws = Workspace.from_config()

# get the target dataset
target = Dataset.get_by_name(ws, 'target')

# set the baseline dataset to the slice of the target before February 2019
baseline = target.time_before(datetime(2019, 2, 1))

# set up feature list
features = ['latitude', 'longitude', 'elevation', 'windAngle', 'windSpeed', 'temperature', 'snowDepth', 'stationName', 'countryOrRegion']

# set up data drift detector
monitor = DataDriftDetector.create_from_datasets(ws, 'drift-monitor', baseline, target, 
                                                      compute_target='cpu-cluster', 
                                                      frequency='Week', 
                                                      feature_list=None, 
                                                      drift_threshold=.6, 
                                                      latency=24)

# get data drift detector by name
monitor = DataDriftDetector.get_by_name(ws, 'drift-monitor')

# update data drift detector
monitor = monitor.update(feature_list=features)

# run a backfill for January through May
backfill1 = monitor.backfill(datetime(2019, 1, 1), datetime(2019, 5, 1))

# run a backfill for May through today
backfill2 = monitor.backfill(datetime(2019, 5, 1), datetime.today())

# disable the pipeline schedule for the data drift detector
monitor = monitor.disable_schedule()

# enable the pipeline schedule for the data drift detector
monitor = monitor.enable_schedule()

For a full example of setting up a timeseries dataset and data drift detector, see our example notebook.

Azure Machine Learning studio

Important

The functionality in this studio, https://ml.azure.com, is accessible from Enterprise workspaces only. Learn more about editions and upgrading.

To set up alerts on your dataset monitor, the workspace that contains the dataset you want to monitor must have Enterprise edition capabilities.

After the workspace functionality is confirmed, navigate to the studio's homepage and select the Datasets tab on the left. Select Dataset monitors.

[Screenshot: Monitor list]

Click the +Create monitor button and continue through the wizard by clicking Next.

[Screenshot: Create monitor wizard]

  • Select target dataset. The target dataset is a tabular dataset with a timestamp column specified, and it is the dataset that will be analyzed for data drift. The target dataset must have features in common with the baseline dataset, and should be a timeseries dataset to which new data is appended. Historical data in the target dataset can be analyzed, or new data can be monitored.

  • Select baseline dataset. Select the tabular dataset to be used as the baseline for comparison with the target dataset over time. The baseline dataset must have features in common with the target dataset. Select a time range to use a slice of the target dataset, or specify a separate dataset to use as the baseline.

  • Monitor settings. These settings are for the scheduled dataset monitor pipeline that will be created.

    | Setting | Description | Tips | Mutable |
    | ------ | ----------- | ---- | ------- |
    | Name | Name of the dataset monitor. | | No |
    | Features | List of features that will be analyzed for data drift over time. | Set to a model's output feature(s) to measure concept drift. Don't include features that naturally drift over time (month, year, index, etc.). You can backfill an existing data drift monitor after adjusting the list of features. | Yes |
    | Compute target | Azure Machine Learning compute target to run the dataset monitor jobs. | | Yes |
    | Enable | Enable or disable the schedule on the dataset monitor pipeline. | Disable the schedule to analyze historical data with the backfill setting. It can be enabled after the dataset monitor is created. | Yes |
    | Frequency | The frequency used to schedule the pipeline job and to analyze historical data when running a backfill. Options include daily, weekly, or monthly. | Each run compares data in the target dataset according to the frequency. Daily: compare the most recent complete day in the target dataset with the baseline. Weekly: compare the most recent complete week (Monday - Sunday) in the target dataset with the baseline. Monthly: compare the most recent complete month in the target dataset with the baseline. | No |
    | Latency | Time, in hours, it takes for data to arrive in the dataset. For instance, if it takes three days for data to arrive in the SQL DB the dataset encapsulates, set the latency to 72. | Cannot be changed after the dataset monitor is created. | No |
    | Email addresses | Email addresses for alerting based on breach of the data drift percentage threshold. | Emails are sent through Azure Monitor. | Yes |
    | Threshold | Data drift percentage threshold for email alerting. | Further alerts and events can be set on many other metrics in the workspace's associated Application Insights resource. | Yes |
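The Frequency and Latency settings together determine which slice of the target dataset a scheduled run compares with the baseline. The sketch below shows one plausible way to compute that slice with the standard library. The function name most_recent_complete_window is hypothetical, and the rule that latency is simply subtracted from the run time before picking the window is an assumption made for illustration; the service's exact scheduling logic is not described here.

```python
from datetime import datetime, timedelta


def most_recent_complete_window(now, frequency, latency_hours=0):
    """Hypothetical helper: return (start, end) of the most recent complete
    day, week (Monday - Sunday), or month, with `end` exclusive.

    Assumption: data newer than `now - latency_hours` has not arrived yet.
    """
    cutoff = now - timedelta(hours=latency_hours)
    midnight = cutoff.replace(hour=0, minute=0, second=0, microsecond=0)

    if frequency == "Day":
        end = midnight
        start = end - timedelta(days=1)
    elif frequency == "Week":
        # weekday() is 0 for Monday, so this is Monday 00:00 of the current week
        end = midnight - timedelta(days=midnight.weekday())
        start = end - timedelta(days=7)
    elif frequency == "Month":
        end = midnight.replace(day=1)
        # Step back one day into the previous month, then snap to its first day
        start = (end - timedelta(days=1)).replace(day=1)
    else:
        raise ValueError(f"Unsupported frequency: {frequency}")
    return start, end


run_time = datetime(2019, 5, 15, 10, 30)  # a Wednesday
print(most_recent_complete_window(run_time, "Week"))
print(most_recent_complete_window(run_time, "Week", latency_hours=72))
```

With no latency, the weekly window for this run time is Monday, May 6 through Sunday, May 12. With a latency of 72 hours the cutoff falls on the preceding Sunday, so the window moves back one week.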

After you finish the wizard, the resulting dataset monitor appears in the list. Select it to go to that monitor's details page.

Understand data drift results

This section shows you the results of monitoring a dataset, found on the Datasets / Dataset monitors page in Azure Machine Learning studio. On this page you can update the settings and analyze existing data for a specific time period.

Start with the top-level insights into the magnitude of data drift and a highlight of the features to investigate further.

[Screenshot: Drift overview]

| Metric | Description |
| ------ | ----------- |
| Data drift magnitude | A percentage of drift between the baseline and target dataset over time. It ranges from 0 to 100: 0 indicates identical datasets, and 100 indicates that the Azure Machine Learning data drift model can completely tell the two datasets apart. Some noise in the precise percentage is expected because machine learning techniques are used to generate this magnitude. |
| Top drifting features | Shows the features from the dataset that have drifted the most and therefore contribute the most to the drift magnitude metric. Due to covariate shift, the underlying distribution of a feature does not necessarily need to change for it to have relatively high feature importance. |
| Threshold | A data drift magnitude beyond the set threshold triggers alerts. The threshold can be configured in the monitor settings. |

Drift magnitude trend

See how the baseline dataset differs from the target dataset in the specified time period. The closer to 100%, the more the two datasets differ.

[Screenshot: Drift magnitude trend]

Drift magnitude by features

This section contains feature-level insights into the change in the selected feature's distribution, as well as other statistics, over time.

The target dataset is also profiled over time. For each feature, the statistical distance between its baseline distribution and its distribution in the target dataset is computed over time. Conceptually, this is similar to the data drift magnitude. However, this statistical distance is for an individual feature rather than all features. Min, max, and mean are also available.

In Azure Machine Learning studio, click a bar in the graph to see the feature-level details for that date. By default, you see the baseline dataset's distribution and the most recent run's distribution of the same feature.

[Screenshot: Drift magnitude by features]

These metrics can also be retrieved in the Python SDK through the get_metrics() method on a DataDriftDetector object.

Feature details

Finally, scroll down to view details for each individual feature. Use the dropdowns above the chart to select the feature and the metric you want to view.

[Screenshot: Numeric feature graph and comparison]

Metrics in the chart depend on the type of feature.

  • Numeric features

    | Metric | Description |
    | ------ | ----------- |
    | Wasserstein distance | Minimum amount of work to transform the baseline distribution into the target distribution. |
    | Mean value | Average value of the feature. |
    | Min value | Minimum value of the feature. |
    | Max value | Maximum value of the feature. |

  • Categorical features

    | Metric | Description |
    | ------ | ----------- |
    | Euclidean distance | Computed for categorical columns. Euclidean distance is computed on two vectors, generated from the empirical distribution of the same categorical column in the two datasets. 0 indicates there is no difference in the empirical distributions. The more it deviates from 0, the more this column has drifted. Trends can be observed from a time series plot of this metric and can be helpful in uncovering a drifting feature. |
    | Unique values | Number of unique values (cardinality) of the feature. |
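The two distance metrics above can be illustrated with a short standard-library sketch. This is a textbook formulation rather than the service's own code: the function names are hypothetical, and the use of relative frequencies as the "empirical distribution" vectors for categorical columns is an assumption, since the exact normalization is not specified here.

```python
import math
from bisect import bisect_right
from collections import Counter


def wasserstein_distance_1d(baseline, target):
    """1-D Wasserstein distance between two empirical samples, computed as
    the area between their empirical cumulative distribution functions."""
    b_sorted, t_sorted = sorted(baseline), sorted(target)
    all_values = sorted(b_sorted + t_sorted)
    distance = 0.0
    for left, right in zip(all_values, all_values[1:]):
        # Fraction of each sample that is <= left
        b_cdf = bisect_right(b_sorted, left) / len(b_sorted)
        t_cdf = bisect_right(t_sorted, left) / len(t_sorted)
        distance += abs(b_cdf - t_cdf) * (right - left)
    return distance


def categorical_euclidean_distance(baseline, target):
    """Euclidean distance between the relative-frequency vectors of one
    categorical column in two datasets (assumed normalization)."""
    categories = set(baseline) | set(target)
    b_counts, t_counts = Counter(baseline), Counter(target)
    return math.sqrt(sum(
        (b_counts[c] / len(baseline) - t_counts[c] / len(target)) ** 2
        for c in categories
    ))


# Numeric feature: every target value is 5 units above its baseline counterpart
print(wasserstein_distance_1d([0, 1, 3], [5, 6, 8]))

# Categorical feature: one category disappears from the target
print(categorical_euclidean_distance(["a", "a", "b", "b"], ["a", "a", "a", "a"]))
```

Both functions return 0 when the two samples have the same empirical distribution, which matches the reading of the metrics in the tables above: the further the value is from 0, the more the feature has drifted.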

On this chart, select a single date to compare the displayed feature's distribution between the target and that date. For numeric features, this shows two probability distributions. If the feature is non-numeric, a bar chart is shown.

[Screenshot: Select a date to compare to target]

Metrics, alerts, and events

Metrics can be queried in the Azure Application Insights resource associated with your machine learning workspace. This gives you access to all features of Application Insights, including custom alert rules and action groups that trigger an action such as an email, SMS, push notification, voice call, or Azure Function. Refer to the complete Application Insights documentation for details.

To get started, navigate to the Azure portal and select your workspace's Overview page. The associated Application Insights resource is on the far right:

[Screenshot: Azure portal overview]

Select Logs (Analytics) under Monitoring on the left pane:

[Screenshot: Application Insights overview]

The dataset monitor metrics are stored as customMetrics. After setting up a dataset monitor, you can write and run a query to view them:

[Screenshot: Log Analytics query]

After identifying the metrics you want to alert on, create a new alert rule:

[Screenshot: New alert rule]

You can use an existing action group, or create a new one, to define the action to take when the set conditions are met:

[Screenshot: New action group]

Next steps