Set Alerts in Application Insights

Azure Application Insights can alert you to changes in performance or usage metrics in your web app.

Application Insights monitors your live app on a wide variety of platforms to help you diagnose performance issues and understand usage patterns.

There are multiple types of alerts:

  • Metric alerts tell you when a metric, such as response time, exception count, CPU usage, or page views, crosses a threshold value for some period.
  • Log alerts are alerts where the alert signal is based on a custom Kusto query.
  • Web tests tell you when your site is unavailable on the internet, or responding slowly.
  • Proactive diagnostics are configured automatically to notify you about unusual performance patterns.

Set a Metric alert

Open the Alert rules tab, and then select Add alert.

  • Set the resource before the other properties. Choose the "(components)" resource if you want to set alerts on performance or usage metrics.
  • The name that you give to the alert must be unique within the resource group (not just your application).
  • Note the units in which you're asked to enter the threshold value.
  • If you check the "Email owners..." box, alerts are sent by email to everyone who has access to this resource group. To expand this set of people, add them to the resource group or subscription (not the resource).
  • If you specify "Additional emails", alerts are sent to those individuals or groups (whether or not you checked the "Email owners..." box).
  • Set a webhook address if you have set up a web app that responds to alerts. It is called both when the alert is Activated and when it is Resolved. (Note that at present, query parameters are not passed through as webhook properties.)
  • You can Disable or Enable the alert by using the buttons at the top.
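
Because the webhook is called both on activation and on resolution, the receiving app has to branch on the alert status. The following is a minimal sketch of that branching, not the actual payload contract: it assumes a JSON body with a top-level "status" field and a "context" object that carries the alert "name". The function name and the returned strings are hypothetical, so check the payload your alert actually sends before relying on these field names.

```python
import json


def describe_alert_event(body: str) -> str:
    """Summarize one webhook call made by an alert.

    Assumption: the body is JSON with a top-level "status" field
    ("Activated" or "Resolved") and a "context" object holding "name".
    """
    payload = json.loads(body)
    status = payload.get("status", "Unknown")
    name = payload.get("context", {}).get("name", "<unnamed alert>")
    if status == "Activated":
        # The alert condition became true: open an incident.
        return f"OPEN incident for {name}"
    if status == "Resolved":
        # The alert condition is no longer true: close the incident.
        return f"CLOSE incident for {name}"
    # Anything else is left for a human to look at.
    return f"IGNORE {status} event for {name}"
```

Since query parameters are not passed through as webhook properties, this sketch reads everything it needs from the request body.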

I don't see the Add Alert button.

  • Are you using an organizational account? You can set alerts if you have owner or contributor access to this application resource. Take a look at the Access Control tab.

Note

In the alerts blade, you see that there's already an alert set up: Proactive Diagnostics. The automatic alert monitors one particular metric, request failure rate. Unless you decide to disable the proactive alert, you don't need to set your own alert on request failure rate.

See your alerts

You get an email when an alert changes state between inactive and active.

The current state of each alert is shown in the Alert rules tab.

There's a summary of recent activity in the alerts drop-down.

The history of state changes is in the Activity Log. To open it from the Overview tab, select Settings, and then Audit logs.

How alerts work

  • An alert has three states: "Never activated", "Activated", and "Resolved". Activated means the condition you specified was true when it was last evaluated.

  • A notification is generated when an alert changes state. (If the alert condition was already true when you created the alert, you might not get a notification until the condition goes false.)

  • Each notification generates an email if you checked the emails box, or provided email addresses. You can also look at the Notifications drop-down list.

  • An alert is evaluated each time a metric arrives, but not otherwise.

  • The evaluation aggregates the metric over the preceding period, and then compares it to the threshold to determine the new state.

  • The period that you choose specifies the interval over which metrics are aggregated. It doesn't affect how often the alert is evaluated: that depends on the frequency of arrival of metrics.

  • If no data arrives for a particular metric for some time, the gap has different effects on alert evaluation and on the charts in metric explorer. In metric explorer, if no data is seen for longer than the chart's sampling interval, the chart shows a value of 0. But an alert based on the same metric is not reevaluated, and the alert's state remains unchanged.

    When data eventually arrives, the chart jumps back to a non-zero value. The alert evaluates based on the data available for the period you specified. If the new data point is the only one available in the period, the aggregate is based just on that data point.

  • An alert can flicker frequently between alert and healthy states, even if you set a long period. This can happen if the metric value hovers around the threshold. There is no hysteresis in the threshold: the transition to alert happens at the same value as the transition to healthy.
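
The evaluation rules above can be sketched as a small state machine. This is an illustrative model, not the service's implementation: the class name, the choice of an average as the aggregate, and the "greater than" comparison are all assumptions made for the example. It evaluates only when a data point arrives, aggregates the points that fall inside the preceding period, and reports a state only when the state changes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class MetricAlert:
    """Hypothetical model of a metric alert (assumes average aggregation)."""

    threshold: float
    period: float  # aggregation window, in seconds
    state: str = "Never activated"
    samples: List[Tuple[float, float]] = field(default_factory=list)  # (time, value)

    def on_metric(self, time: float, value: float) -> Optional[str]:
        """Evaluate on arrival of a data point.

        Returns the new state if it changed (a notification), else None.
        """
        self.samples.append((time, value))
        # Keep only the points that fall inside the preceding period.
        self.samples = [(t, v) for t, v in self.samples if t > time - self.period]
        average = sum(v for _, v in self.samples) / len(self.samples)
        if average > self.threshold:
            new_state = "Activated"
        elif self.state == "Never activated":
            # The condition has never been true, so there is nothing to resolve.
            new_state = "Never activated"
        else:
            new_state = "Resolved"
        if new_state == self.state:
            return None  # no state change, so no notification
        self.state = new_state
        return new_state
```

In this model nothing happens during a gap in the data, because `on_metric` is simply never called, so the state stays as it was. It also has no hysteresis: with a threshold of 100 and a 1-second period, points of 101, 99, and 101 arriving 10 seconds apart produce Activated, Resolved, and Activated in turn.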

What are good alerts to set?

It depends on your application. To start with, it's best not to set too many metrics. Spend some time looking at your metric charts while your app is running, to get a feel for how it behaves normally. This practice helps you find ways to improve its performance. Then set up alerts to tell you when the metrics go outside the normal zone.
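
One way to turn "the normal zone" into a number is to derive a threshold from values you have already observed. The sketch below uses the mean plus a few standard deviations. That is a common heuristic chosen for illustration, not something Application Insights prescribes, and the function name and the default of three standard deviations are assumptions.

```python
import statistics
from typing import Sequence


def suggest_threshold(samples: Sequence[float], sigmas: float = 3.0) -> float:
    """Return the mean plus `sigmas` sample standard deviations.

    `samples` are metric values read off your charts during normal
    operation; at least two values are needed to compute a deviation.
    """
    return statistics.mean(samples) + sigmas * statistics.stdev(samples)
```

For example, response times of 100, 110, 90, 105, and 95 ms give a suggested threshold of roughly 123.7 ms. Treat the result as a starting point and adjust it once you see how often the alert fires.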

Popular alerts include:

  • Browser metrics, especially browser page load times, are good for web applications. If your page has many scripts, you should look for browser exceptions. To get these metrics and alerts, you have to set up web page monitoring.
  • Server response time is good for the server side of web applications. As well as setting up alerts, keep an eye on this metric to see if it varies disproportionately with high request rates: variation might indicate that your app is running out of resources.
  • Server exceptions. To see them, you have to do some additional setup.

Don't forget that proactive failure rate diagnostics automatically monitor the rate at which your app responds to requests with failure codes.

In this section, we go through how to set a query-based exception alert. For this example, let's say we want an alert when the failure rate is greater than 10% in the last 24 hours.

  1. Go to your Application Insights resource in the Azure portal.

  2. On the left, under Configure, click Alerts.

  3. At the top of the Alerts tab, select New alert rule.

  4. Your resource should be auto-selected. To set a condition, click Add condition.

  5. In the Configure signal logic tab, select Custom log search.

  6. In the Custom log search tab, enter your query in the "Search query" box. For this example, we use the Kusto query below.

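    // Alert threshold (in percent) and look-back window.
    let percentthreshold = 10;
    let period = 24h;
    requests
    | where timestamp > ago(period)
    // Sum itemCount rather than counting rows, so that sampled telemetry is weighted correctly.
    | summarize requestsCount = sum(itemCount)
    | project requestsCount, exceptionsCount = toscalar(exceptions | where timestamp > ago(period) | summarize sum(itemCount))
    | extend exceptionsRate = toreal(exceptionsCount) / toreal(requestsCount) * 100
    // Return a row only when the rate exceeds the threshold.
    | where exceptionsRate > percentthreshold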

    Note

    You can also apply these steps to other types of query-based alerts. You can learn more about the Kusto query language from the Kusto getting started documentation or the SQL to Kusto cheat sheet.

  7. Under "Alert logic", choose whether it's based on number of results or metric measurement. Then pick the condition (greater than, equal to, less than) and a threshold. As you change these values, the condition preview sentence changes. In this example, we use "equal to".

  8. Under "Evaluated based on", set the period and frequency. The period here must match the value that we set for period in the query above. Then click Done.

  9. You now see the condition you created, with the estimated monthly cost. Below, under "Action Groups", you can create a new group or select an existing one. If you want, you can customize the actions.

  10. Finally, add your alert details (alert rule name, description, severity). When you are done, click Create alert rule at the bottom.
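
To sanity-check the threshold before you create the rule, it can help to work through the arithmetic of the query by hand. The sketch below mirrors the query's rate calculation in Python. The function names are hypothetical, and returning no result when there are zero requests is a choice made for this sketch, not a statement about how Kusto handles that case.

```python
from typing import Optional


def exceptions_rate(requests_count: int, exceptions_count: int) -> Optional[float]:
    """Exceptions per 100 requests, or None when there were no requests."""
    if requests_count == 0:
        return None
    return exceptions_count / requests_count * 100


def query_returns_row(requests_count: int, exceptions_count: int,
                      percent_threshold: float = 10) -> bool:
    """True when the rate exceeds the threshold, as in the query's final filter."""
    rate = exceptions_rate(requests_count, exceptions_count)
    return rate is not None and rate > percent_threshold
```

With 2,000 requests and 250 exceptions in the period, the rate is 12.5%, so the query returns a row. With 2,000 requests and 100 exceptions, the rate is 5% and the query returns nothing.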

How to unsubscribe from classic alert e-mail notifications

This section applies to classic availability alerts, classic Application Insights metric alerts, and classic failure anomalies alerts.

You are receiving e-mail notifications for these classic alerts if any of the following applies:

  • Your e-mail address is listed in the Notification e-mail recipients field in the alert rule settings.

  • The option to send e-mail notifications to users holding certain roles for the subscription is activated, and you hold a respective role for that particular Azure subscription.

To better control your security and privacy, we generally recommend that you explicitly specify the notification recipients for your classic alerts in the Notification e-mail recipients field. The option to notify all users holding certain roles is provided for backward compatibility.

To unsubscribe from e-mail notifications generated by a certain alert rule, remove your e-mail address from the Notification e-mail recipients field.

If your e-mail address is not listed explicitly, we recommend that you disable the option to notify all members of certain roles automatically, and instead list the e-mail addresses of all users who need to receive notifications for that alert rule in the Notification e-mail recipients field.

Who receives the (classic) alert notifications?

This section only applies to classic alerts and will help you optimize your alert notifications to ensure that only your desired recipients receive notifications. To understand more about the difference between classic alerts and the new alerts experience, refer to the alerts overview article. To control alert notification in the new alerts experience, use action groups.

  • We recommend the use of specific recipients for classic alert notifications.

  • For alerts on any Application Insights metrics (including availability metrics), the bulk/group check-box option, if enabled, sends to users with owner, contributor, or reader roles in the subscription. In effect, all users with access to the subscription that contains the Application Insights resource are in scope and will receive notifications.

Note

If you currently use the bulk/group check-box option, and disable it, you will not be able to revert the change.

Use the new alert experience/near-realtime alerts if you need to notify users based on their roles. With action groups, you can configure email notifications to users with any of the contributor/owner/reader roles (not combined together as a single option).

Automation

See also