Adding Log Analytics saved searches and alerts to a management solution (Preview)

Note

This is preliminary documentation for creating management solutions, which are currently in preview. Any schema described below is subject to change.

Management solutions typically include saved searches in Log Analytics to analyze data collected by the solution. They may also define alerts to notify the user or automatically take action in response to a critical issue. This article describes how to define Log Analytics saved searches and alerts in a Resource Manager template so they can be included in management solutions.

Note

The samples in this article use parameters and variables that are either required or common to management solutions and are described in Design and build a management solution in Azure.

Prerequisites

This article assumes that you're already familiar with how to create a management solution and the structure of a Resource Manager template and solution file.

Log Analytics workspace

All resources in Log Analytics are contained in a workspace. As described in Log Analytics workspace and Automation account, the workspace isn't included in the management solution but must exist before the solution is installed. If it isn't available, the solution install fails.

The name of the workspace is part of the name of each Log Analytics resource. This is done in the solution with the workspaceName parameter, as in the following example of a SavedSearch resource.

"name": "[concat(parameters('workspaceName'), '/', variables('SavedSearchId'))]"

Log Analytics API version

All Log Analytics resources defined in a Resource Manager template have an apiVersion property that defines the version of the API the resource should use.

The following table lists the API version for the resource used in this example.

Resource type    API version           Query
savedSearches    2017-03-15-preview    Event | where EventLevelName == "Error"
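
As a minimal sketch, the API version can be kept in a single variable so that every Log Analytics resource in the template references the same value. The variable name LogAnalyticsApiVersion matches the snippets in this article; the surrounding object is only there to make the fragment valid JSON on its own.

```json
{
    "variables": {
        "LogAnalyticsApiVersion": "2017-03-15-preview"
    }
}
```

A resource then picks the value up with "apiVersion": "[variables('LogAnalyticsApiVersion')]".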

Saved searches

Include saved searches in a solution to allow users to query data collected by your solution. Saved searches appear under Saved Searches in the Azure portal. A saved search is also required for each alert.

Log Analytics saved search resources have a type of Microsoft.OperationalInsights/workspaces/savedSearches and have the following structure. This includes common variables and parameters so that you can copy and paste this code snippet into your solution file and change the parameter names.

{
    "name": "[concat(parameters('workspaceName'), '/', variables('SavedSearch').Name)]",
    "type": "Microsoft.OperationalInsights/workspaces/savedSearches",
    "apiVersion": "[variables('LogAnalyticsApiVersion')]",
    "dependsOn": [
    ],
    "tags": { },
    "properties": {
        "etag": "*",
        "query": "[variables('SavedSearch').Query]",
        "displayName": "[variables('SavedSearch').DisplayName]",
        "category": "[variables('SavedSearch').Category]"
    }
}

Each property of a saved search is described in the following table.

Property       Description
category       The category for the saved search. Saved searches in the same solution often share a single category so that they are grouped together in the console.
displayName    Name to display for the saved search in the portal.
query          Query to run.
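
The snippet above reads its values from a SavedSearch variable. One possible shape for that variable is sketched below; the Name, DisplayName, and Category values are hypothetical, and the query is the one from the API version table earlier in this article.

```json
{
    "variables": {
        "SavedSearch": {
            "Name": "errorevents",
            "DisplayName": "Error events",
            "Query": "Event | where EventLevelName == \"Error\"",
            "Category": "Samples"
        }
    }
}
```

Grouping the values in one object keeps the resource definition free of hardcoded strings, so the same resource block can be reused for another search by pointing it at a different variable.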

Note

You may need to use escape characters in the query if it includes characters that could be interpreted as JSON. For example, if your query is AzureActivity | OperationName:"Microsoft.Compute/virtualMachines/write", write it in the solution file as AzureActivity | OperationName:\"Microsoft.Compute/virtualMachines/write\".
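
To illustrate the escaping, the fragment below shows how that query would look as a JSON string value, assuming it is assigned directly to the query property. Each double quote inside the query is preceded by a backslash so that it does not end the JSON string.

```json
{
    "query": "AzureActivity | OperationName:\"Microsoft.Compute/virtualMachines/write\""
}
```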

Alerts

Azure Log alerts are created by Azure Alert rules that run specified log queries at regular intervals. If the results of the query match specified criteria, an alert record is created and one or more actions are run using action groups.

For users that extend alerts to Azure, actions are now controlled in Azure action groups. When a workspace and its alerts are extended to Azure, you can retrieve or add actions by using the Action Group - Azure Resource Manager Template. Alert rules in a legacy management solution are made up of the following three resources.

  • Saved search. Defines the log search that is run. Multiple alert rules can share a single saved search.
  • Schedule. Defines how often the log search is run. Each alert rule has one and only one schedule.
  • Alert action. Each alert rule has one action group resource or action resource (legacy) with a type of Alert that defines the details of the alert, such as the criteria for when an alert record is created and the alert's severity. An action group resource can have a list of configured actions to take when the alert fires, such as a voice call, SMS, email, webhook, ITSM tool, Automation runbook, or logic app.

Saved search resources are described above. The other resources are described below.

Schedule resource

A saved search can have one or more schedules, with each schedule representing a separate alert rule. The schedule defines how often the search is run and the time interval over which the data is retrieved. Schedule resources have a type of Microsoft.OperationalInsights/workspaces/savedSearches/schedules and have the following structure. This includes common variables and parameters so that you can copy and paste this code snippet into your solution file and change the parameter names.

{
    "name": "[concat(parameters('workspaceName'), '/', variables('SavedSearch').Name, '/', variables('Schedule').Name)]",
    "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules",
    "apiVersion": "[variables('LogAnalyticsApiVersion')]",
    "dependsOn": [
        "[concat('Microsoft.OperationalInsights/workspaces/', parameters('workspaceName'), '/savedSearches/', variables('SavedSearch').Name)]"
    ],
    "properties": {
        "etag": "*",
        "interval": "[variables('Schedule').Interval]",
        "queryTimeSpan": "[variables('Schedule').TimeSpan]",
        "enabled": "[variables('Schedule').Enabled]"
    }
}

The properties for schedule resources are described in the following table.

Element name     Required    Description
enabled          Yes         Specifies whether the alert is enabled when it's created.
interval         Yes         How often the query runs, in minutes.
queryTimeSpan    Yes         Length of time, in minutes, over which to evaluate results.

The schedule resource should depend on the saved search so that the saved search is created before the schedule.

Note

The schedule name must be unique in a given workspace; two schedules cannot have the same ID even if they are associated with different saved searches. Also, the names of all saved searches, schedules, and actions created with the Log Analytics API must be lowercase.
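
One way to satisfy the lowercase and uniqueness rules is to generate the schedule name in a variable, as the full sample later in this article does. The sketch below assumes a Schedule variable shaped to match the schedule snippet above; the 'schedule-' prefix and the interval values are illustrative only.

```json
{
    "variables": {
        "Schedule": {
            "Name": "[toLower(concat('schedule-', uniqueString(resourceGroup().id, deployment().name)))]",
            "Interval": 15,
            "TimeSpan": 60,
            "Enabled": true
        }
    }
}
```

Here uniqueString derives a hash from the resource group ID and deployment name, and toLower guards against uppercase characters in the prefix. Whether that seed is unique enough for your workspace depends on how the solution is deployed, so treat it as a starting point.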

Actions

A schedule can have multiple actions. An action may define one or more processes to perform, such as sending a mail or starting a runbook, or it may define a threshold that determines when the results of a search match some criteria. Some actions define both so that the processes are performed when the threshold is met. Actions can be defined using an action group resource or an action resource.

There are two types of action resource, specified by the Type property. A schedule requires one Alert action, which defines the details of the alert rule and what actions are taken when an alert is created. Action resources have a type of Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions.

Alert actions have the following structure. This includes common variables and parameters so that you can copy and paste this code snippet into your solution file and change the parameter names.

{
    "name": "[concat(parameters('workspaceName'), '/', variables('SavedSearch').Name, '/', variables('Schedule').Name, '/', variables('Alert').Name)]",
    "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions",
    "apiVersion": "[variables('LogAnalyticsApiVersion')]",
    "dependsOn": [
        "[concat('Microsoft.OperationalInsights/workspaces/', parameters('workspaceName'), '/savedSearches/', variables('SavedSearch').Name, '/schedules/', variables('Schedule').Name)]"
    ],
    "properties": {
        "etag": "*",
        "type": "Alert",
        "name": "[variables('Alert').Name]",
        "description": "[variables('Alert').Description]",
        "severity": "[variables('Alert').Severity]",
        "threshold": {
            "operator": "[variables('Alert').Threshold.Operator]",
            "value": "[variables('Alert').Threshold.Value]",
            "metricsTrigger": {
                "triggerCondition": "[variables('Alert').Threshold.Trigger.Condition]",
                "operator": "[variables('Alert').Threshold.Trigger.Operator]",
                "value": "[variables('Alert').Threshold.Trigger.Value]"
            }
        },
        "AzNsNotification": {
            "GroupIds": "[variables('Alert').AzNsNotification.GroupIds]",
            "CustomEmailSubject": "[variables('Alert').AzNsNotification.CustomEmailSubject]",
            "CustomWebhookPayload": "[variables('Alert').AzNsNotification.CustomWebhookPayload]"
        }
    }
}

The properties for Alert action resources are described in the following tables.

Element name    Required    Description
type            Yes         Type of the action. This is Alert for alert actions.
name            Yes         Display name for the alert. This is the name that's displayed in the console for the alert rule.
description     No          Optional description of the alert.
severity        Yes         Severity of the alert record. One of: critical, warning, informational.

Threshold

This section is required. It defines the properties for the alert threshold.

Element name    Required    Description
Operator        Yes         Operator for the comparison. One of: gt (greater than), lt (less than).
Value           Yes         The value to compare the results against.
MetricsTrigger

This section is optional. Include it for a metric measurement alert.

Element name        Required    Description
TriggerCondition    Yes         Specifies whether the threshold is for the total number of breaches or for consecutive breaches. One of: Total, Consecutive.
Operator            Yes         Operator for the comparison. One of: gt (greater than), lt (less than).
Value               Yes         Number of times the criteria must be met to trigger the alert.
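
As an illustration of how the two tables combine, the fragment below sketches a Threshold section for a metric measurement alert, using the same literal values as the full sample later in this article. As we read the tables above, each aggregated value is compared against the threshold (greater than 3), and the alert is triggered after more than 3 consecutive breaches.

```json
{
    "Threshold": {
        "Operator": "gt",
        "Value": 3,
        "MetricsTrigger": {
            "TriggerCondition": "Consecutive",
            "Operator": "gt",
            "Value": 3
        }
    }
}
```

For an alert based on the number of results rather than a metric measurement, the MetricsTrigger element would simply be left out.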

Throttling

This section is optional. Include this section if you want to suppress alerts from the same rule for some amount of time after an alert is created.

Element name         Required                                 Description
DurationInMinutes    Yes, if Throttling element is included    Number of minutes to suppress alerts after one from the same alert rule is created.
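
A minimal sketch of the Throttling element follows, assuming it is placed in the properties of the alert action next to the threshold, as in the full sample later in this article. The 60-minute value is illustrative.

```json
{
    "Throttling": {
        "DurationInMinutes": 60
    }
}
```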

Azure action group

All alerts in Azure use action groups as the default mechanism for handling actions. With an action group, you can specify your actions once and then associate the action group with multiple alerts across Azure, without having to declare the same actions repeatedly. Action groups support multiple actions, including email, SMS, voice call, ITSM connection, Automation runbook, and webhook URI.

For users who have extended their alerts into Azure, a schedule should now have action group details passed along with the threshold in order to create an alert. Email details, webhook URLs, runbook automation details, and other actions need to be defined inside an action group before the alert is created. You can create an action group from Azure Monitor in the portal or use the Action Group - Resource Template.

Element name            Required    Description
AzNsNotification        Yes         The resource ID of the Azure action group to associate with the alert, so that the necessary actions are taken when the alert criteria are met.
CustomEmailSubject      No          Custom subject line of the mail sent to all addresses specified in the associated action group.
CustomWebhookPayload    No          Customized payload to be sent to all webhook endpoints defined in the associated action group. The format depends on what the webhook expects and should be valid serialized JSON.
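
The fragment below sketches an AzNsNotification element with all three settings. It assumes an actiongroup parameter that carries the action group's resource ID, as in the sample that follows; the subject line and the payload contents are hypothetical. Because the payload must be serialized JSON, it is written as a string with its inner quotes escaped.

```json
{
    "AzNsNotification": {
        "GroupIds": [
            "[parameters('actiongroup')]"
        ],
        "CustomEmailSubject": "Sample alert",
        "CustomWebhookPayload": "{\"text\":\"Sample alert fired\"}"
    }
}
```

The fields inside the payload depend entirely on the receiving webhook, so replace the example object with whatever that endpoint expects.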

Sample

Following is a sample of a solution that includes the following resources:

  • Saved search
  • Schedule
  • Alert action (associated with an existing action group passed in the actiongroup parameter)

The sample uses standard solution parameters and variables that would commonly be used in a solution, as opposed to hardcoding values in the resource definitions.

{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0",
    "parameters": {
        "workspaceName": {
            "type": "string",
            "metadata": {
                "Description": "Name of Log Analytics workspace"
            }
        },
        "workspaceregionId": {
            "type": "string",
            "metadata": {
                "Description": "Region of Log Analytics workspace"
            }
        },
        "actiongroup": {
            "type": "string",
            "metadata": {
                "Description": "List of action groups for alert actions separated by semicolon"
            }
        }
    },
    "variables": {
        "SolutionName": "MySolution",
        "SolutionVersion": "1.0",
        "SolutionPublisher": "Contoso",
        "ProductName": "SampleSolution",
        "LogAnalyticsApiVersion-Search": "2017-03-15-preview",
        "LogAnalyticsApiVersion-Solution": "2015-11-01-preview",
        "MySearch": {
            "displayName": "Error records by hour",
            "query": "MyRecord_CL | summarize AggregatedValue = avg(Rating_d) by Instance_s, bin(TimeGenerated, 60m)",
            "category": "Samples",
            "name": "Samples-Count of data"
        },
        "MyAlert": {
            "Name": "[toLower(concat('myalert-',uniqueString(resourceGroup().id, deployment().name)))]",
            "DisplayName": "My alert rule",
            "Description": "Sample alert. Fires when 3 error records found over hour interval.",
            "Severity": "critical",
            "ThresholdOperator": "gt",
            "ThresholdValue": 3,
            "Schedule": {
                "Name": "[toLower(concat('myschedule-',uniqueString(resourceGroup().id, deployment().name)))]",
                "Interval": 15,
                "TimeSpan": 60
            },
            "MetricsTrigger": {
                "TriggerCondition": "Consecutive",
                "Operator": "gt",
                "Value": 3
            },
            "ThrottleMinutes": 60,
            "AzNsNotification": {
                "GroupIds": [
                    "[parameters('actiongroup')]"
                ],
                "CustomEmailSubject": "Sample alert"
            }
        }
    },
    "resources": [
        {
            "name": "[concat(variables('SolutionName'), '[' ,parameters('workspacename'), ']')]",
            "location": "[parameters('workspaceRegionId')]",
            "tags": { },
            "type": "Microsoft.OperationsManagement/solutions",
            "apiVersion": "[variables('LogAnalyticsApiVersion-Solution')]",
            "dependsOn": [
                "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches', parameters('workspacename'), variables('MySearch').Name)]",
                "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches/schedules', parameters('workspacename'), variables('MySearch').Name, variables('MyAlert').Schedule.Name)]",
                "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions', parameters('workspacename'), variables('MySearch').Name, variables('MyAlert').Schedule.Name, variables('MyAlert').Name)]"
            ],
            "properties": {
                "workspaceResourceId": "[resourceId('Microsoft.OperationalInsights/workspaces', parameters('workspacename'))]",
                "referencedResources": [
                ],
                "containedResources": [
                    "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches', parameters('workspacename'), variables('MySearch').Name)]",
                    "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches/schedules', parameters('workspacename'), variables('MySearch').Name, variables('MyAlert').Schedule.Name)]",
                    "[resourceId('Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions', parameters('workspacename'), variables('MySearch').Name, variables('MyAlert').Schedule.Name, variables('MyAlert').Name)]"
                ]
            },
            "plan": {
                "name": "[concat(variables('SolutionName'), '[' ,parameters('workspaceName'), ']')]",
                "Version": "[variables('SolutionVersion')]",
                "product": "[variables('ProductName')]",
                "publisher": "[variables('SolutionPublisher')]",
                "promotionCode": ""
            }
        },
        {
            "name": "[concat(parameters('workspaceName'), '/', variables('MySearch').Name)]",
            "type": "Microsoft.OperationalInsights/workspaces/savedSearches",
            "apiVersion": "[variables('LogAnalyticsApiVersion-Search')]",
            "dependsOn": [ ],
            "tags": { },
            "properties": {
                "etag": "*",
                "query": "[variables('MySearch').query]",
                "displayName": "[variables('MySearch').displayName]",
                "category": "[variables('MySearch').category]"
            }
        },
        {
            "name": "[concat(parameters('workspaceName'), '/', variables('MySearch').Name, '/', variables('MyAlert').Schedule.Name)]",
            "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules",
            "apiVersion": "[variables('LogAnalyticsApiVersion-Search')]",
            "dependsOn": [
                "[concat('Microsoft.OperationalInsights/workspaces/', parameters('workspaceName'), '/savedSearches/', variables('MySearch').Name)]"
            ],
            "properties": {
                "etag": "*",
                "interval": "[variables('MyAlert').Schedule.Interval]",
                "queryTimeSpan": "[variables('MyAlert').Schedule.TimeSpan]",
                "enabled": true
            }
        },
        {
            "name": "[concat(parameters('workspaceName'), '/', variables('MySearch').Name, '/', variables('MyAlert').Schedule.Name, '/', variables('MyAlert').Name)]",
            "type": "Microsoft.OperationalInsights/workspaces/savedSearches/schedules/actions",
            "apiVersion": "[variables('LogAnalyticsApiVersion-Search')]",
            "dependsOn": [
                "[concat('Microsoft.OperationalInsights/workspaces/', parameters('workspaceName'), '/savedSearches/', variables('MySearch').Name, '/schedules/', variables('MyAlert').Schedule.Name)]"
            ],
            "properties": {
                "etag": "*",
                "Type": "Alert",
                "Name": "[variables('MyAlert').DisplayName]",
                "Description": "[variables('MyAlert').Description]",
                "Severity": "[variables('MyAlert').Severity]",
                "Threshold": {
                    "Operator": "[variables('MyAlert').ThresholdOperator]",
                    "Value": "[variables('MyAlert').ThresholdValue]",
                    "MetricsTrigger": {
                        "TriggerCondition": "[variables('MyAlert').MetricsTrigger.TriggerCondition]",
                        "Operator": "[variables('MyAlert').MetricsTrigger.Operator]",
                        "Value": "[variables('MyAlert').MetricsTrigger.Value]"
                    }
                },
                "Throttling": {
                    "DurationInMinutes": "[variables('MyAlert').ThrottleMinutes]"
                },
                "AzNsNotification": {
                    "GroupIds": "[variables('MyAlert').AzNsNotification.GroupIds]",
                    "CustomEmailSubject": "[variables('MyAlert').AzNsNotification.CustomEmailSubject]"
                }
            }
        }
    ]
}

The following parameter file provides sample values for this solution.

{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "workspacename": {
            "value": "myWorkspace"
        },
        "workspaceregionId": {
            "value": "China East 2"
        },
        "actiongroup": {
            "value": "/subscriptions/3b540246-808d-4331-99aa-917b808a9166/resourcegroups/myTestGroup/providers/microsoft.insights/actiongroups/sample"
        }
    }
}

Next steps