Get started with log queries in Azure Monitor

Note

You should complete Get started with Azure Monitor Log Analytics before completing this tutorial.

Note

You can work through this exercise in your own Log Analytics environment, or you can use our Demo environment, which includes plenty of sample data.

In this tutorial you will learn to write log queries in Azure Monitor. It will teach you how to:

  • Understand the structure of a query
  • Sort query results
  • Filter query results
  • Specify a time range
  • Select which fields to include in the results
  • Define and use custom fields
  • Aggregate and group results

For a tutorial on using Log Analytics in the Azure portal, see Get started with Azure Monitor Log Analytics.
For more details on log queries in Azure Monitor, see Overview of log queries in Azure Monitor.

Writing a new query

Queries can start with either a table name or the search command. Prefer starting with a table name, since it defines a clear scope for the query and improves both query performance and the relevance of the results.

Note

The Kusto query language used by Azure Monitor is case-sensitive. Language keywords are typically written in lowercase. When using names of tables or columns in a query, make sure to use the correct case, as shown on the schema pane.

Table-based queries

Azure Monitor organizes log data in tables, each composed of multiple columns. All tables and columns are shown on the schema pane in Log Analytics in the Analytics portal. Identify a table that you're interested in and then take a look at a bit of data:

SecurityEvent
| take 10

The query shown above returns 10 results from the SecurityEvent table, in no specific order. This is a very common way to take a glance at a table and understand its structure and content. Let's examine how it's built:

  • The query starts with the table name SecurityEvent. This part defines the scope of the query.
  • The pipe (|) character separates commands, so the output of the first one is the input of the following command. You can add any number of piped elements.
  • Following the pipe is the take command, which returns a specific number of arbitrary records from the table.

We could actually run the query even without adding | take 10. That would still be valid, but it could return up to 10,000 results.
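Before pulling records, it can help to know how large the table is. As a minimal sketch, the count operator returns the number of records in scope instead of the records themselves. The number returned depends on your workspace and the selected time range.

```kusto
// Count the records in SecurityEvent within the current time range
SecurityEvent
| count
```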

Search queries

Search queries are less structured, and generally more suited for finding records that include a specific value in any of their columns:

search in (SecurityEvent) "Cryptographic"
| take 10

This query searches the SecurityEvent table for records that contain the phrase "Cryptographic". Of those records, 10 will be returned and displayed. If we omit the in (SecurityEvent) part and just run search "Cryptographic", the search goes over all tables, which takes longer and is less efficient.

Warning

Search queries are typically slower than table-based queries because they have to process more data.
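When you already know which column holds the value, a table-based filter is usually a cheaper alternative to search. The sketch below assumes the term appears in the Activity column. That is an illustrative assumption, and the query will miss records where the term appears only in other columns.

```kusto
// Table-based alternative to search: look for the term in one known column
SecurityEvent
| where Activity has "Cryptographic"
| take 10
```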

Sort and top

While take is useful to get a few records, the results are selected and displayed in no particular order. To get an ordered view, you could sort by the preferred column:

SecurityEvent   
| sort by TimeGenerated desc

That could return too many results, though, and might also take some time. The query above sorts the entire SecurityEvent table by the TimeGenerated column, and the Analytics portal then limits the display to only 10,000 records. This approach is not optimal.

The best way to get only the latest 10 records is to use top, which sorts the entire table on the server side and then returns the top records:

SecurityEvent
| top 10 by TimeGenerated

Descending is the default sorting order, so we typically omit the desc argument. The output will look like this:

[Screenshot: top 10 records]
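Because descending is the default, the order has to be stated explicitly when you want the opposite end of the range. A minimal sketch that returns the 10 oldest records in the selected time range:

```kusto
// Oldest 10 records: override the default descending order with asc
SecurityEvent
| top 10 by TimeGenerated asc
```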

Where: filtering on a condition

Filters, as indicated by their name, filter the data by a specific condition. This is the most common way to limit query results to relevant information.

To add a filter to a query, use the where operator followed by one or more conditions. For example, the following query returns only SecurityEvent records where Level equals 8:

SecurityEvent
| where Level == 8

When writing filter conditions, you can use the following expressions:

Expression   Description                                        Example
==           Check equality (case-sensitive)                    Level == 8
=~           Check equality (case-insensitive)                  EventSourceName =~ "microsoft-windows-security-auditing"
!=, <>       Check inequality (both expressions are identical)  Level != 4
and, or      Required between conditions                        Level == 16 or CommandLine != ""

To filter by multiple conditions, you can either use and:

SecurityEvent
| where Level == 8 and EventID == 4672

or pipe multiple where elements one after the other:

SecurityEvent
| where Level == 8 
| where EventID == 4672

Note

Values can have different types, so you might need to cast them to perform the comparison on the correct type. For example, the SecurityEvent Level column is of type String, so you must cast it to a numerical type such as int or long before you can use numerical operators on it: SecurityEvent | where toint(Level) >= 10
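The expressions and the cast described above can be combined in one query. The following is a sketch only: the event IDs and the threshold are illustrative values chosen for the example, not recommendations.

```kusto
SecurityEvent
// Case-insensitive equality on a string column
| where EventSourceName =~ "microsoft-windows-security-auditing"
// Either of two event IDs (illustrative values)
| where EventID == 4672 or EventID == 4624
// Cast the string column before using a numeric operator
| where toint(Level) >= 8
```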

Specify a time range

Time picker

The time picker is next to the Run button and indicates that we're querying only records from the last 24 hours. This is the default time range applied to all queries. To get only records from the last hour, select Last hour and run the query again.

[Screenshot: time picker]

Time filter in query

You can also define your own time range by adding a time filter to the query. It's best to place the time filter immediately after the table name:

SecurityEvent
| where TimeGenerated > ago(30m) 
| where toint(Level) >= 10

In the time filter above, ago(30m) means "30 minutes ago", so this query returns only records from the last 30 minutes. Other units of time include days (2d), minutes (25m), and seconds (10s).
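A time filter can also bound the range on both ends. As a sketch, the between operator below keeps only records generated from two days ago up to one day ago. The window itself is an arbitrary choice for illustration.

```kusto
// Records generated between 2 days ago and 1 day ago
SecurityEvent
| where TimeGenerated between (ago(2d) .. ago(1d))
| take 10
```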

Project and Extend: select and compute columns

Use project to select specific columns to include in the results:

SecurityEvent 
| top 10 by TimeGenerated 
| project TimeGenerated, Computer, Activity

The preceding example generates this output:

[Screenshot: query project results]

You can also use project to rename columns and define new ones. The following example uses project to do the following:

  • Select only the Computer and TimeGenerated original columns.
  • Rename the Activity column to EventDetails.
  • Create a new column named EventCode. The substring() function is used to get only the first four characters from the Activity field.

SecurityEvent
| top 10 by TimeGenerated 
| project Computer, TimeGenerated, EventDetails=Activity, EventCode=substring(Activity, 0, 4)

extend keeps all original columns in the result set and defines additional ones. The following query uses extend to add the EventCode column. Note that this column may not be displayed at the end of the table results, in which case you need to expand the details of a record to view it.

SecurityEvent
| top 10 by TimeGenerated
| extend EventCode=substring(Activity, 0, 4)
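The two operators also work together: extend computes the new column, and a later project trims the result to the columns you want to see, which keeps EventCode visible without expanding each record. A minimal sketch using the same columns as above:

```kusto
SecurityEvent
| top 10 by TimeGenerated
// Compute the new column while all original columns are still available
| extend EventCode = substring(Activity, 0, 4)
// Keep only the columns of interest, including the computed one
| project TimeGenerated, Computer, EventCode
```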

Summarize: aggregate groups of rows

Use summarize to identify groups of records, according to one or more columns, and apply aggregations to them. The most common use of summarize is count, which returns the number of results in each group.

The following query reviews all Perf records from the last hour, groups them by ObjectName, and counts the records in each group:

Perf
| where TimeGenerated > ago(1h)
| summarize count() by ObjectName

Sometimes it makes sense to define groups by multiple dimensions. Each unique combination of these values defines a separate group:

Perf
| where TimeGenerated > ago(1h)
| summarize count() by ObjectName, CounterName

Another common use is to perform mathematical or statistical calculations on each group. For example, the following query calculates the average CounterValue for each computer:

Perf
| where TimeGenerated > ago(1h)
| summarize avg(CounterValue) by Computer

Unfortunately, the results of this query are meaningless, since we mixed together different performance counters. To make this more meaningful, we should calculate the average separately for each combination of CounterName and Computer:

Perf
| where TimeGenerated > ago(1h)
| summarize avg(CounterValue) by Computer, CounterName
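A single summarize can compute several aggregations at once and give each result column a name. In the sketch below, AvgValue and Samples are column names made up for this example. The sample count shows how many records each average is based on.

```kusto
Perf
| where TimeGenerated > ago(1h)
// Two named aggregations per (Computer, CounterName) group
| summarize AvgValue = avg(CounterValue), Samples = count() by Computer, CounterName
| sort by Computer asc, CounterName asc
```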

Summarize by a time column

Grouping results can also be based on a time column, or another continuous value. Simply summarizing by TimeGenerated, though, would create groups for every single millisecond over the time range, since these are unique values.

To create groups based on continuous values, it is best to break the range into manageable units using bin. The following query analyzes Perf records that measure free memory (Available MBytes) on a specific computer. It calculates the average value of each 1-hour period over the last 7 days:

Perf 
| where TimeGenerated > ago(7d)
| where Computer == "ContosoAzADDS2" 
| where CounterName == "Available MBytes" 
| summarize avg(CounterValue) by bin(TimeGenerated, 1h)

To make the output clearer, display it as a time chart that shows the available memory over time:

[Screenshot: query memory over time]
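The chart type can also be requested from the query itself instead of being selected in the portal. The following sketch appends the render operator to the query above. AvgAvailableMB is a column name chosen for this example.

```kusto
Perf
| where TimeGenerated > ago(7d)
| where Computer == "ContosoAzADDS2"
| where CounterName == "Available MBytes"
// One average per 1-hour bin
| summarize AvgAvailableMB = avg(CounterValue) by bin(TimeGenerated, 1h)
// Ask the client to draw the result as a time chart
| render timechart
```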

Next steps