Windows and Linux performance data sources in Azure Monitor

Performance counters in Windows and Linux provide insight into the performance of hardware components, operating systems, and applications. In addition to aggregating performance data for longer-term analysis and reporting, Azure Monitor can collect performance counters at frequent intervals for Near Real Time (NRT) analysis.

[Image: Performance counters]

Configuring performance counters

Configure performance counters from the Data menu in Advanced Settings.

When you first configure Windows or Linux performance counters for a new workspace, you are given the option to quickly create several common counters. They are listed with a checkbox next to each. Ensure that any counters you want to initially create are checked, and then click "Add the selected performance counters".

For Windows performance counters, you can choose a specific instance for each performance counter. For Linux performance counters, the instance of each counter that you choose applies to all child counters of the parent counter. The following table shows the common instances available to both Linux and Windows performance counters.

Instance name    Description
_Total    Total of all the instances
*    All instances
(/|/var)    Matches instances named: / or /var

Windows performance counters

[Image: Configure Windows performance counters]

Follow this procedure to add a new Windows performance counter to collect.

  1. Type the name of the counter in the text box in the format object(instance)\counter. When you start typing, you are presented with a matching list of common counters. You can either select a counter from the list or type in one of your own. You can also return all instances for a particular counter by specifying object\counter.

    When collecting SQL Server performance counters from named instances, all named instance counters start with MSSQL$ and are followed by the name of the instance. For example, to collect the Log Cache Hit Ratio counter for all databases from the Database performance object for named SQL instance INST2, specify MSSQL$INST2:Databases(*)\Log Cache Hit Ratio.

  2. Click + or press Enter to add the counter to the list.

  3. When you add a counter, it uses the default of 10 seconds for its Sample Interval. You can change this to a higher value of up to 1800 seconds (30 minutes) if you want to reduce the storage requirements of the collected performance data.

  4. When you're done adding counters, click the Save button at the top of the screen to save the configuration.
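The object(instance)\counter format described in the steps above can be sketched as a small string helper. This is only an illustration: the function names `counter_path` and `sql_named_instance_object` are hypothetical and are not part of any Azure Monitor tooling.

```python
from typing import Optional


def counter_path(obj: str, counter: str, instance: Optional[str] = None) -> str:
    """Build a counter name in object(instance)\\counter form.

    With instance=None the object\\counter form is produced, which the
    procedure above describes as returning all instances.
    """
    if instance is None:
        return f"{obj}\\{counter}"
    return f"{obj}({instance})\\{counter}"


def sql_named_instance_object(instance_name: str, obj: str) -> str:
    """Prefix a SQL Server object name with MSSQL$<instance>: (hypothetical helper)."""
    return f"MSSQL${instance_name}:{obj}"


if __name__ == "__main__":
    print(counter_path("Processor", "% Processor Time", "_Total"))
    print(counter_path(sql_named_instance_object("INST2", "Databases"),
                       "Log Cache Hit Ratio", "*"))
```

The second call reproduces the named-instance example from step 1.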

Linux performance counters

[Image: Configure Linux performance counters]

Follow this procedure to add a new Linux performance counter to collect.

  1. By default, all configuration changes are automatically pushed to all agents. For Linux agents, a configuration file is sent to the Fluentd data collector. If you wish to modify this file manually on each Linux agent, uncheck the box "Apply below configuration to my Linux machines" and follow the guidance below.
  2. Type the name of the counter in the text box in the format object(instance)\counter. When you start typing, you are presented with a matching list of common counters. You can either select a counter from the list or type in one of your own.
  3. Click + or press Enter to add the counter to the list of other counters for the object.
  4. All counters for an object use the same Sample Interval. The default is 10 seconds. You can change this to a higher value of up to 1800 seconds (30 minutes) if you want to reduce the storage requirements of the collected performance data.
  5. When you're done adding counters, click the Save button at the top of the screen to save the configuration.

Configure Linux performance counters in the configuration file

Instead of configuring Linux performance counters using the Azure portal, you have the option of editing configuration files on the Linux agent. Performance metrics to collect are controlled by the configuration in /etc/opt/microsoft/omsagent/<workspace id>/conf/omsagent.conf.

Each object, or category, of performance metrics to collect should be defined in the configuration file as a single <source> element. The syntax follows the pattern below.

<source>
    type oms_omi  
    object_name "Processor"
    instance_regex ".*"
    counter_name_regex ".*"
    interval 30s
</source>

The parameters in this element are described in the following table.

Parameter    Description
object_name    Object name for the collection.
instance_regex    A regular expression defining which instances to collect. The value .* specifies all instances. To collect processor metrics for only the _Total instance, you could specify _Total. To collect process metrics for only the crond or sshd instances, you could specify (crond|sshd).
counter_name_regex    A regular expression defining which counters (for the object) to collect. To collect all counters for the object, specify .*. To collect only swap space counters for the memory object, for example, you could specify .+Swap.+
interval    Frequency at which the object's counters are collected.
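Before putting a regular expression into the configuration file, it can help to try it against the counter or instance names you expect it to select. The sketch below does this with Python's `re` module, under two assumptions that the text does not confirm: that the agent matches a pattern anywhere in the name (search semantics), and that its regex dialect behaves like Python's for these simple patterns. The helper name `matching` is hypothetical.

```python
import re
from typing import List

# Memory counters as listed in the objects and counters table of this article.
MEMORY_COUNTERS = [
    "% Available Memory",
    "% Available Swap Space",
    "% Used Memory",
    "% Used Swap Space",
    "Available MBytes Memory",
    "Available MBytes Swap",
    "Page Reads/sec",
    "Page Writes/sec",
    "Pages/sec",
    "Used MBytes Swap Space",
    "Used Memory MBytes",
]


def matching(pattern: str, names: List[str]) -> List[str]:
    """Return the names in which the pattern is found (assumed search semantics)."""
    rx = re.compile(pattern)
    return [name for name in names if rx.search(name)]


if __name__ == "__main__":
    # .+Swap.+ needs at least one character on each side of "Swap".
    for name in matching(r".+Swap.+", MEMORY_COUNTERS):
        print(name)
    # Alternation selects either of two instance names.
    print(matching(r"(crond|sshd)", ["crond", "sshd", "nginx"]))
```

Under these assumptions, .+Swap.+ would not select "Available MBytes Swap", because nothing follows "Swap" in that name; a looser pattern such as .*Swap.* would.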

The following table lists the objects and counters that you can specify in the configuration file. Additional counters are available for certain applications, as described in Collect performance counters for Linux applications in Azure Monitor.

Object Name    Counter Name
Logical Disk    % Free Inodes
Logical Disk    % Free Space
Logical Disk    % Used Inodes
Logical Disk    % Used Space
Logical Disk    Disk Read Bytes/sec
Logical Disk    Disk Reads/sec
Logical Disk    Disk Transfers/sec
Logical Disk    Disk Write Bytes/sec
Logical Disk    Disk Writes/sec
Logical Disk    Free Megabytes
Logical Disk    Logical Disk Bytes/sec
Memory    % Available Memory
Memory    % Available Swap Space
Memory    % Used Memory
Memory    % Used Swap Space
Memory    Available MBytes Memory
Memory    Available MBytes Swap
Memory    Page Reads/sec
Memory    Page Writes/sec
Memory    Pages/sec
Memory    Used MBytes Swap Space
Memory    Used Memory MBytes
Network    Total Bytes Transmitted
Network    Total Bytes Received
Network    Total Bytes
Network    Total Packets Transmitted
Network    Total Packets Received
Network    Total Rx Errors
Network    Total Tx Errors
Network    Total Collisions
Physical Disk    Avg. Disk sec/Read
Physical Disk    Avg. Disk sec/Transfer
Physical Disk    Avg. Disk sec/Write
Physical Disk    Physical Disk Bytes/sec
Process    Pct Privileged Time
Process    Pct User Time
Process    Used Memory kBytes
Process    Virtual Shared Memory
Processor    % DPC Time
Processor    % Idle Time
Processor    % Interrupt Time
Processor    % IO Wait Time
Processor    % Nice Time
Processor    % Privileged Time
Processor    % Processor Time
Processor    % User Time
System    Free Physical Memory
System    Free Space in Paging Files
System    Free Virtual Memory
System    Processes
System    Size Stored In Paging Files
System    Uptime
System    Users

The following is the default configuration for performance metrics.

<source>
    type oms_omi
    object_name "Physical Disk"
    instance_regex ".*"
    counter_name_regex ".*"
    interval 5m
</source>

<source>
    type oms_omi
    object_name "Logical Disk"
    instance_regex ".*"
    counter_name_regex ".*"
    interval 5m
</source>

<source>
    type oms_omi
    object_name "Processor"
    instance_regex ".*"
    counter_name_regex ".*"
    interval 30s
</source>

<source>
    type oms_omi
    object_name "Memory"
    instance_regex ".*"
    counter_name_regex ".*"
    interval 30s
</source>
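To review what a configuration like this collects, the <source> elements can be read into dictionaries and their intervals converted to seconds. The following is a minimal sketch, not the agent's own parser: the function names `parse_sources` and `interval_seconds` are hypothetical, the sample text is a shortened copy of the configuration above, and comments or nested elements in a real file are not handled.

```python
import re
from typing import Dict, List

SAMPLE = """
<source>
    type oms_omi
    object_name "Logical Disk"
    instance_regex ".*"
    counter_name_regex ".*"
    interval 5m
</source>

<source>
    type oms_omi
    object_name "Processor"
    instance_regex ".*"
    counter_name_regex ".*"
    interval 30s
</source>
"""


def parse_sources(text: str) -> List[Dict[str, str]]:
    """Return one dict of parameter -> value per <source> element."""
    sources = []
    for body in re.findall(r"<source>(.*?)</source>", text, flags=re.DOTALL):
        params = {}
        for line in body.splitlines():
            line = line.strip()
            if not line:
                continue
            # The first token is the parameter name, the rest is its value.
            key, _, value = line.partition(" ")
            params[key] = value.strip().strip('"')
        sources.append(params)
    return sources


def interval_seconds(value: str) -> int:
    """Convert an interval such as 30s or 5m to seconds (s, m, and h suffixes only)."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(value[:-1]) * units[value[-1]]


if __name__ == "__main__":
    for src in parse_sources(SAMPLE):
        print(src["object_name"], interval_seconds(src["interval"]))
```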

Data collection

Azure Monitor collects all specified performance counters at their specified sample interval on all agents that have that counter installed. The data is not aggregated, and the raw data is available in all log query views for the duration specified by your Log Analytics workspace.

Performance record properties

Performance records have a type of Perf and have the properties in the following table.

Property    Description
Computer    Computer that the event was collected from.
CounterName    Name of the performance counter.
CounterPath    Full path of the counter in the form \\<Computer>\object(instance)\counter.
CounterValue    Numeric value of the counter.
InstanceName    Name of the event instance. Empty if no instance.
ObjectName    Name of the performance object.
SourceSystem    Type of agent the data was collected from. OpsManager: Windows agent, either direct connect or connected through SCOM. Linux: all Linux agents. AzureStorage: Azure Diagnostics.
TimeGenerated    Date and time the data was sampled.
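The CounterPath form \\<Computer>\object(instance)\counter can be split back into its parts with a regular expression. The sketch below is one way to do that in Python; the name `parse_counter_path` is hypothetical, and the pattern assumes that computer and object names contain no backslash and that object names contain no opening parenthesis.

```python
import re
from typing import Dict, Optional

COUNTER_PATH = re.compile(
    r"^\\\\(?P<computer>[^\\]+)"   # \\<Computer>
    r"\\(?P<object>[^\\(]+)"       # \object
    r"(?:\((?P<instance>.*)\))?"   # optional (instance)
    r"\\(?P<counter>.+)$"          # \counter
)


def parse_counter_path(path: str) -> Dict[str, Optional[str]]:
    """Split a counter path into computer, object, instance, and counter."""
    match = COUNTER_PATH.match(path)
    if match is None:
        raise ValueError(f"not a counter path: {path!r}")
    return match.groupdict()


if __name__ == "__main__":
    print(parse_counter_path(r"\\MyComputer\Processor(_Total)\% Processor Time"))
    print(parse_counter_path(r"\\MyComputer\Memory\Available MBytes"))
```

When the path has no (instance) part, the `instance` entry is `None`, which corresponds to an empty InstanceName.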

Sizing estimates

A rough estimate for collection of a particular counter at 10-second intervals is about 1 MB per day per instance. You can estimate the storage requirements of a particular counter with the following formula.

1 MB x (number of counters) x (number of agents) x (number of instances)
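As a worked example of this formula, the sketch below multiplies the three factors by the rough figure of 1 MB per day, which applies to 10-second sampling. The function name `estimate_mb_per_day` and the sample numbers are illustrative assumptions, not measurements.

```python
def estimate_mb_per_day(counters: int, agents: int, instances: int,
                        mb_per_instance_per_day: float = 1.0) -> float:
    """Rough daily storage estimate, in MB, for counters sampled every 10 seconds."""
    return mb_per_instance_per_day * counters * agents * instances


if __name__ == "__main__":
    # Hypothetical workspace: 5 counters on 20 agents, 4 instances of each counter.
    print(estimate_mb_per_day(counters=5, agents=20, instances=4))
```

For these hypothetical numbers the estimate is 1 x 5 x 20 x 4 = 400 MB per day.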

Log queries with Performance records

The following table provides different examples of log queries that retrieve Performance records.

Query    Description
Perf    All Performance data
Perf | where Computer == "MyComputer"    All Performance data from a particular computer
Perf | where CounterName == "Current Disk Queue Length"    All Performance data for a particular counter
Perf | where ObjectName == "Processor" and CounterName == "% Processor Time" and InstanceName == "_Total" | summarize AVGCPU = avg(CounterValue) by Computer    Average CPU utilization across all computers
Perf | where CounterName == "% Processor Time" | summarize AggregatedValue = max(CounterValue) by Computer    Maximum CPU utilization across all computers
Perf | where ObjectName == "LogicalDisk" and CounterName == "Current Disk Queue Length" and Computer == "MyComputerName" | summarize AggregatedValue = avg(CounterValue) by InstanceName    Average Current Disk Queue Length across all the instances of a given computer
Perf | where CounterName == "Disk Transfers/sec" | summarize AggregatedValue = percentile(CounterValue, 95) by Computer    95th percentile of Disk Transfers/sec across all computers
Perf | where CounterName == "% Processor Time" and InstanceName == "_Total" | summarize AggregatedValue = avg(CounterValue) by bin(TimeGenerated, 1h), Computer    Hourly average of CPU usage across all computers
Perf | where Computer == "MyComputer" and CounterName startswith_cs "%" and InstanceName == "_Total" | summarize AggregatedValue = percentile(CounterValue, 70) by bin(TimeGenerated, 1h), CounterName    Hourly 70th percentile of every percentage (%) counter for a particular computer
Perf | where CounterName == "% Processor Time" and InstanceName == "_Total" and Computer == "MyComputer" | summarize ["min(CounterValue)"] = min(CounterValue), ["avg(CounterValue)"] = avg(CounterValue), ["percentile75(CounterValue)"] = percentile(CounterValue, 75), ["max(CounterValue)"] = max(CounterValue) by bin(TimeGenerated, 1h), Computer    Hourly average, minimum, maximum, and 75th-percentile CPU usage for a specific computer
Perf | where ObjectName == "MSSQL$INST2:Databases" and InstanceName == "master"    All Performance data from the Database performance object for the master database from the named SQL Server instance INST2
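Several of the queries above differ only in the counter name, the instance, and the time bin. If you generate such queries from a script, a small template function keeps them consistent. The sketch below only builds the query text and does not run it; the function name `hourly_average_query` is hypothetical, and values are inserted without escaping, so it assumes they contain no double quotes.

```python
def hourly_average_query(counter_name: str, instance_name: str = "_Total",
                         bin_size: str = "1h") -> str:
    """Build a Perf log query that averages a counter per time bin and computer."""
    return (
        f'Perf | where CounterName == "{counter_name}" '
        f'and InstanceName == "{instance_name}" '
        f"| summarize AggregatedValue = avg(CounterValue) "
        f"by bin(TimeGenerated, {bin_size}), Computer"
    )


if __name__ == "__main__":
    print(hourly_average_query("% Processor Time"))
```

With the default arguments, the call produces the same text as the hourly CPU average query in the table.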

Next steps