Azure Monitor autoscaling common metrics

Note

This article has been updated to use the new Azure PowerShell Az module. You can still use the AzureRM module, which will continue to receive bug fixes until at least December 2020. To learn more about the new Az module and AzureRM compatibility, see Introducing the new Azure PowerShell Az module. For Az module installation instructions, see Install Azure PowerShell.

Azure Monitor autoscaling lets you increase or decrease the number of running instances based on telemetry data (metrics). This document describes common metrics that you might want to use. In the Azure portal, you can choose a metric of the resource being scaled as the basis for scaling. You can also scale by any metric from a different resource.

Azure Monitor autoscale applies only to Virtual Machine Scale Sets, Cloud Services, App Service - Web Apps, and API Management services. Other Azure services use different scaling methods.

Compute metrics for Resource Manager-based VMs

By default, Resource Manager-based Virtual Machines and Virtual Machine Scale Sets (VMSS) emit basic (host-level) metrics. In addition, when you configure diagnostics data collection for an Azure VM or VMSS, the Azure diagnostic extension also emits guest OS performance counters (commonly known as "guest OS metrics"). You can use all of these metrics in autoscale rules.

You can use the Get MetricDefinitions API, PowerShell, or the CLI to view the metrics available for your VMSS resource.

If you're using VM scale sets and you don't see a particular metric listed, it is likely disabled in your diagnostics extension.

If a particular metric is not being sampled or transferred at the frequency you want, you can update the diagnostics configuration.

In either case, see Use PowerShell to enable Azure Diagnostics in a virtual machine running Windows to configure and update your Azure VM Diagnostics extension so that it enables the metric. That article also includes a sample diagnostics configuration file.

Host metrics for Resource Manager-based Windows and Linux VMs

Host-level metrics are emitted by default for Azure VMs and VMSS on both Windows and Linux instances. These metrics describe your Azure VM, but they are collected from the Azure VM host rather than by an agent installed on the guest VM. You can use these metrics in autoscale rules.

Guest OS metrics for Resource Manager-based Windows VMs

When you create a VM in Azure, diagnostics is enabled by using the Diagnostics extension. The extension emits a set of metrics taken from inside the VM, which means you can autoscale on metrics that are not emitted by default.

You can generate a list of the metrics by using the following command in PowerShell.

Get-AzMetricDefinition -ResourceId <resource_id> | Format-Table -Property Name,Unit

You can create an alert for the following metrics:

Metric Name | Unit
--- | ---
\Processor(_Total)\% Processor Time | Percent
\Processor(_Total)\% Privileged Time | Percent
\Processor(_Total)\% User Time | Percent
\Processor Information(_Total)\Processor Frequency | Count
\System\Processes | Count
\Process(_Total)\Thread Count | Count
\Process(_Total)\Handle Count | Count
\Memory\% Committed Bytes In Use | Percent
\Memory\Available Bytes | Bytes
\Memory\Committed Bytes | Bytes
\Memory\Commit Limit | Bytes
\Memory\Pool Paged Bytes | Bytes
\Memory\Pool Nonpaged Bytes | Bytes
\PhysicalDisk(_Total)\% Disk Time | Percent
\PhysicalDisk(_Total)\% Disk Read Time | Percent
\PhysicalDisk(_Total)\% Disk Write Time | Percent
\PhysicalDisk(_Total)\Disk Transfers/sec | CountPerSecond
\PhysicalDisk(_Total)\Disk Reads/sec | CountPerSecond
\PhysicalDisk(_Total)\Disk Writes/sec | CountPerSecond
\PhysicalDisk(_Total)\Disk Bytes/sec | BytesPerSecond
\PhysicalDisk(_Total)\Disk Read Bytes/sec | BytesPerSecond
\PhysicalDisk(_Total)\Disk Write Bytes/sec | BytesPerSecond
\PhysicalDisk(_Total)\Avg. Disk Queue Length | Count
\PhysicalDisk(_Total)\Avg. Disk Read Queue Length | Count
\PhysicalDisk(_Total)\Avg. Disk Write Queue Length | Count
\LogicalDisk(_Total)\% Free Space | Percent
\LogicalDisk(_Total)\Free Megabytes | Count
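As a rough illustration of how a guest OS counter can feed an autoscale rule, the Python sketch below assembles a single scale-out rule and serializes it to JSON. The function name, the 70 percent threshold, the time windows, and the VMSS_NAME placeholder are illustrative assumptions, not values from this article; only the counter name is taken from the Windows guest OS metrics. Note that the backslashes in the counter name must be escaped when the rule is written as JSON, which json.dumps handles.

```python
import json


def build_cpu_scale_out_rule(vmss_resource_id: str) -> dict:
    """Return one autoscale rule that scales out on a Windows guest OS counter.

    All numeric values and durations are illustrative assumptions.
    """
    return {
        "metricTrigger": {
            # Windows guest OS counter emitted by the diagnostics extension.
            "metricName": r"\Processor(_Total)\% Processor Time",
            "metricNamespace": "",
            "metricResourceUri": vmss_resource_id,
            "timeGrain": "PT1M",
            "statistic": "Average",
            "timeWindow": "PT5M",
            "timeAggregation": "Average",
            "operator": "GreaterThan",
            "threshold": 70,  # assumed threshold, in percent
        },
        "scaleAction": {
            "direction": "Increase",
            "type": "ChangeCount",
            "value": "1",
            "cooldown": "PT5M",
        },
    }


# Placeholder resource ID; substitute your own subscription, group, and scale set.
rule = build_cpu_scale_out_rule(
    "/subscriptions/SUBSCRIPTION_ID/resourceGroups/RES_GROUP_NAME"
    "/providers/Microsoft.Compute/virtualMachineScaleSets/VMSS_NAME"
)
rule_json = json.dumps(rule, indent=2)
print(rule_json)
```

The printed JSON is only a fragment; in a Resource Manager template it would sit inside the rules array of an autoscale profile.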

Guest OS metrics for Linux VMs

When you create a VM in Azure, diagnostics is enabled by default by using the Diagnostics extension.

You can generate a list of the metrics by using the following command in PowerShell.

Get-AzMetricDefinition -ResourceId <resource_id> | Format-Table -Property Name,Unit

You can create an alert for the following metrics:

Metric Name | Unit
--- | ---
\Memory\AvailableMemory | Bytes
\Memory\PercentAvailableMemory | Percent
\Memory\UsedMemory | Bytes
\Memory\PercentUsedMemory | Percent
\Memory\PercentUsedByCache | Percent
\Memory\PagesPerSec | CountPerSecond
\Memory\PagesReadPerSec | CountPerSecond
\Memory\PagesWrittenPerSec | CountPerSecond
\Memory\AvailableSwap | Bytes
\Memory\PercentAvailableSwap | Percent
\Memory\UsedSwap | Bytes
\Memory\PercentUsedSwap | Percent
\Processor\PercentIdleTime | Percent
\Processor\PercentUserTime | Percent
\Processor\PercentNiceTime | Percent
\Processor\PercentPrivilegedTime | Percent
\Processor\PercentInterruptTime | Percent
\Processor\PercentDPCTime | Percent
\Processor\PercentProcessorTime | Percent
\Processor\PercentIOWaitTime | Percent
\PhysicalDisk\BytesPerSecond | BytesPerSecond
\PhysicalDisk\ReadBytesPerSecond | BytesPerSecond
\PhysicalDisk\WriteBytesPerSecond | BytesPerSecond
\PhysicalDisk\TransfersPerSecond | CountPerSecond
\PhysicalDisk\ReadsPerSecond | CountPerSecond
\PhysicalDisk\WritesPerSecond | CountPerSecond
\PhysicalDisk\AverageReadTime | Seconds
\PhysicalDisk\AverageWriteTime | Seconds
\PhysicalDisk\AverageTransferTime | Seconds
\PhysicalDisk\AverageDiskQueueLength | Count
\NetworkInterface\BytesTransmitted | Bytes
\NetworkInterface\BytesReceived | Bytes
\NetworkInterface\PacketsTransmitted | Count
\NetworkInterface\PacketsReceived | Count
\NetworkInterface\BytesTotal | Bytes
\NetworkInterface\TotalRxErrors | Count
\NetworkInterface\TotalTxErrors | Count
\NetworkInterface\TotalCollisions | Count

Commonly used App Service (Server Farm) metrics

You can also autoscale based on common web server metrics such as the HTTP queue length. Its metric name is HttpQueueLength. The following section lists the available server farm (App Service) metrics.

Web Apps metrics

You can generate a list of the Web Apps metrics by using the following command in PowerShell.

Get-AzMetricDefinition -ResourceId <resource_id> | Format-Table -Property Name,Unit

You can alert on or scale by these metrics.

Metric Name | Unit
--- | ---
CpuPercentage | Percent
MemoryPercentage | Percent
DiskQueueLength | Count
HttpQueueLength | Count
BytesReceived | Bytes
BytesSent | Bytes

Commonly used Storage metrics

You can scale by Storage queue length, which is the number of messages in the storage queue. Storage queue length is a special metric: the threshold is the number of messages per instance. For example, if there are two instances and the threshold is set to 100, scaling occurs when the total number of messages in the queue reaches 200. That can be 100 messages per instance, 120 and 80, or any other combination that adds up to 200 or more.
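The per-instance arithmetic can be sketched in a few lines of Python. The function name is hypothetical, and the comparison assumes a rule that fires once the total reaches the per-instance threshold multiplied by the instance count, as in the example above; the exact operator depends on how the rule is configured.

```python
def queue_scale_out_needed(total_messages: int,
                           instance_count: int,
                           per_instance_threshold: int) -> bool:
    """Return True when the queue holds enough messages to trigger scaling.

    The threshold is expressed per instance, so the effective trigger point
    is per_instance_threshold * instance_count. Using ">=" here is an
    assumption that mirrors "adds up to 200 or more".
    """
    if instance_count <= 0:
        raise ValueError("instance_count must be positive")
    return total_messages >= per_instance_threshold * instance_count


# Two instances with a threshold of 100 messages per instance.
print(queue_scale_out_needed(200, 2, 100))       # 100 + 100 messages
print(queue_scale_out_needed(120 + 80, 2, 100))  # uneven split, same total
print(queue_scale_out_needed(199, 2, 100))       # one message short
```

How the messages are split between instances does not matter; only the total relative to the instance count does.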

Configure this setting in the Azure portal in the Settings blade. For VM scale sets, you can update the autoscale setting in the Resource Manager template to use ApproximateMessageCount as the metricName and pass the ID of the storage queue as the metricResourceUri.

For example, with a Classic Storage Account, the autoscale setting metricTrigger would include:

"metricName": "ApproximateMessageCount",
"metricNamespace": "",
"metricResourceUri": "/subscriptions/SUBSCRIPTION_ID/resourceGroups/RES_GROUP_NAME/providers/Microsoft.ClassicStorage/storageAccounts/STORAGE_ACCOUNT_NAME/services/queue/queues/QUEUE_NAME"

For a (non-classic) storage account, the metricTrigger would include:

"metricName": "ApproximateMessageCount",
"metricNamespace": "",
"metricResourceUri": "/subscriptions/SUBSCRIPTION_ID/resourceGroups/RES_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME/services/queue/queues/QUEUE_NAME"

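The two storage queue fragments differ only in the resource provider segment of the URI. As a minimal sketch, assuming hypothetical helper names and a classic flag that are not part of any Azure SDK, the fields could be generated like this:

```python
def storage_queue_resource_uri(subscription_id: str,
                               resource_group: str,
                               account_name: str,
                               queue_name: str,
                               classic: bool = False) -> str:
    """Build the metricResourceUri for a storage queue.

    Classic storage accounts use Microsoft.ClassicStorage as the provider;
    other storage accounts use Microsoft.Storage.
    """
    provider = "Microsoft.ClassicStorage" if classic else "Microsoft.Storage"
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/{provider}/storageAccounts/{account_name}"
        f"/services/queue/queues/{queue_name}"
    )


def storage_queue_metric_trigger(resource_uri: str) -> dict:
    """Return the three metricTrigger fields shown in the fragments above."""
    return {
        "metricName": "ApproximateMessageCount",
        "metricNamespace": "",
        "metricResourceUri": resource_uri,
    }


uri = storage_queue_resource_uri(
    "SUBSCRIPTION_ID", "RES_GROUP_NAME", "STORAGE_ACCOUNT_NAME", "QUEUE_NAME"
)
print(storage_queue_metric_trigger(uri))
```

A complete metricTrigger needs additional fields (time grain, operator, threshold, and so on) that are omitted here.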
Commonly used Service Bus metrics

You can scale by Service Bus queue length, which is the number of messages in the Service Bus queue. Service Bus queue length is a special metric: the threshold is the number of messages per instance. For example, if there are two instances and the threshold is set to 100, scaling occurs when the total number of messages in the queue reaches 200. That can be 100 messages per instance, 120 and 80, or any other combination that adds up to 200 or more.

For VM scale sets, you can update the autoscale setting in the Resource Manager template to use ApproximateMessageCount as the metricName and pass the ID of the Service Bus queue as the metricResourceUri.

"metricName": "ApproximateMessageCount",
 "metricNamespace": "",
"metricResourceUri": "/subscriptions/SUBSCRIPTION_ID/resourceGroups/RES_GROUP_NAME/providers/Microsoft.ServiceBus/namespaces/SB_NAMESPACE/queues/QUEUE_NAME"

Note

For Service Bus, the resource group concept does not exist, but Azure Resource Manager creates a default resource group per region. The resource group is usually in the 'Default-ServiceBus-[region]' format, for example 'Default-ServiceBus-Chinanorth' or 'Default-ServiceBus-Chinaeast'.
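A short Python sketch of how the default resource group name could be combined with the Service Bus queue URI follows. Both helper names are hypothetical, and because the note says the name is only usually in this format, the result should be checked against the actual resource group in your subscription.

```python
def default_service_bus_resource_group(region: str) -> str:
    """Return the usual default resource group name for a region.

    Follows the 'Default-ServiceBus-[region]' pattern; the real name may
    differ, so treat the result as a starting point.
    """
    return f"Default-ServiceBus-{region}"


def service_bus_queue_resource_uri(subscription_id: str,
                                   resource_group: str,
                                   namespace: str,
                                   queue_name: str) -> str:
    """Build the metricResourceUri for a Service Bus queue."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.ServiceBus/namespaces/{namespace}"
        f"/queues/{queue_name}"
    )


group = default_service_bus_resource_group("Chinanorth")
print(service_bus_queue_resource_uri(
    "SUBSCRIPTION_ID", group, "SB_NAMESPACE", "QUEUE_NAME"
))
```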