Disk performance metrics

Azure offers metrics in the Azure portal that provide insight into how your virtual machines (VMs) and disks perform. You can also retrieve these metrics through an API call. This article is broken into three subsections:

  • Disk IO, throughput, and queue depth metrics - These metrics let you see storage performance from the perspective of a disk and a virtual machine.
  • Disk bursting metrics - These metrics provide observability into the bursting feature on premium disks.
  • Storage IO utilization metrics - These metrics help diagnose bottlenecks in your storage performance with disks.

All metrics are emitted every minute, except for the bursting credit percentage metrics, which are emitted every 5 minutes.
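When you chart or query these metrics, it helps to request a granularity that matches how often each one is emitted. The following is a minimal sketch, assuming the metric display names used in this article; the helper name and the substring check are illustrative and are not part of any Azure SDK.

```python
from datetime import timedelta

# Assumption: burst credit metrics are recognized by this substring of the
# display names listed in this article (for example,
# "Data Disk Used Burst IO Credits Percentage").
BURST_CREDIT_MARKER = "Credits Percentage"


def emission_interval(metric_name: str) -> timedelta:
    """Return the emission interval the article states for a metric."""
    if BURST_CREDIT_MARKER in metric_name:
        return timedelta(minutes=5)
    return timedelta(minutes=1)


print(emission_interval("OS Disk Queue Depth"))                         # 0:01:00
print(emission_interval("Data Disk Used Burst IO Credits Percentage"))  # 0:05:00
```

Querying a 5-minute metric at 1-minute granularity would likely return gaps, so a lookup like this can be used to pick the interval before issuing a query.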

Disk IO, throughput, and queue depth metrics

The following metrics are available to get insight into VM and disk IO, throughput, and queue depth performance:

  • OS Disk Queue Depth: The number of current outstanding IO requests that are waiting to be read from or written to the OS disk.
  • OS Disk Read Bytes/Sec: The number of bytes read per second from the OS disk.
  • OS Disk Read Operations/Sec: The number of input (read) operations per second on the OS disk.
  • OS Disk Write Bytes/Sec: The number of bytes written per second to the OS disk.
  • OS Disk Write Operations/Sec: The number of output (write) operations per second on the OS disk.
  • Data Disk Queue Depth: The number of current outstanding IO requests that are waiting to be read from or written to the data disk(s).
  • Data Disk Read Bytes/Sec: The number of bytes read per second from the data disk(s).
  • Data Disk Read Operations/Sec: The number of input (read) operations per second on the data disk(s).
  • Data Disk Write Bytes/Sec: The number of bytes written per second to the data disk(s).
  • Data Disk Write Operations/Sec: The number of output (write) operations per second on the data disk(s).
  • Disk Read Bytes/Sec: The total number of bytes read per second from all disks attached to a VM.
  • Disk Read Operations/Sec: The number of input (read) operations per second on all disks attached to a VM.
  • Disk Write Bytes/Sec: The total number of bytes written per second to all disks attached to a VM.
  • Disk Write Operations/Sec: The number of output (write) operations per second on all disks attached to a VM.
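The bytes-per-second and operations-per-second metrics can be combined to estimate the average IO size, which helps tell a throughput-bound workload from an IOPS-bound one. The sketch below is illustrative only; the function name and the sample values are hypothetical and do not come from a real VM.

```python
def average_io_size_bytes(bytes_per_sec: float, ops_per_sec: float) -> float:
    """Average bytes per operation; returns 0.0 when no operations occurred."""
    if ops_per_sec <= 0:
        return 0.0
    return bytes_per_sec / ops_per_sec


# Hypothetical samples of "Data Disk Read Bytes/Sec" and
# "Data Disk Read Operations/Sec" taken over the same minute.
read_bytes_per_sec = 64 * 1024 * 1024  # 64 MiB/s
read_ops_per_sec = 8192

print(average_io_size_bytes(read_bytes_per_sec, read_ops_per_sec))  # 8192.0
```

Both inputs should come from the same disk scope (OS disk, data disks, or all disks) and the same time window for the ratio to be meaningful.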

Bursting metrics

The following metrics provide observability into the bursting feature on premium disks:

  • Data Disk Max Burst Bandwidth: The throughput limit that the data disk(s) can burst up to.
  • OS Disk Max Burst Bandwidth: The throughput limit that the OS disk can burst up to.
  • Data Disk Max Burst IOPS: The IOPS limit that the data disk(s) can burst up to.
  • OS Disk Max Burst IOPS: The IOPS limit that the OS disk can burst up to.
  • Data Disk Target Bandwidth: The throughput limit that the data disk(s) can achieve without bursting.
  • OS Disk Target Bandwidth: The throughput limit that the OS disk can achieve without bursting.
  • Data Disk Target IOPS: The IOPS limit that the data disk(s) can achieve without bursting.
  • OS Disk Target IOPS: The IOPS limit that the OS disk can achieve without bursting.
  • Data Disk Used Burst BPS Credits Percentage: The accumulated percentage of the throughput burst used for the data disk(s). Emitted on a 5-minute interval.
  • OS Disk Used Burst BPS Credits Percentage: The accumulated percentage of the throughput burst used for the OS disk. Emitted on a 5-minute interval.
  • Data Disk Used Burst IO Credits Percentage: The accumulated percentage of the IOPS burst used for the data disk(s). Emitted on a 5-minute interval.
  • OS Disk Used Burst IO Credits Percentage: The accumulated percentage of the IOPS burst used for the OS disk. Emitted on a 5-minute interval.
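One way to read the target, max burst, and used-credits metrics together is sketched below. It assumes that the remaining burst credit is simply 100% minus the used percentage, which is an interpretation of the definitions above rather than a documented formula. The class name, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BurstSnapshot:
    """One sample of a disk's IOPS bursting metrics (hypothetical container)."""
    target_iops: float                 # e.g. "Data Disk Target IOPS"
    max_burst_iops: float              # e.g. "Data Disk Max Burst IOPS"
    used_burst_io_credits_pct: float   # e.g. "Data Disk Used Burst IO Credits Percentage"

    def burst_headroom_iops(self) -> float:
        """Extra IOPS available above the non-bursting target."""
        return max(self.max_burst_iops - self.target_iops, 0.0)

    def remaining_credits_pct(self) -> float:
        """Assumed remaining burst credit, clamped to the range 0-100."""
        return min(max(100.0 - self.used_burst_io_credits_pct, 0.0), 100.0)


# Hypothetical sample values, not taken from a real disk.
snapshot = BurstSnapshot(target_iops=2300, max_burst_iops=3500,
                         used_burst_io_credits_pct=40.0)
print(snapshot.burst_headroom_iops())    # 1200
print(snapshot.remaining_credits_pct())  # 60.0
```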

Storage IO utilization metrics

The following metrics help diagnose bottlenecks in your virtual machine and disk combination. These metrics are only available when using a premium-enabled VM. They are available for all disk types except Ultra.

Metrics that help diagnose disk IO capping:

  • Data Disk IOPS Consumed Percentage: The percentage calculated as the data disk IOPS completed over the provisioned data disk IOPS. If this value is at 100%, your running application is IO capped by your data disk's IOPS limit.
  • Data Disk Bandwidth Consumed Percentage: The percentage calculated as the data disk throughput completed over the provisioned data disk throughput. If this value is at 100%, your running application is IO capped by your data disk's bandwidth limit.
  • OS Disk IOPS Consumed Percentage: The percentage calculated as the OS disk IOPS completed over the provisioned OS disk IOPS. If this value is at 100%, your running application is IO capped by your OS disk's IOPS limit.
  • OS Disk Bandwidth Consumed Percentage: The percentage calculated as the OS disk throughput completed over the provisioned OS disk throughput. If this value is at 100%, your running application is IO capped by your OS disk's bandwidth limit.

Metrics that help diagnose VM IO capping:

  • VM Cached IOPS Consumed Percentage: The percentage calculated as the total IOPS completed over the max cached virtual machine IOPS limit. If this value is at 100%, your running application is IO capped by your VM's cached IOPS limit.
  • VM Cached Bandwidth Consumed Percentage: The percentage calculated as the total disk throughput completed over the max cached virtual machine throughput. If this value is at 100%, your running application is IO capped by your VM's cached bandwidth limit.
  • VM Uncached IOPS Consumed Percentage: The percentage calculated as the total IOPS completed on a virtual machine over the max uncached virtual machine IOPS limit. If this value is at 100%, your running application is IO capped by your VM's uncached IOPS limit.
  • VM Uncached Bandwidth Consumed Percentage: The percentage calculated as the total disk throughput completed on a virtual machine over the max provisioned virtual machine throughput. If this value is at 100%, your running application is IO capped by your VM's uncached bandwidth limit.
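All of the consumed percentage metrics above follow the same pattern: completed IO divided by the applicable limit. The sketch below shows that calculation and the 100% capping check. The function names and sample values are illustrative, and the `limit` argument stands for whichever provisioned disk or VM limit applies.

```python
def consumed_percentage(completed: float, limit: float) -> float:
    """Completed IOPS (or throughput) as a percentage of the applicable limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return 100.0 * completed / limit


def is_io_capped(percentage: float, threshold: float = 100.0) -> bool:
    """True when the consumed percentage has reached the capping threshold."""
    return percentage >= threshold


# Hypothetical samples: a disk at 4,500 of 5,000 provisioned IOPS, and a VM
# using all 12,800 of its uncached IOPS.
disk_pct = consumed_percentage(4_500, 5_000)
vm_pct = consumed_percentage(12_800, 12_800)
print(disk_pct, is_io_capped(disk_pct))  # 90.0 False
print(vm_pct, is_io_capped(vm_pct))      # 100.0 True
```

In practice you might alert slightly below 100% (for example, by passing a lower `threshold`), since reported values are averages over the emission interval.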

Storage IO metrics example

Let's run through an example of how to use these new storage IO utilization metrics to debug where a bottleneck is in our system. The system setup is the same as the previous example, except this time the attached OS disk is not cached.

Setup:

  • Standard_D8s_v3
    • Cached IOPS: 16,000
    • Uncached IOPS: 12,800
  • P30 OS disk
    • IOPS: 5,000
    • Host caching: Disabled
  • P30 data disks × 2
    • IOPS: 5,000
    • Host caching: Read/write
  • P30 data disks × 2
    • IOPS: 5,000
    • Host caching: Disabled
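As a quick sanity check on this setup, the provisioned disk IOPS can be summed per caching path and compared with the VM limits. The sketch below is a simplified model that ignores cache hits. It reads each "× 2" entry as two disks, and the disk names and the `Disk` type are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    name: str            # hypothetical label, not an Azure resource name
    iops: int            # provisioned IOPS
    host_caching: bool   # True when host caching is enabled


# VM limits from the setup above (Standard_D8s_v3).
VM_CACHED_IOPS = 16_000
VM_UNCACHED_IOPS = 12_800

disks = [
    Disk("os", 5_000, host_caching=False),
    Disk("data-cached-0", 5_000, host_caching=True),
    Disk("data-cached-1", 5_000, host_caching=True),
    Disk("data-uncached-0", 5_000, host_caching=False),
    Disk("data-uncached-1", 5_000, host_caching=False),
]


def provisioned_iops(all_disks: list[Disk], cached: bool) -> int:
    """Sum provisioned IOPS of the disks on one caching path."""
    return sum(d.iops for d in all_disks if d.host_caching is cached)


cached_total = provisioned_iops(disks, cached=True)
uncached_total = provisioned_iops(disks, cached=False)
print(cached_total, cached_total > VM_CACHED_IOPS)        # 10000 False
print(uncached_total, uncached_total > VM_UNCACHED_IOPS)  # 15000 True
```

Under these assumptions, the uncached disks are provisioned for more IOPS in total than the VM's uncached limit allows, so the uncached path is the first place to look for a cap.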

Let's run a benchmarking test on this virtual machine and disk combination that creates IO activity. To learn how to benchmark storage IO on Azure, see Benchmark your application on Azure Disk Storage. From the benchmarking tool, you can see that the VM and disk combination can achieve 22,800 IOPS:

Screenshot of fio output with r=22.8k highlighted.

The Standard_D8s_v3 can achieve a total of 28,800 IOPS (16,000 cached plus 12,800 uncached). Using the metrics, let's investigate what's going on and identify our storage IO bottleneck. On the left pane, select Metrics:

Screenshot of the left pane with Metrics highlighted.

Let's first take a look at our VM Cached IOPS Consumed Percentage metric:

Screenshot showing VM Cached IOPS Consumed Percentage.

This metric tells us that 61% of the 16,000 IOPS allotted to cached IOPS on the VM is being used. Because the value isn't at 100%, the storage IO bottleneck isn't with the cached disks. Now let's look at our VM Uncached IOPS Consumed Percentage metric:

Screenshot showing VM Uncached IOPS Consumed Percentage.

This metric is at 100%. It tells us that all of the 12,800 IOPS allotted to uncached IOPS on the VM are being used. One way to remediate this issue is to change the VM to a larger size that can handle the additional IO. But before we do that, let's look at the attached disks to find out how many IOPS they are seeing. Check the OS disk by looking at the OS Disk IOPS Consumed Percentage:

Screenshot showing OS Disk IOPS Consumed Percentage.

This metric tells us that around 90% of the 5,000 IOPS provisioned for this P30 OS disk is being used. This percentage means there's no bottleneck at the OS disk. Now let's check the data disks that are attached to the VM by looking at the Data Disk IOPS Consumed Percentage:

Screenshot showing Data Disk IOPS Consumed Percentage.

This metric tells us that the average IOPS consumed percentage across all the attached disks is around 42%. This percentage is calculated based on the IOPS that are used by the disks and aren't being served from the host cache. Let's drill deeper into this metric by applying splitting and splitting by the LUN value:

Screenshot showing Data Disk IOPS Consumed Percentage with splitting applied.

This metric tells us that the data disks attached on LUN 2 and LUN 3 are using around 85% of their provisioned IOPS. Here is a diagram of what the IO looks like from the VM and disks architecture:

Diagram of the storage IO metrics example.
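The percentages read off in this walkthrough can be converted back into approximate IOPS to check that they are consistent with each other. The sketch below does that arithmetic. It assumes the two cached data disks report roughly 0% in the split view, which the walkthrough does not state explicitly. Because the article gives its percentages as approximate, the results are estimates.

```python
def iops_from_percentage(percentage: float, limit: float) -> float:
    """Convert a consumed percentage back into IOPS against a limit."""
    return percentage * limit / 100


# VM-level metrics from the walkthrough.
cached_vm_iops = iops_from_percentage(61, 16_000)     # cached path
uncached_vm_iops = iops_from_percentage(100, 12_800)  # uncached path
print(cached_vm_iops, uncached_vm_iops)   # 9760.0 12800.0
print(cached_vm_iops + uncached_vm_iops)  # 22560.0, close to the measured 22.8k

# Disk-level metrics for the uncached disks.
os_disk_iops = iops_from_percentage(90, 5_000)
uncached_data_iops = 2 * iops_from_percentage(85, 5_000)
print(os_disk_iops + uncached_data_iops)  # 13000.0, roughly the 12,800 VM cap

# Average across four data disks, assuming the cached pair reports about 0%.
per_disk_pct = [85, 85, 0, 0]
print(sum(per_disk_pct) / len(per_disk_pct))  # 42.5
```

Under these assumptions, the uncached disks together ask for slightly more than the VM's 12,800 uncached IOPS, which matches the 100% reading on VM Uncached IOPS Consumed Percentage, and the 42.5% average lines up with the roughly 42% shown before splitting by LUN.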

Next steps