Benchmarking a disk

Benchmarking is the process of simulating different workloads on your application and measuring the application performance for each workload. Follow the steps described in the Designing for high performance article. By running benchmarking tools on the VMs hosting the application, you can determine the performance levels that your application can achieve with Premium Storage. This article provides examples of benchmarking a Standard DS14 VM provisioned with Azure Premium Storage disks.

We used the common benchmarking tools Iometer and FIO, for Windows and Linux respectively. These tools spawn multiple threads that simulate a production-like workload and measure the system performance. With these tools you can also configure parameters such as block size and queue depth, which you normally cannot change for an application. This gives you more flexibility to drive the maximum performance on a high-scale VM provisioned with premium disks for different types of application workloads. To learn more about each benchmarking tool, visit Iometer and FIO.

To follow the examples below, create a Standard DS14 VM and attach 11 Premium Storage disks to the VM. Of the 11 disks, configure 10 disks with host caching set to "None" and stripe them into a volume called NoCacheWrites. Configure host caching as "ReadOnly" on the remaining disk and create a volume called CacheReads with this disk. With this setup, you can see the maximum read and write performance of a Standard DS14 VM. For detailed steps about creating a DS14 VM with premium disks, go to Designing for high performance.

Warming up the cache
A disk with ReadOnly host caching can deliver higher IOPS than the disk limit. To get this maximum read performance from the host cache, you must first warm up the cache of this disk. This ensures that the read IOs that the benchmarking tool drives on the CacheReads volume actually hit the cache rather than going to the disk directly. The cache hits result in additional IOPS from the single cache-enabled disk.

Important

You must warm up the cache before running benchmarks every time the VM is rebooted.

Tools

Iometer

Download the Iometer tool on the VM.

Test file

Iometer uses a test file that is stored on the volume on which you run the benchmarking test. It drives reads and writes on this test file to measure the disk IOPS and throughput. Iometer creates this test file if you have not provided one. Create a 200 GB test file called iobw.tst on each of the CacheReads and NoCacheWrites volumes.

Access specifications

The specifications (request IO size, % read/write, and % random/sequential) are configured on the "Access Specifications" tab in Iometer. Create an access specification for each of the scenarios described below, and "Save" each one with an appropriate name such as RandomWrites_8K or RandomReads_8K. Select the corresponding specification when running the test scenario.

An example of access specifications for the maximum write IOPS scenario is shown below.

[Screenshot: example of access specifications for the maximum write IOPS scenario]

Maximum IOPS test specifications

To demonstrate maximum IOPS, use a smaller request size. Use an 8 KB request size and create specifications for random writes and random reads.

| Access specification | Request size | Random % | Read % |
|---|---|---|---|
| RandomWrites_8K | 8 KB | 100 | 0 |
| RandomReads_8K | 8 KB | 100 | 100 |

Maximum throughput test specifications

To demonstrate maximum throughput, use a larger request size. Use a 64 KB request size and create specifications for random writes and random reads.

| Access specification | Request size | Random % | Read % |
|---|---|---|---|
| RandomWrites_64K | 64 KB | 100 | 0 |
| RandomReads_64K | 64 KB | 100 | 100 |
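The two sets of specifications differ because throughput is roughly IOPS multiplied by the request size: small requests let a workload reach the IOPS limit first, while large requests reach the throughput limit first. The Python sketch below works through that arithmetic. The function names and the sample figures are illustrative assumptions rather than measured results, and 1 MB is taken as 1,024 KB.

```python
def throughput_mb_per_sec(iops: float, io_size_kb: float) -> float:
    """Approximate throughput in MB/sec for a given IOPS rate and IO size in KB."""
    return iops * io_size_kb / 1024


def iops_needed(target_mb_per_sec: float, io_size_kb: float) -> float:
    """IOPS required to sustain a target throughput at a given IO size in KB."""
    return target_mb_per_sec * 1024 / io_size_kb


if __name__ == "__main__":
    # Illustrative inputs only: 50,000 IOPS at 8 KB per request.
    print(throughput_mb_per_sec(50_000, 8))   # 390.625
    # Illustrative inputs only: 512 MB/sec at 64 KB per request.
    print(iops_needed(512, 64))               # 8192.0
```

With 8 KB requests, even a very high IOPS rate yields moderate throughput; with 64 KB requests, a much lower IOPS rate is enough to saturate a throughput limit.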

Run the Iometer test

Perform the steps below to warm up the cache:

  1. Create two access specifications with the values shown below.

    | Name | Request size | Random % | Read % |
    |---|---|---|---|
    | RandomWrites_1MB | 1 MB | 100 | 0 |
    | RandomReads_1MB | 1 MB | 100 | 100 |
  2. Run the Iometer test to initialize the cache disk with the following parameters. Use three worker threads for the target volume and a queue depth of 128. Set the "Run time" duration of the test to 2 hours on the "Test Setup" tab.

    | Scenario | Target volume | Name | Duration |
    |---|---|---|---|
    | Initialize cache disk | CacheReads | RandomWrites_1MB | 2 hours |
  3. Run the Iometer test to warm up the cache disk with the following parameters. Use three worker threads for the target volume and a queue depth of 128. Set the "Run time" duration of the test to 2 hours on the "Test Setup" tab.

    | Scenario | Target volume | Name | Duration |
    |---|---|---|---|
    | Warm up cache disk | CacheReads | RandomReads_1MB | 2 hours |

After the cache disk is warmed up, proceed with the test scenarios listed below. To run the Iometer test, use at least three worker threads for each target volume. For each worker thread, select the target volume, set the queue depth, and select one of the saved test specifications, as shown in the table below, to run the corresponding test scenario. The table also shows the expected IOPS and throughput results for these tests. All scenarios use a high queue depth of 128; the IOPS scenarios use a small IO size of 8 KB, and the throughput scenarios use 64 KB.

| Test scenario | Target volume | Name | Result |
|---|---|---|---|
| Max. read IOPS | CacheReads | RandomReads_8K | 64,000 IOPS |
| Max. write IOPS | NoCacheWrites | RandomWrites_8K | 50,000 IOPS |
| Max. combined IOPS | CacheReads | RandomReads_8K | 100,000 IOPS |
| | NoCacheWrites | RandomWrites_8K | |
| Max. read MB/sec | CacheReads | RandomReads_64K | 524 MB/sec |
| Max. write MB/sec | NoCacheWrites | RandomWrites_64K | 524 MB/sec |
| Combined MB/sec | CacheReads | RandomReads_64K | 1000 MB/sec |
| | NoCacheWrites | RandomWrites_64K | |

Below are screenshots of the Iometer test results for the combined IOPS and throughput scenarios.

Combined reads and writes maximum IOPS

[Screenshot: combined reads and writes maximum IOPS]

Combined reads and writes maximum throughput

[Screenshot: combined reads and writes maximum throughput]

FIO

FIO is a popular tool for benchmarking storage on Linux VMs. It lets you select different IO sizes and sequential or random reads and writes. It spawns worker threads or processes to perform the specified I/O operations. You specify the type of I/O operations each worker thread must perform using job files. We created one job file per scenario illustrated in the examples below. You can change the specifications in these job files to benchmark different workloads running on Premium Storage. In the examples, we use a Standard DS14 VM running Ubuntu. Use the same setup described at the beginning of the Benchmarking section and warm up the cache before running the benchmarking tests.

Before you begin, download FIO and install it on your virtual machine.

Run the following command for Ubuntu:

sudo apt-get install fio

We use four worker threads to drive write operations and four worker threads to drive read operations on the disks. The write workers drive traffic on the "nocache" volume, which has 10 disks with cache set to "None". The read workers drive traffic on the "readcache" volume, which has one disk with cache set to "ReadOnly".

Maximum write IOPS

Create a job file with the following specifications to get maximum write IOPS. Name it "fiowrite.ini".

[global]
size=30g
direct=1
iodepth=256
ioengine=libaio
bs=8k

[writer1]
rw=randwrite
directory=/mnt/nocache
[writer2]
rw=randwrite
directory=/mnt/nocache
[writer3]
rw=randwrite
directory=/mnt/nocache
[writer4]
rw=randwrite
directory=/mnt/nocache

Note the following key settings, which are in line with the design guidelines discussed in previous sections. These specifications are essential to drive maximum IOPS:

  • A high queue depth of 256.
  • A small block size of 8 KB.
  • Multiple threads performing random writes.

Run the following command to kick off the FIO test for 30 seconds:

sudo fio --runtime 30 fiowrite.ini

While the test runs, you can see the number of write IOPS the VM and Premium disks are delivering. As shown in the sample below, the DS14 VM is delivering its maximum write IOPS limit of 50,000 IOPS.
[Screenshot: write IOPS delivered by the VM and Premium disks]
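Because each job file defines several workers, the total IOPS is the sum across all jobs. If you prefer to compute that total from a file rather than read it off the console, the Python sketch below shows one way to do it. It assumes fio was run with its `--output-format=json` option, and that each entry under `jobs` carries `read` and `write` objects with an `iops` field; verify the field names against the output of your fio version. The function name and the sample data are made up for illustration and are not measured results.

```python
import json


def total_iops(fio_result: dict) -> dict:
    """Sum read and write IOPS across every job in a parsed fio JSON report."""
    totals = {"read": 0.0, "write": 0.0}
    for job in fio_result.get("jobs", []):
        for direction in totals:
            # Missing sections or fields count as zero IOPS.
            totals[direction] += job.get(direction, {}).get("iops", 0.0)
    return totals


# Made-up sample with two write-only jobs, shaped like fio's JSON output.
SAMPLE = json.loads("""
{
  "jobs": [
    {"jobname": "writer1", "read": {"iops": 0.0}, "write": {"iops": 100.5}},
    {"jobname": "writer2", "read": {"iops": 0.0}, "write": {"iops": 200.0}}
  ]
}
""")

if __name__ == "__main__":
    print(total_iops(SAMPLE))  # {'read': 0.0, 'write': 300.5}
```

To produce a report to feed into this sketch, a command along the lines of `sudo fio --runtime 30 --output-format=json --output=fiowrite.json fiowrite.ini` could be used, followed by `json.load` on the resulting file; the output file name here is an arbitrary choice.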

Maximum read IOPS

Create a job file with the following specifications to get maximum read IOPS. Name it "fioread.ini".

[global]
size=30g
direct=1
iodepth=256
ioengine=libaio
bs=8k

[reader1]
rw=randread
directory=/mnt/readcache
[reader2]
rw=randread
directory=/mnt/readcache
[reader3]
rw=randread
directory=/mnt/readcache
[reader4]
rw=randread
directory=/mnt/readcache

Note the following key settings, which are in line with the design guidelines discussed in previous sections. These specifications are essential to drive maximum IOPS:

  • A high queue depth of 256.
  • A small block size of 8 KB.
  • Multiple threads performing random reads.

Run the following command to kick off the FIO test for 30 seconds:

sudo fio --runtime 30 fioread.ini

While the test runs, you can see the number of read IOPS the VM and Premium disks are delivering. In our run, the DS14 VM delivered more than 64,000 read IOPS. This is a combination of the disk and the cache performance.

Maximum read and write IOPS

Create a job file with the following specifications to get maximum combined read and write IOPS. Name it "fioreadwrite.ini".

[global]
size=30g
direct=1
iodepth=128
ioengine=libaio
bs=4k

[reader1]
rw=randread
directory=/mnt/readcache
[reader2]
rw=randread
directory=/mnt/readcache
[reader3]
rw=randread
directory=/mnt/readcache
[reader4]
rw=randread
directory=/mnt/readcache

[writer1]
rw=randwrite
directory=/mnt/nocache
rate_iops=12500
[writer2]
rw=randwrite
directory=/mnt/nocache
rate_iops=12500
[writer3]
rw=randwrite
directory=/mnt/nocache
rate_iops=12500
[writer4]
rw=randwrite
directory=/mnt/nocache
rate_iops=12500

Note the following key settings, which are in line with the design guidelines discussed in previous sections. These specifications are essential to drive maximum IOPS:

  • A high queue depth of 128.
  • A small block size of 4 KB.
  • Multiple threads performing random reads and writes.
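The job file above also sets `rate_iops=12500` on each of the four writers, which caps each writer's IOPS; together the four caps add up to the 50,000 write IOPS limit mentioned earlier. The Python sketch below shows that split for an arbitrary number of writers. The function name is hypothetical, and the reading that the cap exists to keep writes within the VM limit is an inference from the numbers rather than something the article states.

```python
def per_writer_rate(total_write_iops: int, writers: int) -> int:
    """Split a total write IOPS cap evenly across writer jobs (rounded down)."""
    if writers <= 0:
        raise ValueError("writers must be a positive integer")
    return total_write_iops // writers


if __name__ == "__main__":
    # Four writers sharing a 50,000 IOPS cap, as in fioreadwrite.ini.
    print(per_writer_rate(50_000, 4))  # 12500
```

If you change the number of writer sections in the job file, recomputing the per-writer value this way keeps the total cap unchanged.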

Run the following command to kick off the FIO test for 30 seconds:

sudo fio --runtime 30 fioreadwrite.ini

While the test runs, you can see the number of combined read and write IOPS the VM and Premium disks are delivering. As shown in the sample below, the DS14 VM is delivering more than 100,000 combined read and write IOPS. This is a combination of the disk and the cache performance.
[Screenshot: combined read and write IOPS]

Maximum combined throughput

To get the maximum combined read and write throughput, use a larger block size and a large queue depth with multiple threads performing reads and writes. You can use a block size of 64 KB and a queue depth of 128.
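The article does not list a job file for this scenario. As a minimal sketch, the Python script below generates one by reusing the layout of the earlier job files with the 64 KB block size and queue depth of 128 suggested above. The function name, the choice of four readers and four writers, and the absence of any `rate_iops` cap are assumptions carried over or made for illustration, not settings confirmed by the article.

```python
def build_job_file(block_size: str = "64k", iodepth: int = 128, workers: int = 4) -> str:
    """Return the text of an FIO job file with random readers and writers."""
    lines = [
        "[global]",
        "size=30g",
        "direct=1",
        f"iodepth={iodepth}",
        "ioengine=libaio",
        f"bs={block_size}",
        "",
    ]
    # Readers target the ReadOnly-cached volume, writers the uncached volume,
    # using the same mount points as the earlier job files.
    roles = (
        ("reader", "randread", "/mnt/readcache"),
        ("writer", "randwrite", "/mnt/nocache"),
    )
    for role, rw, directory in roles:
        for index in range(1, workers + 1):
            lines += [f"[{role}{index}]", f"rw={rw}", f"directory={directory}"]
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the job file; redirect the output to a file of your choice.
    print(build_job_file())
```

Saving the printed text to a file and passing that file to `sudo fio --runtime 30` runs it the same way as the earlier scenarios.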

Next steps

Proceed to our article on designing for high performance.

In that article, you create a checklist for the prototype that resembles the one for your existing application. Using benchmarking tools, you can simulate the workloads and measure performance on the prototype application. By doing so, you can determine which disk offering can match or surpass your application performance requirements. Then you can implement the same guidelines for your production application.