Latency in Blob storage

Latency, sometimes referred to as response time, is the amount of time that an application must wait for a request to complete. Latency can directly affect an application's performance. Low latency is often important for scenarios with humans in the loop, such as conducting credit card transactions or loading web pages. Systems that need to process incoming events at high rates, such as telemetry logging or IoT events, also require low latency. This article describes how to understand and measure latency for operations on block blobs, and how to design your applications for low latency.

Azure Storage offers two different performance options for block blobs: premium and standard. Premium block blobs offer significantly lower and more consistent latency than standard block blobs via high-performance SSD disks. For more information, see Premium performance block blob storage in Azure Blob storage: hot, cool, and archive access tiers.

About Azure Storage latency

Azure Storage latency is related to request rates for Azure Storage operations. Request rates are also known as input/output operations per second (IOPS).

To calculate the request rate, first determine the length of time that each request takes to complete, then calculate how many requests can be processed per second. For example, assume that a request takes 50 milliseconds (ms) to complete. An application using one thread with one outstanding read or write operation should achieve 20 IOPS (1 second, or 1,000 ms, divided by 50 ms per request). Theoretically, if the thread count is doubled to two, then the application should be able to achieve 40 IOPS. If the outstanding asynchronous read or write operations for each thread are also doubled to two, then the application should be able to achieve 80 IOPS.
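The arithmetic above can be sketched as a small helper. The 50 ms latency, thread counts, and queue depths are the example values from the text, and the linear scaling is the theoretical upper bound, not a guaranteed result:

```python
# Theoretical request rate: each thread completes one batch of its
# outstanding operations every `latency_ms` milliseconds.
def expected_iops(latency_ms: float, threads: int, outstanding_ops: int) -> float:
    return (1000.0 / latency_ms) * threads * outstanding_ops

print(expected_iops(50, threads=1, outstanding_ops=1))  # 20.0
print(expected_iops(50, threads=2, outstanding_ops=1))  # 40.0
print(expected_iops(50, threads=2, outstanding_ops=2))  # 80.0
```

As the next paragraph notes, real workloads fall short of these figures because of client-side and service-side overhead.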

In practice, request rates do not always scale so linearly, due to overhead in the client from task scheduling, context switching, and so forth. On the service side, there can be variability in latency due to pressure on the Azure Storage system, differences in the storage media used, noise from other workloads, maintenance tasks, and other factors. Finally, the network connection between the client and the server may affect Azure Storage latency due to congestion, rerouting, or other disruptions.

Azure Storage bandwidth, also referred to as throughput, is related to the request rate and can be calculated by multiplying the request rate (IOPS) by the request size. For example, at 160 requests per second with each request transferring 256 KiB of data, the throughput is 40,960 KiB per second, or 40 MiB per second.
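The worked example above (160 requests per second at 256 KiB each) reduces to one multiplication and a unit conversion:

```python
# Throughput in MiB/s = IOPS * request size (KiB) / 1024 (KiB per MiB).
def throughput_mib_per_s(iops: float, request_size_kib: float) -> float:
    return iops * request_size_kib / 1024

print(throughput_mib_per_s(160, 256))  # 40.0
```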

Latency metrics for block blobs

Azure Storage provides two latency metrics for block blobs. These metrics can be viewed in the Azure portal:

  • End-to-end (E2E) latency measures the interval from when Azure Storage receives the first packet of the request until Azure Storage receives a client acknowledgment on the last packet of the response.

  • Server latency measures the interval from when Azure Storage receives the last packet of the request until the first packet of the response is returned from Azure Storage.

The following image shows the Average Success E2E Latency and Average Success Server Latency for a sample workload that calls the Get Blob operation:

Screenshot showing latency metrics for the Get Blob operation

Under normal conditions, there is little gap between end-to-end latency and server latency, which is what the image shows for the sample workload.

If you review your end-to-end and server latency metrics, and find that end-to-end latency is significantly higher than server latency, then investigate and address the source of the additional latency.
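A minimal sketch of this comparison: the difference between the two metrics is the time spent outside the service (network plus client), and a workload can be flagged when the gap grows large. The 2x ratio is an illustrative threshold chosen for this example, not an Azure-documented value:

```python
# Portion of end-to-end latency spent outside Azure Storage
# (network transit plus client-side processing).
def network_or_client_overhead_ms(e2e_ms: float, server_ms: float) -> float:
    return e2e_ms - server_ms

# Flag workloads whose E2E latency exceeds server latency by `ratio`.
# The default ratio of 2.0 is an arbitrary example threshold.
def needs_investigation(e2e_ms: float, server_ms: float, ratio: float = 2.0) -> bool:
    return e2e_ms > server_ms * ratio

print(network_or_client_overhead_ms(120, 15))  # 105
print(needs_investigation(120, 15))            # True
print(needs_investigation(16, 15))             # False
```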

If your end-to-end and server latency are similar, but you require lower latency, then consider migrating to premium block blob storage.

Factors influencing latency

The main factor influencing latency is operation size. It takes longer to complete larger operations, due to the amount of data being transferred over the network and processed by Azure Storage.

The following diagram shows the total time for operations of various sizes. For small amounts of data, the latency interval is predominantly spent handling the request, rather than transferring data. The latency interval increases only slightly as the operation size increases (marked 1 in the diagram). As the operation size increases further, more time is spent on transferring data, so the total latency interval is split between request handling and data transfer (marked 2 in the diagram). With the largest operation sizes, the latency interval is almost exclusively spent on transferring data, and request handling is largely insignificant (marked 3 in the diagram).
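The three regions in the diagram can be illustrated with a simple model in which total time is a fixed request-handling overhead plus a size-dependent transfer time. The overhead and per-MiB transfer costs below are made-up figures for illustration only:

```python
REQUEST_OVERHEAD_MS = 10.0   # assumed fixed cost of handling any request
TRANSFER_MS_PER_MIB = 20.0   # assumed cost of moving each MiB of data

def total_time_ms(size_mib: float) -> float:
    return REQUEST_OVERHEAD_MS + TRANSFER_MS_PER_MIB * size_mib

# Small, medium, and large operations show the three regions:
# transfer time goes from negligible to dominant.
for size in (0.001, 1.0, 100.0):
    t = total_time_ms(size)
    transfer_share = (TRANSFER_MS_PER_MIB * size) / t
    print(f"{size} MiB: {t:.2f} ms total, {transfer_share:.0%} spent transferring")
```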

Screenshot showing total operation time by operation size

Client configuration factors such as concurrency and threading also affect latency. Overall throughput depends on how many storage requests are in flight at any given point in time and on how your application handles threading. Client resources including CPU, memory, local storage, and network interfaces can also affect latency.
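The effect of concurrency can be sketched with a thread pool that keeps several requests in flight at once. `read_blob` here is a hypothetical stand-in for a real storage client call, and the 50 ms sleep simulates its latency; the worker count is the tunable that trades throughput against client CPU:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def read_blob(name: str) -> str:
    time.sleep(0.05)  # simulate a 50 ms storage request
    return f"data-for-{name}"

names = [f"blob-{i}" for i in range(8)]
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(read_blob, names))
elapsed = time.perf_counter() - start

# With 4 workers, 8 requests of 50 ms each finish in roughly 100 ms of
# wall-clock time rather than 400 ms, illustrating how concurrency
# raises effective IOPS without changing per-request latency.
print(f"{len(results)} requests in {elapsed:.2f}s")
```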

Processing Azure Storage requests requires client CPU and memory resources. If the client is under pressure due to an underpowered virtual machine or some runaway process in the system, there are fewer resources available to process Azure Storage requests. Any contention or lack of client resources results in an increase in end-to-end latency without an increase in server latency, widening the gap between the two metrics.

Equally important are the network interface and the network pipe between the client and Azure Storage. Physical distance alone can be a significant factor, for example if a client VM is in a different Azure region or on-premises. Other factors such as network hops, ISP routing, and internet state can influence overall storage latency.

To assess latency, first establish baseline metrics for your scenario. Baseline metrics give you the expected end-to-end and server latency in the context of your application environment, depending on your workload profile, application configuration settings, client resources, network pipe, and other factors. When you have baseline metrics, you can more easily distinguish abnormal from normal conditions. Baseline metrics also enable you to observe the effects of changed parameters, such as application configuration or VM sizes.
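One simple way to establish such a baseline is to collect latency samples under normal load and keep summary statistics (mean, median, and a high percentile) for later comparison. The sample values below are hypothetical E2E latencies in milliseconds:

```python
import statistics

def baseline(samples_ms: list[float]) -> dict[str, float]:
    # quantiles(n=100) returns the 99 percentile cut points;
    # index 94 is the 95th percentile.
    qs = statistics.quantiles(samples_ms, n=100)
    return {
        "mean": statistics.fmean(samples_ms),
        "p50": statistics.median(samples_ms),
        "p95": qs[94],
    }

# Hypothetical samples from a steady workload; the single 90 ms outlier
# inflates the mean and p95 but leaves the median largely unaffected,
# which is why baselines should track percentiles, not just averages.
samples = [12, 14, 13, 15, 12, 90, 13, 14, 12, 13]
print(baseline(samples))
```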

Next steps