Overview of Azure page blobs

Azure Storage offers three types of blob storage: block blobs, append blobs, and page blobs. Block blobs are composed of blocks and are ideal for storing text or binary files and for uploading large files efficiently. Append blobs are also made up of blocks, but they are optimized for append operations, which makes them ideal for logging scenarios. Page blobs are made up of 512-byte pages, up to 8 TB in total size, and are designed for frequent random read/write operations. Page blobs are the foundation of Azure IaaS disks. This article explains the features and benefits of page blobs.

Page blobs are a collection of 512-byte pages that provide the ability to read and write arbitrary ranges of bytes. This makes page blobs ideal for storing index-based and sparse data structures, such as OS and data disks for virtual machines and databases. For example, Azure SQL DB uses page blobs as the underlying persistent storage for its databases. Page blobs are also often used for files with range-based updates.

Key features of Azure page blobs are their REST interface, the durability of the underlying storage, and their seamless migration capabilities to Azure. These features are discussed in more detail in the next section. In addition, Azure page blobs are currently supported on two types of storage: Premium Storage and Standard Storage. Premium Storage is designed specifically for workloads that require consistent high performance and low latency, which makes premium page blobs ideal for high-performance storage scenarios. Standard storage accounts are more cost-effective for running latency-insensitive workloads.

Sample use cases

Let's discuss a couple of use cases for page blobs, starting with Azure IaaS disks. Azure page blobs are the backbone of the virtual disks platform for Azure IaaS. Both Azure OS disks and data disks are implemented as virtual disks, where data is durably persisted in the Azure Storage platform and then delivered to the virtual machines for maximum performance. Azure disks are persisted in Hyper-V VHD format and stored as page blobs in Azure Storage. In addition to backing virtual disks for Azure IaaS VMs, page blobs also enable PaaS and DBaaS scenarios such as the Azure SQL DB service, which currently uses page blobs for storing SQL data, enabling fast random read/write operations for the database. As another example, if you have a PaaS service that provides shared media access for collaborative video editing applications, page blobs enable fast access to random locations in the media. They also enable fast and efficient editing and merging of the same media by multiple users.

First-party Azure services such as Azure Site Recovery and Azure Backup, as well as many third-party developers, have implemented industry-leading innovations using the page blob REST interface. The following are some of the unique scenarios implemented on Azure:

  • Application-directed incremental snapshot management: Applications can use page blob snapshots and REST APIs to save application checkpoints without incurring costly duplication of data. Azure Storage supports local snapshots for page blobs, which don't require copying the entire blob. These public snapshot APIs also enable accessing and copying the deltas between snapshots.
  • Live migration of applications and data from on-premises to the cloud: Copy the on-premises data and use REST APIs to write directly to an Azure page blob while the on-premises VM continues to run. Once the target has caught up, you can quickly fail over to an Azure VM using that data. In this way, you can migrate your VMs and virtual disks from on-premises to the cloud with minimal downtime, because the data migration occurs in the background while you continue to use the VM, and the downtime needed for failover is short (measured in minutes).
  • SAS-based shared access, which enables scenarios such as multiple readers and a single writer, with support for concurrency control.

Page blob features

REST API

Refer to the following document to get started with developing using page blobs. As an example, look at how to access page blobs using the Storage Client Library for .NET.

The following diagram describes the overall relationships between account, containers, and page blobs.

Screenshot showing the relationships between account, containers, and page blobs

Creating an empty page blob of a specified size

First, get a reference to a container. To create a page blob, call the GetPageBlobClient method, and then call the PageBlobClient.Create method. Pass in the maximum size for the blob to create. That size must be a multiple of 512 bytes.

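using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Specialized; // provides GetPageBlobClient

// 1 GiB in bytes; a multiple of 512, so any whole multiple is a valid page blob size.
const long OneGigabyteAsBytes = 1024L * 1024 * 1024;

BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);

// Get a reference to the container that will hold the page blob.
var blobContainerClient =
    blobServiceClient.GetBlobContainerClient(Constants.containerName);

var pageBlobClient = blobContainerClient.GetPageBlobClient("0s4.vhd");

// Create an empty 16 GiB page blob.
pageBlobClient.Create(16 * OneGigabyteAsBytes);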
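
Because the size passed to Create must be a multiple of 512 bytes, it can help to round an arbitrary byte count up to the next page boundary before calling the service. The helper below is a minimal sketch of that arithmetic only; PageBlobSize and RoundUpToPage are hypothetical names for illustration and are not part of the Azure SDK.

```csharp
using System;

// Hypothetical helper (not part of the Azure SDK): page-size arithmetic only.
static class PageBlobSize
{
    public const long PageSize = 512;

    // Rounds a requested byte count up to the next multiple of 512.
    // A value that is already aligned is returned unchanged.
    public static long RoundUpToPage(long bytes)
    {
        if (bytes < 0)
            throw new ArgumentOutOfRangeException(nameof(bytes));

        // checked: throw instead of silently wrapping near long.MaxValue.
        return checked((bytes + PageSize - 1) / PageSize * PageSize);
    }
}
```

For example, PageBlobSize.RoundUpToPage(1000) returns 1024, and a value that is already aligned, such as 1024, is returned as-is.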

Resizing a page blob

To resize a page blob after creation, use the Resize method. The requested size must be a multiple of 512 bytes.

pageBlobClient.Resize(32 * OneGigabyteAsBytes);

Writing pages to a page blob

To write pages, use the PageBlobClient.UploadPages method.

pageBlobClient.UploadPages(dataStream, startingOffset);

This allows you to write a sequential set of pages of up to 4 MB per call. The range being written must start on a 512-byte boundary (startingOffset % 512 == 0) and end one byte before a 512-byte boundary, so the length written must also be a multiple of 512.
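
The alignment and size rules above can be checked on the client before any request is sent. The following is a minimal sketch that validates a write and splits it into aligned chunks; PageWritePlanner and Plan are hypothetical names, and the per-call limit is assumed to be exactly 4 * 1024 * 1024 bytes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the Azure SDK): plans aligned write chunks.
static class PageWritePlanner
{
    public const long PageSize = 512;

    // Assumption: the "4 MB" per-call limit is 4 * 1024 * 1024 bytes.
    public const long MaxWriteSize = 4L * 1024 * 1024;

    // Returns (offset, length) pairs that cover the write, each starting on a
    // 512-byte boundary and no larger than MaxWriteSize.
    public static List<(long Offset, long Length)> Plan(long startingOffset, long totalLength)
    {
        if (startingOffset < 0 || startingOffset % PageSize != 0)
            throw new ArgumentException("Offset must be a non-negative multiple of 512.", nameof(startingOffset));
        if (totalLength <= 0 || totalLength % PageSize != 0)
            throw new ArgumentException("Length must be a positive multiple of 512.", nameof(totalLength));

        var chunks = new List<(long Offset, long Length)>();
        for (long done = 0; done < totalLength; done += MaxWriteSize)
        {
            // The last chunk may be shorter; it is still a multiple of 512.
            long length = Math.Min(MaxWriteSize, totalLength - done);
            chunks.Add((startingOffset + done, length));
        }
        return chunks;
    }
}
```

Each returned pair could then drive one UploadPages call, with a stream holding exactly that many bytes and the pair's offset as the starting offset.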

As soon as a write request for a sequential set of pages succeeds in the blob service and is replicated for durability and resiliency, the write is committed and success is returned to the client.

The following diagram shows two separate write operations:

Diagram showing the two separate write operations

  1. A write operation of length 1024 bytes starting at offset 0
  2. A write operation of length 1024 bytes starting at offset 4096

Reading pages from a page blob

To read pages, use the PageBlobClient.Download method to read a range of bytes from the page blob.

var pageBlob = pageBlobClient.Download(new HttpRange(bufferOffset, rangeSize));

This allows you to download the full blob or a range of bytes starting from any offset in the blob. When reading, the offset does not have to be a multiple of 512. When you read bytes from a NUL page (a page that has never been written or has been cleared), the service returns zero bytes.

The following figure shows a read operation with an offset of 256 and a range size of 4352. The data returned is highlighted in orange. Zeros are returned for NUL pages.

Diagram showing a read operation with an offset of 256 and a range size of 4352
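
To see which pages an unaligned read such as the one in the figure touches, the page indexes can be worked out from the offset and range size. The sketch below shows that arithmetic only; PageSpan and Touched are hypothetical names and not part of the Azure SDK.

```csharp
using System;

// Hypothetical helper (not part of the Azure SDK): maps a byte range to page indexes.
static class PageSpan
{
    public const long PageSize = 512;

    // Returns the zero-based indexes of the first and last 512-byte pages
    // that contain at least one byte of the requested range.
    public static (long FirstPage, long LastPage) Touched(long offset, long length)
    {
        if (offset < 0)
            throw new ArgumentOutOfRangeException(nameof(offset));
        if (length <= 0)
            throw new ArgumentOutOfRangeException(nameof(length));

        long lastByte = offset + length - 1; // inclusive end of the range
        return (offset / PageSize, lastByte / PageSize);
    }
}
```

For the read in the figure, PageSpan.Touched(256, 4352) returns (0, 8): the range covers bytes 256 through 4607, so it starts partway through page 0 and ends on the last byte of page 8.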

If you have a sparsely populated blob, you may want to download only the valid page regions, both to avoid paying egress charges for zero bytes and to reduce download latency.

To determine which pages are backed by data, use PageBlobClient.GetPageRanges. You can then enumerate the returned ranges and download the data in each range.

IEnumerable<HttpRange> pageRanges = pageBlobClient.GetPageRanges().Value.PageRanges;

foreach (var range in pageRanges)
{
    var pageBlob = pageBlobClient.Download(range);
}

Leasing a page blob

The Lease Blob operation establishes and manages a lock on a blob for write and delete operations. This operation is useful when a page blob is accessed from multiple clients, because it ensures that only one client can write to the blob at a time. Azure Disks, for example, uses this leasing mechanism to ensure that a disk is managed by only a single VM. The lock duration can be 15 to 60 seconds, or it can be infinite. See the documentation here for more details.
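
As a rough illustration of the single-writer pattern, the sketch below acquires a 30-second lease, writes one page while presenting the lease ID, and then releases the lease. It assumes the Azure.Storage.Blobs v12 package and an existing page blob; LeasedWriteSketch and WriteFirstPageUnderLease are hypothetical names, and the 30-second duration and all-zero page are arbitrary choices for illustration.

```csharp
using System;
using System.IO;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

// Hypothetical wrapper for illustration; assumes the page blob already exists.
static class LeasedWriteSketch
{
    public static void WriteFirstPageUnderLease(PageBlobClient pageBlobClient)
    {
        // Acquire a lease; the duration must be 15 to 60 seconds or infinite.
        BlobLeaseClient leaseClient = pageBlobClient.GetBlobLeaseClient();
        BlobLease lease = leaseClient.Acquire(TimeSpan.FromSeconds(30)).Value;

        try
        {
            // One 512-byte page of zeros, written at offset 0.
            using var page = new MemoryStream(new byte[512]);

            // While the lease is active, writes must present the lease ID.
            pageBlobClient.UploadPages(
                page,
                0,
                conditions: new PageBlobRequestConditions { LeaseId = lease.LeaseId });
        }
        finally
        {
            // Release the lease so another client can acquire it.
            leaseClient.Release();
        }
    }
}
```

A writer that needs the lock for longer than the chosen duration would have to renew the lease before it expires; that step is omitted here to keep the sketch short.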

In addition to rich REST APIs, page blobs also provide shared access, durability, and enhanced security. The following sections cover those benefits in more detail.

Concurrent access

The page blobs REST API and its leasing mechanism allow applications to access a page blob from multiple clients. For example, suppose you need to build a distributed cloud service that shares storage objects with multiple users. It could be a web application serving a large collection of images to several users. One option for implementing this is to use a VM with attached disks. This approach has two downsides: (i) a disk can be attached to only a single VM, which limits scalability and flexibility and increases risk, because if there is a problem with the VM or the service running on it, the lease makes the image inaccessible until the lease expires or is broken; and (ii) an IaaS VM adds cost.

An alternative option is to use page blobs directly through the Azure Storage REST APIs. This option eliminates the need for costly IaaS VMs, offers the full flexibility of direct access from multiple clients, simplifies the classic deployment model by eliminating the need to attach and detach disks, and eliminates the risk of issues on the VM. It also provides the same level of performance for random read/write operations as a disk.

Durability and high availability

Both Standard and Premium Storage are durable storage in which page blob data is always replicated to ensure durability and high availability. For more information about Azure Storage redundancy, see this documentation. Azure has consistently delivered enterprise-grade durability for IaaS disks and page blobs, with an industry-leading zero percent Annualized Failure Rate.

Seamless migration to Azure

For customers and developers who are interested in implementing their own customized backup solution, Azure also offers incremental snapshots that hold only the deltas. This feature avoids the cost of the initial full copy, which greatly lowers the backup cost. Along with the ability to efficiently read and copy differential data, this is another powerful capability that enables even more innovation from developers, leading to a best-in-class backup and disaster recovery (DR) experience on Azure. You can set up your own backup or DR solution for your VMs on Azure using Blob Snapshot together with the Get Page Ranges API and the Incremental Copy Blob API, which you can use to easily copy the incremental data for DR.
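
As a rough sketch of how the deltas between two snapshots can be inspected, the code below takes a new snapshot and asks the service which page ranges changed or were cleared since an earlier one. It assumes the Azure.Storage.Blobs v12 package and that the caller already holds the identifier of a previous snapshot; SnapshotDiffSketch and ReportChangesSince are hypothetical names, and copying the changed data to a backup target is deliberately left out.

```csharp
using System;
using Azure;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

// Hypothetical wrapper for illustration; reports deltas but does not copy them.
static class SnapshotDiffSketch
{
    // Takes a new snapshot, lists what changed since previousSnapshot,
    // and returns the new snapshot identifier for use in the next run.
    public static string ReportChangesSince(PageBlobClient pageBlobClient, string previousSnapshot)
    {
        string currentSnapshot = pageBlobClient.CreateSnapshot().Value.Snapshot;

        // Ask the service for the page ranges that differ between the snapshots.
        PageRangesInfo diff = pageBlobClient.GetPageRangesDiff(
            snapshot: currentSnapshot,
            previousSnapshot: previousSnapshot).Value;

        long changedBytes = 0;
        foreach (HttpRange range in diff.PageRanges)
        {
            // Pages written since the previous snapshot.
            changedBytes += range.Length ?? 0;
            Console.WriteLine($"changed: offset {range.Offset}, length {range.Length}");
        }

        foreach (HttpRange range in diff.ClearRanges)
        {
            // Pages cleared since the previous snapshot.
            Console.WriteLine($"cleared: offset {range.Offset}, length {range.Length}");
        }

        Console.WriteLine($"total changed bytes: {changedBytes}");
        return currentSnapshot;
    }
}
```

A backup tool built on this idea would download only the changed ranges from the new snapshot and clear the cleared ranges in its copy, rather than transferring the whole blob each time.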

Moreover, many enterprises have critical workloads already running in on-premises datacenters. When migrating such a workload to the cloud, the main concerns are the amount of downtime needed to copy the data and the risk of unforeseen issues after the switchover. In many cases, the downtime can be a showstopper for migration to the cloud. Using the page blobs REST API, Azure addresses this problem by enabling cloud migration with minimal disruption to critical workloads.

For examples of how to take a snapshot and how to restore a page blob from a snapshot, refer to the article on setting up a backup process using incremental snapshots.