Managing concurrency in Blob storage

Modern applications often have multiple users viewing and updating data simultaneously. Application developers need to think carefully about how to provide a predictable experience to their end users, particularly when multiple users can update the same data. Developers typically consider three main data concurrency strategies:

  • Optimistic concurrency: An application performing an update determines, as part of that update, whether the data has changed since the application last read it. For example, if two users viewing a wiki page both update that page, the wiki platform must ensure that the second update does not overwrite the first. It must also ensure that both users understand whether their update was successful. This strategy is most often used in web applications.

  • Pessimistic concurrency: An application looking to perform an update takes a lock on an object, preventing other users from updating the data until the lock is released. For example, in a primary/secondary data replication scenario in which only the primary performs updates, the primary typically holds an exclusive lock on the data for an extended period of time to ensure that no one else can update it.

  • Last writer wins: An approach that allows update operations to proceed without first determining whether another application has updated the data since it was read. This approach is typically used when data is partitioned in such a way that multiple users will not access the same data at the same time. It can also be useful where short-lived data streams are being processed.

Azure Storage supports all three strategies, although it is distinctive in its ability to provide full support for optimistic and pessimistic concurrency. Azure Storage was designed around a strong consistency model, which guarantees that after the service performs an insert or update operation, subsequent read operations return the latest update.

In addition to selecting an appropriate concurrency strategy, developers should also be aware of how a storage platform isolates changes, particularly changes to the same object across transactions. Azure Storage uses snapshot isolation to allow read operations to run concurrently with write operations within a single partition. Snapshot isolation guarantees that all read operations return a consistent snapshot of the data even while updates are occurring.

You can use either the optimistic or the pessimistic concurrency model to manage access to blobs and containers. If you don't explicitly specify a strategy, then by default the last writer wins.

Optimistic concurrency

Azure Storage assigns an identifier to every object stored. This identifier is updated every time a write operation is performed on the object. The identifier is returned to the client as part of an HTTP GET response in the ETag header, which is defined by the HTTP protocol.

A client that is performing an update can send the original ETag together with a conditional header to ensure that the update occurs only if a certain condition has been met. For example, if the If-Match header is specified, Azure Storage verifies that the value of the ETag specified in the update request is the same as the ETag for the object being updated. For more information about conditional headers, see Specifying conditional headers for Blob service operations.

The process is outlined as follows:

  1. Retrieve a blob from Azure Storage. The response includes an HTTP ETag header value that identifies the current version of the object.
  2. When you update the blob, include the ETag value you received in step 1 in the If-Match conditional header of the write request. Azure Storage compares the ETag value in the request with the current ETag value of the blob.
  3. If the blob's current ETag value differs from the one specified in the If-Match conditional header of the request, Azure Storage returns HTTP status code 412 (Precondition Failed). This error indicates to the client that another process has updated the blob since the client first retrieved it.
  4. If the blob's current ETag value matches the ETag in the If-Match conditional header of the request, Azure Storage performs the requested operation and updates the current ETag value of the blob.
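
To make the comparison in steps 2 through 4 concrete, the following is a minimal in-memory sketch of how a service could evaluate an If-Match condition. It does not call Azure Storage; the InMemoryBlob class, its version counter, and the "v1"-style ETag format are hypothetical names invented for illustration. The 201 status used for a successful write is the code that the Put Blob operation returns.

```csharp
using System;

// Hypothetical in-memory stand-in for a stored blob; not part of any Azure SDK.
public sealed class InMemoryBlob
{
    private int version;

    public string Content { get; private set; } = "";

    // The ETag changes after every successful write.
    public string ETag => $"\"v{version}\"";

    // Unconditional write: last writer wins.
    public int Write(string content)
    {
        Content = content;
        version++;
        return 201; // Created: write applied and ETag updated.
    }

    // Conditional write: applied only if the caller's ETag is still current.
    public int WriteIfMatch(string content, string ifMatch)
    {
        if (ifMatch != ETag)
        {
            return 412; // Precondition Failed: the blob changed since it was read.
        }

        return Write(content);
    }
}

public static class Program
{
    public static void Main()
    {
        var blob = new InMemoryBlob();
        blob.Write("first");

        string etagSeenByClient = blob.ETag;   // Step 1: read and remember the ETag.
        blob.Write("second");                  // Another client writes in the meantime.

        int stale = blob.WriteIfMatch("third", etagSeenByClient); // Steps 2 and 3.
        int fresh = blob.WriteIfMatch("third", blob.ETag);        // Step 4.

        Console.WriteLine($"{stale} {fresh} {blob.Content}"); // prints "412 201 third"
    }
}
```

The stale write is rejected without changing the content, so the client can re-read the blob and decide how to reconcile its change.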

The following code example shows how to construct an If-Match condition on a write request that checks the ETag value for a blob. Azure Storage evaluates whether the blob's current ETag is the same as the ETag provided on the request and performs the write operation only if the two ETag values match. If another process has updated the blob in the interim, Azure Storage returns an HTTP 412 (Precondition Failed) status message.

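// Namespaces used by the examples in this article.
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Threading.Tasks;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

private static async Task DemonstrateOptimisticConcurrencyBlob(BlobClient blobClient)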
{
    Console.WriteLine("Demonstrate optimistic concurrency");

    BlobContainerClient containerClient = blobClient.GetParentBlobContainerClient();

    try
    {
        // Create the container if it does not exist.
        await containerClient.CreateIfNotExistsAsync();

        // Upload text to a new block blob.
        string blobContents1 = "First update. Overwrite blob if it exists.";
        byte[] byteArray = Encoding.ASCII.GetBytes(blobContents1);

        ETag originalETag;

        using (MemoryStream stream = new MemoryStream(byteArray))
        {
            BlobContentInfo blobContentInfo = await blobClient.UploadAsync(stream, overwrite: true);
            originalETag = blobContentInfo.ETag;
            Console.WriteLine("Blob added. Original ETag = {0}", originalETag);
        }

        // This code simulates an update by another client.
        // No ETag was provided, so original blob is overwritten and ETag updated.
        string blobContents2 = "Second update overwrites first update.";
        byteArray = Encoding.ASCII.GetBytes(blobContents2);

        using (MemoryStream stream = new MemoryStream(byteArray))
        {
            BlobContentInfo blobContentInfo = await blobClient.UploadAsync(stream, overwrite: true);
            Console.WriteLine("Blob updated. Updated ETag = {0}", blobContentInfo.ETag);
        }

        // Now try to update the blob using the original ETag value.
        string blobContents3 = "Third update. If-Match condition set to original ETag.";
        byteArray = Encoding.ASCII.GetBytes(blobContents3);

        // Set the If-Match condition to the original ETag.
        BlobUploadOptions blobUploadOptions = new BlobUploadOptions()
        {
            Conditions = new BlobRequestConditions()
            {
                IfMatch = originalETag
            }
        };

        using (MemoryStream stream = new MemoryStream(byteArray))
        {
            // This call should fail with error code 412 (Precondition Failed).
            BlobContentInfo blobContentInfo = await blobClient.UploadAsync(stream, blobUploadOptions);
        }
    }
    catch (RequestFailedException e)
    {
        if (e.Status == (int)HttpStatusCode.PreconditionFailed)
        {
            Console.WriteLine(
                @"Precondition failure as expected. Blob's ETag does not match ETag provided.");
        }
        else
        {
            Console.WriteLine(e.Message);
            throw;
        }
    }
}
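
In practice, a client that receives a 412 response usually re-reads the blob and tries again. The following is a minimal sketch of such a read-modify-write loop, assuming a recent version 12 Azure.Storage.Blobs package that accepts BinaryData content. The OptimisticRetrySketch class, the AppendLineAsync helper, and the maxAttempts parameter are hypothetical names, not part of the SDK.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

public static class OptimisticRetrySketch
{
    // Hypothetical helper: appends a line to an existing text blob, retrying when
    // another writer changes the blob between the read and the conditional write.
    public static async Task AppendLineAsync(BlobClient blobClient, string line, int maxAttempts = 3)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            // Read the current content together with the ETag that identifies it.
            Response<BlobDownloadResult> download = await blobClient.DownloadContentAsync();
            string updated = download.Value.Content.ToString() + line + Environment.NewLine;

            // Write back only if the blob still has the ETag that was just read.
            var options = new BlobUploadOptions
            {
                Conditions = new BlobRequestConditions { IfMatch = download.Value.Details.ETag }
            };

            try
            {
                await blobClient.UploadAsync(BinaryData.FromString(updated), options);
                return;
            }
            catch (RequestFailedException e) when (e.Status == (int)HttpStatusCode.PreconditionFailed)
            {
                // Another client updated the blob first; loop to re-read and retry.
            }
        }

        throw new InvalidOperationException($"Blob was still changing after {maxAttempts} attempts.");
    }
}
```

Bounding the number of attempts keeps a heavily contended blob from causing an endless retry loop; the right limit depends on the workload.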

Azure Storage also supports other conditional headers, including If-Modified-Since, If-Unmodified-Since, and If-None-Match. For more information, see Specifying conditional headers for Blob service operations.
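
As an illustration of one of these headers, the sketch below uses If-None-Match with the wildcard value so that an upload succeeds only when no blob with that name exists yet. The ConditionalCreateSketch class and TryCreateAsync helper are hypothetical names. The assumption that the service answers with status 409 (Conflict) for an existing blob reflects observed behavior of the Put Blob operation and should be confirmed against the service documentation.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

public static class ConditionalCreateSketch
{
    // Hypothetical helper: creates the blob only if it does not already exist.
    // Returns true when the blob was created and false when it already existed.
    public static async Task<bool> TryCreateAsync(BlobClient blobClient, string content)
    {
        var options = new BlobUploadOptions
        {
            // If-None-Match: * means "proceed only if no current version exists".
            Conditions = new BlobRequestConditions { IfNoneMatch = ETag.All }
        };

        try
        {
            await blobClient.UploadAsync(BinaryData.FromString(content), options);
            return true;
        }
        catch (RequestFailedException e) when (e.Status == (int)HttpStatusCode.Conflict)
        {
            // Assumption: an existing blob is reported as 409 (Conflict).
            return false;
        }
    }
}
```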

Pessimistic concurrency for blobs

To lock a blob for exclusive use, you can acquire a lease on it. When you acquire the lease, you specify its duration. A finite lease can last from 15 to 60 seconds. A lease can also be infinite, which amounts to an exclusive lock. You can renew a finite lease to extend it, and you can release the lease when you're finished with it. Azure Storage automatically releases finite leases when they expire.
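
The lease lifecycle described above might look like the following sketch, which acquires a finite lease, renews it, and releases it. The LeaseLifecycleSketch class and HoldLeaseAsync helper are hypothetical names, and the 30-second duration is an arbitrary value within the allowed range.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

public static class LeaseLifecycleSketch
{
    // Hypothetical helper: holds a lease on an existing blob while work is done.
    public static async Task HoldLeaseAsync(BlobClient blobClient)
    {
        BlobLeaseClient leaseClient = blobClient.GetBlobLeaseClient();

        // Acquire a finite lease; the duration must be between 15 and 60 seconds.
        // Passing BlobLeaseClient.InfiniteLeaseDuration instead requests an infinite lease.
        BlobLease lease = await leaseClient.AcquireAsync(TimeSpan.FromSeconds(30));
        Console.WriteLine("Lease acquired. LeaseId = {0}", lease.LeaseId);

        try
        {
            // ... perform work that needs exclusive write access ...

            // Renew before the lease expires to restart its duration.
            await leaseClient.RenewAsync();
        }
        finally
        {
            // Release the lease so that other clients can acquire it immediately.
            await leaseClient.ReleaseAsync();
        }
    }
}
```

Releasing in a finally block only after a successful acquire avoids calling ReleaseAsync on a lease that was never obtained.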

Leases support different synchronization strategies, including exclusive write/shared read, exclusive write/exclusive read, and shared write/exclusive read. When a lease exists, Azure Storage enforces exclusive write access for the lease holder. However, ensuring exclusivity for read operations requires the developer to make sure that all client applications use a lease ID and that only one client at a time has a valid lease ID. Read operations that do not include a lease ID result in shared reads.
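
An exclusive read, then, is a convention that every reader must follow: each read carries the lease ID. The sketch below shows one way to do that, assuming a version of Azure.Storage.Blobs that provides BlobDownloadOptions. The ExclusiveReadSketch class and ReadWithLeaseAsync helper are hypothetical names.

```csharp
using System.Threading.Tasks;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

public static class ExclusiveReadSketch
{
    // Hypothetical helper: reads a blob as text and supplies the lease ID, so the
    // service rejects the read unless that lease is active and the ID matches.
    public static async Task<string> ReadWithLeaseAsync(BlobClient blobClient, string leaseId)
    {
        var options = new BlobDownloadOptions
        {
            Conditions = new BlobRequestConditions { LeaseId = leaseId }
        };

        Response<BlobDownloadResult> result = await blobClient.DownloadContentAsync(options);
        return result.Value.Content.ToString();
    }
}
```

A client that omits the lease ID can still read the blob, which is why exclusivity depends on all clients using a helper like this one.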

The following code example shows how to acquire an exclusive lease on a blob, update the content of the blob by providing the lease ID, and then release the lease. If the lease is active and the lease ID is not provided on a write request, the write operation fails with error code 412 (Precondition Failed).

public static async Task DemonstratePessimisticConcurrencyBlob(BlobClient blobClient)
{
    Console.WriteLine("Demonstrate pessimistic concurrency");

    BlobContainerClient containerClient = blobClient.GetParentBlobContainerClient();
    BlobLeaseClient blobLeaseClient = blobClient.GetBlobLeaseClient();

    try
    {
        // Create the container if it does not exist.
        await containerClient.CreateIfNotExistsAsync();

        // Upload text to a blob.
        string blobContents1 = "First update. Overwrite blob if it exists.";
        byte[] byteArray = Encoding.ASCII.GetBytes(blobContents1);
        using (MemoryStream stream = new MemoryStream(byteArray))
        {
            BlobContentInfo blobContentInfo = await blobClient.UploadAsync(stream, overwrite: true);
        }

        // Acquire a lease on the blob.
        BlobLease blobLease = await blobLeaseClient.AcquireAsync(TimeSpan.FromSeconds(15));
        Console.WriteLine("Blob lease acquired. LeaseId = {0}", blobLease.LeaseId);

        // Set the request condition to include the lease ID.
        BlobUploadOptions blobUploadOptions = new BlobUploadOptions()
        {
            Conditions = new BlobRequestConditions()
            {
                LeaseId = blobLease.LeaseId
            }
        };

        // Write to the blob again, providing the lease ID on the request.
        // The lease ID was provided, so this call should succeed.
        string blobContents2 = "Second update. Lease ID provided on request.";
        byteArray = Encoding.ASCII.GetBytes(blobContents2);

        using (MemoryStream stream = new MemoryStream(byteArray))
        {
            BlobContentInfo blobContentInfo = await blobClient.UploadAsync(stream, blobUploadOptions);
        }

        // This code simulates an update by another client.
        // The lease ID is not provided, so this call fails.
        string blobContents3 = "Third update. No lease ID provided.";
        byteArray = Encoding.ASCII.GetBytes(blobContents3);

        using (MemoryStream stream = new MemoryStream(byteArray))
        {
            // This call should fail with error code 412 (Precondition Failed).
            BlobContentInfo blobContentInfo = await blobClient.UploadAsync(stream);
        }
    }
    catch (RequestFailedException e)
    {
        if (e.Status == (int)HttpStatusCode.PreconditionFailed)
        {
            Console.WriteLine(
                @"Precondition failure as expected. The lease ID was not provided.");
        }
        else
        {
            Console.WriteLine(e.Message);
            throw;
        }
    }
    finally
    {
        await blobLeaseClient.ReleaseAsync();
    }
}

Pessimistic concurrency for containers

Leases on containers enable the same synchronization strategies that are supported for blobs, including exclusive write/shared read, exclusive write/exclusive read, and shared write/exclusive read. For containers, however, the exclusive lock is enforced only on delete operations. To delete a container with an active lease, a client must include the active lease ID with the delete request. All other container operations succeed on a leased container without the lease ID.
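
The following sketch illustrates the delete behavior for a leased container. The ContainerLeaseSketch class and DeleteLeasedContainerAsync helper are hypothetical names, and the expectation that the first delete fails with 412 (Precondition Failed) is an assumption based on the blob behavior described earlier.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

public static class ContainerLeaseSketch
{
    // Hypothetical helper: leases an existing container, shows that a delete
    // without the lease ID is rejected, then deletes it with the lease ID.
    public static async Task DeleteLeasedContainerAsync(BlobContainerClient containerClient)
    {
        BlobLeaseClient leaseClient = containerClient.GetBlobLeaseClient();
        BlobLease lease = await leaseClient.AcquireAsync(TimeSpan.FromSeconds(15));

        try
        {
            // No lease ID on the request, so this delete is expected to fail.
            await containerClient.DeleteAsync();
        }
        catch (RequestFailedException e) when (e.Status == (int)HttpStatusCode.PreconditionFailed)
        {
            Console.WriteLine("Delete rejected because the lease ID was not provided.");
        }

        // Supplying the active lease ID allows the delete to proceed.
        await containerClient.DeleteAsync(new BlobRequestConditions { LeaseId = lease.LeaseId });
    }
}
```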

Next steps