Azure Storage offers robust redundancy and disaster recovery capabilities through features like locally redundant storage (LRS), geo-redundant storage (GRS), and zone-redundant storage (ZRS). Understanding the nuances of planned and unplanned failover and the subsequent failback process is critical to maintaining data integrity and service availability.
This document answers frequently asked questions and describes the technical scenarios, conflicting features, and operational impacts associated with Azure Storage failover and failback. It provides guidance on best practices, highlights potential pitfalls such as data loss and unsupported configurations, and explains how to restore storage accounts to their original state after a failover event. By following these recommendations, organizations can better prepare for disaster recovery and ensure business continuity when using Azure Storage.
Unplanned Failover
You can initiate an unplanned failover to your storage account's secondary region if the data endpoints for the storage services become unavailable in the primary region. After the failover is complete, the storage account becomes Locally Redundant Storage (LRS) and the secondary region becomes the new primary. Users can proceed to access data from their new primary region.
Because data is written asynchronously from the primary region to the secondary region, there's always a delay before a write to the primary region is copied to the secondary. When an unplanned failover is initiated, all data in the primary region is lost as the secondary region becomes the new primary. All data already copied to the secondary region is maintained when the failover happens. However, any data written to the primary that doesn't yet exist in the secondary region is lost permanently. Users can check the Last Sync Time (LST) property to confirm the last time a full sync between the primary and secondary regions completed.
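The relationship between write times and the LST can be illustrated with a minimal Python sketch. The function name and the sample timestamps are hypothetical and exist only for illustration; in practice the LST would come from the storage account's geo-replication properties. Treating a write made exactly at the LST as at risk is a conservative assumption, not documented behavior.

```python
from datetime import datetime, timezone


def writes_at_risk(last_sync_time, write_times):
    """Return the write timestamps that might be lost in an unplanned failover.

    Writes made before the LST are already in the secondary region.
    Writes at or after the LST are flagged (conservative assumption).
    """
    return [t for t in write_times if t >= last_sync_time]


# Hypothetical values for illustration only.
lst = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
writes = [
    datetime(2024, 5, 1, 11, 55, tzinfo=timezone.utc),  # before LST: replicated
    datetime(2024, 5, 1, 12, 3, tzinfo=timezone.utc),   # after LST: might be lost
]

for t in writes_at_risk(lst, writes):
    print("Might be lost after failover:", t.isoformat())
```

A check like this is only as accurate as the application's own record of when each write happened.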
To learn more, refer to the article on How unplanned failover works.
What effects will failover have on my account after it completes?
- The storage account loses geo-redundancy and becomes Locally Redundant Storage (LRS).
- The account's previous secondary region is now the primary region.
- Users might experience data loss if any writes were made to their storage account after the LST.
The following table summarizes the effects of an unplanned failover:
| Result of failover on... | Customer-managed (unplanned) failover |
|---|---|
| ...the secondary region | The secondary region becomes the new primary |
| ...the original primary region | The copy of the data in the original primary region is deleted |
| ...the account redundancy configuration | The storage account is converted to LRS |
| ...the geo-redundancy configuration | Geo-redundancy is lost |
The following table summarizes the resulting redundancy configuration at every stage of the failover and failback process:
| Original configuration | After failover | After re-enabling geo redundancy | After failback | After re-enabling geo redundancy |
|---|---|---|---|---|
| Customer-managed (unplanned) failover | | | | |
| GRS | LRS | GRS | LRS | GRS |
| GZRS | LRS | GRS | ZRS | GZRS |
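The stages in the table above can be expressed as a small lookup, sketched here in Python. The stage names and the `redundancy_at` helper are hypothetical. The rows are keyed by the account's starting geo-redundant configuration (GRS or GZRS), on the assumption that only geo-redundant accounts have a secondary region to fail over to.

```python
# Stage names are illustrative labels for the table columns.
STAGES = (
    "original",
    "after_failover",
    "after_reenabling_geo",
    "after_failback",
    "after_reenabling_geo_again",
)

# Redundancy at each stage of a customer-managed (unplanned) failover.
UNPLANNED_FAILOVER = {
    "GRS": ("GRS", "LRS", "GRS", "LRS", "GRS"),
    "GZRS": ("GZRS", "LRS", "GRS", "ZRS", "GZRS"),
}


def redundancy_at(original, stage):
    """Return the redundancy configuration for a starting SKU at a given stage."""
    if original not in UNPLANNED_FAILOVER:
        raise ValueError(f"{original} has no secondary region to fail over to")
    return UNPLANNED_FAILOVER[original][STAGES.index(stage)]


print(redundancy_at("GZRS", "after_failback"))
```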
Is there data loss expected after a failover?
Users might experience data loss. Users can utilize the Last Sync Time (LST) property to determine the last time a full synchronization completed between their primary and secondary region. Any data or metadata written before the LST successfully replicates to the secondary region and will be available after the unplanned failover. However, any data or metadata written after the LST might be lost.
How long will it take to convert my account from LRS to GRS after an unplanned failover?
There's currently no service level agreement (SLA) for completion of a geo conversion, and it isn't possible to expedite this process by submitting a support request. The time needed to complete a conversion varies depending on several factors, including:
- The number and size of the objects in the storage account.
- The available resources for background replication, such as CPU, memory, disk, and WAN capacity.
You can read more about the factors affecting SKU conversion times in the Initiate a storage account failover article. You can also learn more about changing a storage account's replication options in the Change the redundancy option for a storage account article.
Why is my account's "Location" property different than my "Primary region" after a failover?
Microsoft provides two REST APIs for working with Azure Storage resources. These APIs form the basis of all actions you can perform against Azure Storage. The Azure Storage REST API, often referred to as the data plane, enables you to work with data in your storage account, including blob, queue, file, and table data. The Azure Storage resource provider REST API, often referred to as the control plane, enables you to manage the storage account and related resources.
After a failover is complete, clients can once again read and write Azure Storage data in the new primary region. However, the Azure Storage resource provider doesn't fail over, so resource management operations must still take place in the original primary region. For the same reason, the Location property continues to return the original primary location after the failover is complete.
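The difference between the two properties can be pictured with a toy model in Python. This is not the Azure SDK: the `AccountView` class, its field names, and the region names are hypothetical, chosen only to show that a failover changes where data is served from while the control-plane location stays fixed.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class AccountView:
    location: str                    # reported by the resource provider (control plane)
    primary_region: str              # where the data endpoints are currently served
    secondary_region: Optional[str]  # None once the account is LRS


def fail_over(account):
    """Model an unplanned failover: the secondary becomes the primary.

    The account becomes LRS, so there is no secondary afterward.
    The location is deliberately left unchanged.
    """
    if account.secondary_region is None:
        raise ValueError("account has no secondary region")
    return replace(
        account,
        primary_region=account.secondary_region,
        secondary_region=None,
    )


# Hypothetical region names for illustration only.
before = AccountView(location="eastus", primary_region="eastus", secondary_region="westus")
after = fail_over(before)
print(after.location, after.primary_region)
```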
What is a failback?
A failback is the process of using a failover operation to restore the storage account to its original primary region. After a failover, the original secondary region of the GRS account becomes the new primary region. A user must initiate another failover operation in order to restore the account to its original primary region.
Essentially, a failback is a failover that is initiated after the original failover operation is performed on the account.
After an unplanned failover, the account becomes LRS. Failback requires a few steps:
- Convert the account from LRS to GRS. It's important to remember that the conversion from LRS to GRS doesn't have an SLA. There are also data bandwidth charges that apply when completing this conversion.
- Initiate an unplanned failover or failback.
Learn more about how to initiate an unplanned failover.
What are the conflicting features or scenarios for failovers?
Failovers carry with them a few limitations and conflicting features that users should be aware of. The following features or scenarios block a failover operation from being initiated:
- Object Replication: Attempting to initiate an unplanned failover on an account with object replication (OR) generates an error. In this case, you can delete your account's OR policies and attempt the failover again.
Azure File Sync doesn't support customer-managed unplanned failover. Storage accounts used as cloud endpoints for Azure File Sync shouldn't be failed over. Failover disrupts file sync and might cause unexpected data loss of newly tiered files. For more information, see Best practices for disaster recovery with Azure File Sync.
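The conflicts described above could be checked before a failover is attempted. The following Python sketch is a hypothetical pre-check: the function name and its boolean inputs are assumptions, and determining whether an account actually has OR policies or serves as an Azure File Sync cloud endpoint is left to the caller.

```python
def failover_conflicts(has_object_replication_policies, is_file_sync_cloud_endpoint):
    """Return human-readable reasons not to start an unplanned failover."""
    conflicts = []
    if has_object_replication_policies:
        # The failover request generates an error while OR policies exist.
        conflicts.append(
            "Object replication policies are present; delete them and retry."
        )
    if is_file_sync_cloud_endpoint:
        # Not supported: failover disrupts sync and can lose newly tiered files.
        conflicts.append(
            "The account is an Azure File Sync cloud endpoint; failover is not supported."
        )
    return conflicts


for reason in failover_conflicts(True, False):
    print(reason)
```

An empty list means neither of these two conflicts applies; it doesn't rule out other limitations not covered here.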
Next steps
As part of planning for your storage account resiliency, you can review the following articles for more information: