Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Applies to:
Azure SQL Database
This article teaches you how to configure a failover group for single and pooled databases in Azure SQL Database by using the Azure portal, Azure PowerShell, and the Azure CLI.
For end-to-end scripts, review how to add a single database to a failover group with Azure PowerShell or the Azure CLI.
Consider the following prerequisites to create your failover group for a single database:
- Your primary database should already be created. Create single database to get started.
- If your secondary server already exists in a different region to the primary server, the server login and firewall settings must match that of your primary server.
You can create your failover group and add a single database to it by using the Azure portal, PowerShell, and the Azure CLI.
Important
If you need to delete a secondary database after it's been added to a failover group, remove it from the failover group before deleting the database. Deleting a secondary database before it is removed from the failover group can cause unpredictable behavior.
Create your failover group and add databases to it with the az sql failover-group create Azure CLI command:
echo "Creating $failoverGroup between $server and $secondaryServer..."
az sql failover-group create --name $failoverGroup --partner-server $secondaryServer --resource-group $resourceGroup --server $server --failover-policy Automatic --grace-period 2 --add-db $database
Test failover of your failover group with no data loss using the Azure portal or PowerShell.
Test failover by using the Azure CLI.
Use the az sql failover-group show command to confirm the roles of each server.
echo "Confirming the role of each server in the failover group..." # note ReplicationRole property
az sql failover-group show --name $failoverGroup --resource-group $resourceGroup --server $server
Use the az sql failover-group set-primary to fail over to the secondary server. Use the az sql failover-group show command to verify a successful failover.
echo "Failing over to $secondaryServer..."
az sql failover-group set-primary --name $failoverGroup --resource-group $resourceGroup --server $secondaryServer
echo "Confirming role of $secondaryServer is now primary..." # note ReplicationRole property
az sql failover-group show --name $failoverGroup --resource-group $resourceGroup --server $server
Use the az sql failover-group set-primary command to fail back to the primary server.
echo "Failing back to $server...."
az sql failover-group set-primary --name $failoverGroup --resource-group $resourceGroup --server $server
For end-to-end scripts, review how to add an elastic pool to a failover group with Azure PowerShell or the Azure CLI.
Consider the following prerequisites for creating your failover group for a pooled database:
- Your primary elastic pool should already exist. Create elastic pool to get started.
- If your secondary server already exists, the server login and firewall settings must match that of your primary server.
Create the failover group for your elastic pool by using the Azure portal, PowerShell, or the Azure CLI.
Important
If you need to delete a secondary database after its been added to a failover group, remove it from the failover group before deleting the database. Deleting a secondary database before it is removed from the failover group can cause unpredictable behavior.
In this step, create your failover group and add your database to it by using the Azure CLI.
Use the az sql failover-group create command to create a failover group.
echo "Creating $failoverGroup between $server and $secondaryServer..."
az sql failover-group create --name $failoverGroup --partner-server $secondaryServer --resource-group $resourceGroup --server $server --failover-policy Automatic --grace-period 2
databaseId=$(az sql elastic-pool list-dbs --name $pool --resource-group $resourceGroup --server $server --query [0].name -o json | tr -d '"')
Use the az sql failover-group update command to add a database to the failover group.
echo "Adding $database to $failoverGroup..."
az sql failover-group update --name $failoverGroup --add-db $databaseId --resource-group $resourceGroup --server $server
Test failover of your elastic pool with no data loss by using the Azure portal, PowerShell, or the Azure CLI.
Test failover using the Azure CLI.
Use the az sql failover-group show command to confirm the roles of each server in the failover group.
echo "Confirming the role of each server in the failover group..." # note ReplicationRole property
az sql failover-group show --name $failoverGroup --resource-group $resourceGroup --server $server
Use the az sql failover-group set-primary command to fail over to the secondary server. Use the az sql failover-group show command to verify a successful failover.
echo "Failing over to $secondaryServer..."
az sql failover-group set-primary --name $failoverGroup --resource-group $resourceGroup --server $secondaryServer
echo "Confirming role of $secondaryServer is now primary..." # note ReplicationRole property
az sql failover-group show --name $failoverGroup --resource-group $resourceGroup --server $server
Use the az sql failover-group show command to confirm the roles of each server in the failover group.
echo "Confirming the role of each server in the failover group..." # note ReplicationRole property
az sql failover-group show --name $failoverGroup --resource-group $resourceGroup --server $server
Use the az sql failover-group set-primary command to fail back to the primary server.
echo "Failing back to $server...."
az sql failover-group set-primary --name $failoverGroup --resource-group $resourceGroup --server $server
You can add or remove databases from an existing failover group, or edit failover group configuration settings by using the Azure portal, PowerShell, and the Azure CLI.
To modify an existing failover group by using the Azure CLI, use the az sql failover-group update command to add or remove databases, or modify configuration settings.
Using a private link allows you to associate a logical server to a specific private IP address within the virtual network and subnet.
To use a private link with your failover group, do the following:
- Ensure your primary and secondary servers are in a paired region.
- Create the virtual network and subnet in each region to host private endpoints for primary and secondary servers such that they have nonoverlapping IP address spaces. For example, the primary virtual network address range of 10.0.0.0/16 and the secondary virtual network address range of 10.0.0.1/16 overlaps. For more information about virtual network address ranges, see the blog designing Azure virtual networks.
- Create a private endpoint and Azure Private DNS zone for the primary server.
- Create a private endpoint for the secondary server as well, but this time choose to reuse the same Private DNS zone that was created for the primary server.
- Once the private link is established, you can create the failover group following the steps outlined previously in this article.
After your failover group is configured, update the connection string for your application to point to the Read/write listener endpoint so that your application continues to connect to whichever database is primary after failover. By using the listener endpoint, you don't have to manually update your connection string every time your failover group fails over since traffic is always routed to the current primary. You can also point read-only workload to the Read-only listener endpoint.
To locate the listener endpoint in the Azure portal, go to your logical server in the Azure portal and under Data management, select Failover groups. Select the failover group you're interested in.
Scroll down to find the listener endpoints:
- The Read/write listener endpoint, in the form of
fog-name.database.chinacloudapi.cn
, routes traffic to the primary database. - The Read-only listener endpoint, in the form of
fog-name.secondary.database.chinacloudapi.cn
, routes traffic to the secondary database.
You can scale the primary database up or down to a different compute size (within the same service tier) without disconnecting any geo-secondaries. When scaling up, we recommend that you scale up the geo-secondary first, and then scale up the primary. When scaling down, reverse the order: scale down the primary first, and then scale down the secondary. When you scale a database to a different service tier, this recommendation is enforced.
This sequence is recommended specifically to avoid the problem where the geo-secondary at a lower SKU gets overloaded and must be reseeded during an upgrade or downgrade process. You could also avoid the problem by making the primary read-only, at the expense of affecting all read-write workloads against the primary.
Note
If you created a geo-secondary as part of the failover group configuration, it's not recommended to scale down the geo-secondary. This is to ensure your data tier has sufficient capacity to process your regular workload after a geo-failover. You might not be able to scale a geo-secondary after an unplanned failover when the former geo-primary is unavailable due to outage. This is a known limitation.
The primary database in a failover group can't scale to a higher service tier (edition) unless the secondary database is first scaled to the higher tier. For example, if you want to scale up the primary from General Purpose to Business Critical, you have to first scale the geo-secondary to Business Critical. If you try to scale the primary or geo-secondary in a way that violates this rule, you will receive the following error:
The source database 'Primaryserver.DBName' cannot have higher edition than the target database 'Secondaryserver.DBName'. Upgrade the edition on the target before upgrading the source.
Due to the high latency of wide area networks, geo-replication uses an asynchronous replication mechanism. Asynchronous replication makes the possibility of data loss unavoidable if the primary fails. To protect critical transactions from data loss, an application developer can call the sp_wait_for_database_copy_sync stored procedure immediately after committing the transaction. Calling sp_wait_for_database_copy_sync
blocks the calling thread until the last committed transaction has been transmitted and hardened in the transaction log of the secondary database. However, it doesn't wait for the transmitted transactions to be replayed (redone) on the secondary. sp_wait_for_database_copy_sync
is scoped to a specific geo-replication link. Any user with the connection rights to the primary database can call this procedure.
Note
sp_wait_for_database_copy_sync
prevents data loss after geo-failover for specific transactions, but does not guarantee full synchronization for read access. The delay caused by a sp_wait_for_database_copy_sync
procedure call can be significant and depends on the size of the not yet transmitted transaction log on the primary at the time of the call.
To illustrate the change sequence, we'll assume that server A is the primary server, server B is the existing secondary server, and server C is the new secondary in the third region. To make the transition, follow these steps:
- Create additional secondaries of each database on server A to server C using active geo-replication. Each database on server A will have two secondaries, one on server B and one on server C. This guarantees that the primary databases remain protected during the transition.
- Delete the failover group. At this point sign in attempts using failover group endpoints start to fail.
- Re-create the failover group with the same name between servers A and C.
- Add all primary databases on server A to the new failover group. At this point sign in attempts stop failing.
- Delete server B. All databases on B will be deleted automatically.
To illustrate the change sequence, we'll assume server A is the primary server, server B is the existing secondary server, and server C is the new primary in the third region. To make the transition, follow these steps:
- Perform a planned geo-failover to switch the primary server to B. Server A becomes the new secondary server. The failover might result in several minutes of downtime. The actual time depends on the size of failover group.
- Create additional secondaries of each database on server B to server C using active geo-replication. Each database on server B will have two secondaries, one on server A and one on server C. This guarantees that the primary databases remain protected during the transition.
- Delete the failover group. At this point sign in attempts using failover group endpoints start to fail.
- Re-create the failover group with the same name between servers B and C.
- Add all primary databases on B to the new failover group. At this point login attempts stop failing.
- Perform a planned geo-failover of the failover group to switch B and C. Now server C becomes the primary and B the secondary. All secondary databases on server A will be automatically linked to the primaries on C. As in step 1, the failover might result in several minutes of downtime.
- Delete server A. All databases on A will be deleted automatically.
Important
When the failover group is deleted, the DNS records for the listener endpoints are also deleted. At that point, there is a non-zero probability of somebody else creating a failover group or a server DNS alias with the same name. Because failover group names and DNS aliases must be globally unique, this will prevent you from using the same name again. To minimize this risk, don't use generic failover group names.
For some applications, the security rules require that the network access to the data tier is restricted to a specific component or components such as a VM, web service, etc. This requirement presents some challenges for business continuity design and the use of failover groups. Consider the following options when implementing such restricted access.
If you're using Virtual Network service endpoints and rules to restrict access to your database, be aware that each virtual network service endpoint applies to only one Azure region. The endpoint doesn't enable other regions to accept communication from the subnet. Therefore, only the client applications deployed in the same region can connect to the primary database. Since a geo-failover results in the SQL Database client sessions being rerouted to a server in a different (secondary) region, these sessions might fail if originated from a client outside of that region. For that reason, the Azure managed failover policy can't be enabled if the participating servers are included in the Virtual Network rules. To support manual failover policy, follow these steps:
- Provision redundant copies of the frontend components of your application (web service, virtual machines etc.) in the secondary region.
- Configure virtual network rules individually for the primary and secondary server.
- Enable frontend failover using a Traffic manager configuration.
- Initiate a manual geo-failover when the outage is detected. This option is optimized for applications that require consistent latency between the frontend and the data tier and supports recovery when either frontend, data tier, or both are affected by the outage.
Note
If you're using the read-only listener to load-balance a read-only workload, make sure this workload is executed in a VM or other resource in the secondary region so it can connect to the secondary database.
If your business continuity plan requires failover using failover groups, you can restrict access to your SQL Database by using public IP firewall rules. This configuration ensures that a geo-failover won't block connections from frontend components and assumes that the application can tolerate the longer latency between the frontend and the data tier.
To support failover group failover, follow these steps:
- Create a public IP.
- Create a public load balancer and assign the public IP to it.
- Create a virtual network and the virtual machines for your front-end components.
- Create network security group and configure inbound connections.
- Ensure that the outbound connections are open to Azure SQL Database in a region by using an
Sql.<Region>
service tag. - Create a SQL Database firewall rule to allow inbound traffic from the public IP address you create in step 1.
For more information on how to configure outbound access and what IP to use in the firewall rules, see Load balancer outbound connections.
Important
To guarantee business continuity during regional outages you must ensure geographic redundancy for both frontend components and databases.
Permissions for a failover group are managed via Azure role-based access control (Azure RBAC).
Azure RBAC write access is necessary to create and manage failover groups. The SQL Server Contributor role has all the necessary permissions to manage failover groups.
The following table lists specific permission scopes for Azure SQL Database:
Action | Permission | Scope |
---|---|---|
Create failover group | Azure RBAC write access | Primary server Secondary server All databases in failover group |
Update failover group | Azure RBAC write access | Failover group All databases on the current primary server |
Fail over failover group | Azure RBAC write access | Failover group on new server |
Be aware of the following limitations:
- Failover groups can't be created between two servers in the same Azure region.
- Failover groups support geo-replication of all databases in the group to only one secondary logical server in a different region.
- Failover groups can't be renamed. You'll need to delete the group and re-create it with a different name.
- Database rename isn't supported for databases in a failover group. You'll need to temporarily delete the failover group to be able to rename a database, or remove the database, from the failover group.
- Removing a failover group for a single or pooled database doesn't stop replication, and it doesn't delete the replicated database. You'll need to manually stop geo-replication and delete the database from the secondary server if you wanted to add a single or pooled database back to a failover group after it's been removed. Failing to do either might result in an error similar to
The operation cannot be performed due to multiple errors
when attempting to add the database to the failover group. - Failover group name is subject to naming restrictions.
- When creating a new failover group, or when adding databases to an existing failover group, you can only designate the databases as standby replicas when using the Azure portal - Azure PowerShell and the Azure CLI are not currently available.
Failover groups can also be managed programmatically by using Azure PowerShell, Azure CLI, and REST API. The following tables describe the set of commands available. Failover groups include a set of Azure Resource Manager APIs for management, including the Azure SQL Database REST API and Azure PowerShell cmdlets. These APIs require the use of resource groups and support Azure role-based access control (Azure RBAC). For more information on how to implement access roles, see Azure role-based access control (Azure RBAC).
Cmdlet | Description |
---|---|
New-AzSqlDatabaseFailoverGroup | This command creates a failover group and registers it on both primary and secondary servers |
Add-AzSqlDatabaseToFailoverGroup | Adds one or more databases to a failover group |
Remove-AzSqlDatabaseFromFailoverGroup | Removes one or more databases from a failover group |
Remove-AzSqlDatabaseFailoverGroup | Removes a failover group from the server |
Get-AzSqlDatabaseFailoverGroup | Retrieves a failover group's configuration |
Set-AzSqlDatabaseFailoverGroup | Modifies configuration of a failover group |
Switch-AzSqlDatabaseFailoverGroup | Triggers failover of a failover group to the secondary server |
Note
It's possible to deploy your failover group across subscriptions by using the -PartnerSubscriptionId
parameter in Azure Powershell starting with Az.SQL 3.11.0. To learn more, review the following Example.