Configure a failover group for Azure SQL Database

Applies to: Azure SQL Database

This article teaches you how to configure a failover group for single and pooled databases in Azure SQL Database by using the Azure portal, Azure PowerShell, and the Azure CLI.

For end-to-end scripts, review how to add a single database to a failover group with Azure PowerShell or the Azure CLI.

Prerequisites

Consider the following prerequisites to create your failover group for a single database:

  • Your primary database should already be created. Create single database to get started.
  • If your secondary server already exists in a different region to the primary server, the server login and firewall settings must match that of your primary server.

Create failover group

You can create your failover group and add a single database to it by using the Azure portal, PowerShell, and the Azure CLI.

Important

If you need to delete a secondary database after it's been added to a failover group, remove it from the failover group before deleting the database. Deleting a secondary database before it is removed from the failover group can cause unpredictable behavior.

Create your failover group with the New-AzSqlDatabaseFailoverGroup PowerShell command:

# Create a failover group between the servers
$failovergroup = Write-host "Creating a failover group between the primary and secondary server..."
New-AzSqlDatabaseFailoverGroup -ResourceGroupName $resourceGroupName -ServerName $serverName -PartnerServerName $drServerName -FailoverGroupName $failoverGroupName -FailoverPolicy Manual
$failovergroup

Use the Get-AzSqlDatabase and Add-AzSqlDatabaseToFailoverGroup commands to add the database to the failover group:

# Add the database to the failover group
Write-host "Adding the database to the failover group..." 
Get-AzSqlDatabase -ResourceGroupName $resourceGroupName -ServerName $serverName -DatabaseName $databaseName | Add-AzSqlDatabaseToFailoverGroup -ResourceGroupName $resourceGroupName -ServerName $serverName -FailoverGroupName $failoverGroupName
Write-host "Successfully added the database to the failover group..."

Note

It's possible to deploy your failover group across subscriptions by using the -PartnerSubscriptionId parameter in Azure Powershell starting with Az.SQL 3.11.0. To learn more, review the following Example.

Test planned failover

Test failover of your failover group with no data loss using the Azure portal or PowerShell.

Test failover of your failover group by using PowerShell.

Verify the roles of each server

Use the Get-AzSqlDatabaseFailoverGroup command to confirm the roles of each server.

# Check role of secondary replica
Write-host "Confirming the secondary replica is secondary...." 
(Get-AzSqlDatabaseFailoverGroup -FailoverGroupName $failoverGroupName -ResourceGroupName $resourceGroupName -ServerName $drServerName).ReplicationRole

Fail over to the secondary server

Use the Switch-AzSqlDatabaseFailoverGroup to fail over to the secondary server.

# Failover to secondary server
Write-host "Failing over failover group to the secondary..." 
Switch-AzSqlDatabaseFailoverGroup -ResourceGroupName $resourceGroupName -ServerName $drServerName -FailoverGroupName $failoverGroupName
Write-host "Failed failover group successfully to" $drServerName

Write-host "Confirming the secondary server is now primary...." 
(Get-AzSqlDatabaseFailoverGroup -FailoverGroupName $failoverGroupName -ResourceGroupName $resourceGroupName -ServerName $drServerName).ReplicationRole

Revert failover group back to the primary server

Use the Switch-AzSqlDatabaseFailoverGroup command to fail back to the primary server.

# Revert failover to primary server
Write-host "Failing over failover group to the primary...." 
Switch-AzSqlDatabaseFailoverGroup -ResourceGroupName $resourceGroupName -ServerName $serverName -FailoverGroupName $failoverGroupName
Write-host "Failed failover group successfully back to" $serverName

For end-to-end scripts, review how to add an elastic pool to a failover group with Azure PowerShell or the Azure CLI.

Prerequisites

Consider the following prerequisites for creating your failover group for a pooled database:

  • Your primary elastic pool should already exist. Create elastic pool to get started.
  • If your secondary server already exists, the server login and firewall settings must match that of your primary server.

Create failover group

Create the failover group for your elastic pool by using the Azure portal, PowerShell, or the Azure CLI.

Important

If you need to delete a secondary database after its been added to a failover group, remove it from the failover group before deleting the database. Deleting a secondary database before it is removed from the failover group can cause unpredictable behavior.

Create your failover group and add your elastic pool to it by using PowerShell.

Use the New-AzSqlDatabaseFailoverGroup to create the failover group:

# Create a failover group between the servers
Write-host "Creating failover group..." 
New-AzSqlDatabaseFailoverGroup -ResourceGroupName $resourceGroupName -ServerName $serverName -PartnerServerName $drServerName -FailoverGroupName $failoverGroupName -FailoverPolicy Manual
Write-host "Failover group created successfully."

Use the Get-AzSqlDatabaseFailoverGroup and Get-AzSqlElasticPoolDatabase commands to add the database to the failover group:

# Add elastic pool to the failover group
Write-host "Enumerating databases in elastic pool...." 
$FailoverGroup = Get-AzSqlDatabaseFailoverGroup -ResourceGroupName $resourceGroupName -ServerName $serverName -FailoverGroupName $failoverGroupName
$databases = Get-AzSqlElasticPoolDatabase -ResourceGroupName $resourceGroupName -ServerName $serverName -ElasticPoolName $poolName
Write-host "Adding databases to failover group..." 
$failoverGroup = $failoverGroup | Add-AzSqlDatabaseToFailoverGroup -Database $databases 
$failoverGroup

Test planned failover

Test failover of your elastic pool with no data loss by using the Azure portal, PowerShell, or the Azure CLI.

Test failover of your failover group using PowerShell by using the following PowerShell commands:

Verify the roles of each server

Use the Get-AzSqlDatabaseFailoverGroup command to confirm the roles of each server.

# Check role of secondary replica
Write-host "Confirming the secondary server is secondary...." 
(Get-AzSqlDatabaseFailoverGroup -FailoverGroupName $failoverGroupName -ResourceGroupName $resourceGroupName -ServerName $drServerName).ReplicationRole

Fail over to the secondary server

Use the Switch-AzSqlDatabaseFailoverGroup to fail over to the secondary server.

# Failover to secondary server
Write-host "Failing over failover group to the secondary..." 
Switch-AzSqlDatabaseFailoverGroup -ResourceGroupName $resourceGroupName -ServerName $drServerName -FailoverGroupName $failoverGroupName
Write-host "Failover group failed over to" $drServerName

# Check role of secondary replica
Write-host "Confirming the secondary server is now primary" 
(Get-AzSqlDatabaseFailoverGroup -FailoverGroupName $failoverGroupName -ResourceGroupName $resourceGroupName -ServerName $drServerName).ReplicationRole

Verify the roles of each server

Use the Get-AzSqlDatabaseFailoverGroup command to confirm the roles of each server.

# Check role of secondary replica
Write-host "Confirming the secondary server is secondary...." 
(Get-AzSqlDatabaseFailoverGroup -FailoverGroupName $failoverGroupName -ResourceGroupName $resourceGroupName -ServerName $drServerName).ReplicationRole

Revert failover group back to the primary server

Use the Switch-AzSqlDatabaseFailoverGroup command to fail back to the primary server.

# Revert failover to primary server
Write-host "Failing over failover group to the primary...." 
Switch-AzSqlDatabaseFailoverGroup -ResourceGroupName $resourceGroupName -ServerName $serverName -FailoverGroupName $failoverGroupName
Write-host "Failover group failed over to" $serverName

Modify existing failover group

You can add or remove databases from an existing failover group, or edit failover group configuration settings by using the Azure portal, PowerShell, and the Azure CLI.

To modify an existing failover group by using PowerShell:

Using a private link allows you to associate a logical server to a specific private IP address within the virtual network and subnet.

To use a private link with your failover group, do the following:

  1. Ensure your primary and secondary servers are in a paired region.
  2. Create the virtual network and subnet in each region to host private endpoints for primary and secondary servers such that they have nonoverlapping IP address spaces. For example, the primary virtual network address range of 10.0.0.0/16 and the secondary virtual network address range of 10.0.0.1/16 overlaps. For more information about virtual network address ranges, see the blog designing Azure virtual networks.
  3. Create a private endpoint and Azure Private DNS zone for the primary server.
  4. Create a private endpoint for the secondary server as well, but this time choose to reuse the same Private DNS zone that was created for the primary server.
  5. Once the private link is established, you can create the failover group following the steps outlined previously in this article.

Locate listener endpoint

After your failover group is configured, update the connection string for your application to point to the Read/write listener endpoint so that your application continues to connect to whichever database is primary after failover. By using the listener endpoint, you don't have to manually update your connection string every time your failover group fails over since traffic is always routed to the current primary. You can also point read-only workload to the Read-only listener endpoint.

To locate the listener endpoint in the Azure portal, go to your logical server in the Azure portal and under Data management, select Failover groups. Select the failover group you're interested in.

Scroll down to find the listener endpoints:

  • The Read/write listener endpoint, in the form of fog-name.database.chinacloudapi.cn, routes traffic to the primary database.
  • The Read-only listener endpoint, in the form of fog-name.secondary.database.chinacloudapi.cn, routes traffic to the secondary database.

Screenshot showing the failover group connection string on the Failover groups page in the Azure portal.

Scaling databases in a failover group

You can scale the primary database up or down to a different compute size (within the same service tier) without disconnecting any geo-secondaries. When scaling up, we recommend that you scale up the geo-secondary first, and then scale up the primary. When scaling down, reverse the order: scale down the primary first, and then scale down the secondary. When you scale a database to a different service tier, this recommendation is enforced.

This sequence is recommended specifically to avoid the problem where the geo-secondary at a lower SKU gets overloaded and must be reseeded during an upgrade or downgrade process. You could also avoid the problem by making the primary read-only, at the expense of affecting all read-write workloads against the primary.

Note

If you created a geo-secondary as part of the failover group configuration, it's not recommended to scale down the geo-secondary. This is to ensure your data tier has sufficient capacity to process your regular workload after a geo-failover. You might not be able to scale a geo-secondary after an unplanned failover when the former geo-primary is unavailable due to outage. This is a known limitation.

The primary database in a failover group can't scale to a higher service tier (edition) unless the secondary database is first scaled to the higher tier. For example, if you want to scale up the primary from General Purpose to Business Critical, you have to first scale the geo-secondary to Business Critical. If you try to scale the primary or geo-secondary in a way that violates this rule, you will receive the following error:

The source database 'Primaryserver.DBName' cannot have higher edition than the target database 'Secondaryserver.DBName'. Upgrade the edition on the target before upgrading the source.

Prevent loss of critical data

Due to the high latency of wide area networks, geo-replication uses an asynchronous replication mechanism. Asynchronous replication makes the possibility of data loss unavoidable if the primary fails. To protect critical transactions from data loss, an application developer can call the sp_wait_for_database_copy_sync stored procedure immediately after committing the transaction. Calling sp_wait_for_database_copy_sync blocks the calling thread until the last committed transaction has been transmitted and hardened in the transaction log of the secondary database. However, it doesn't wait for the transmitted transactions to be replayed (redone) on the secondary. sp_wait_for_database_copy_sync is scoped to a specific geo-replication link. Any user with the connection rights to the primary database can call this procedure.

Note

sp_wait_for_database_copy_sync prevents data loss after geo-failover for specific transactions, but does not guarantee full synchronization for read access. The delay caused by a sp_wait_for_database_copy_sync procedure call can be significant and depends on the size of the not yet transmitted transaction log on the primary at the time of the call.

Change the secondary region

To illustrate the change sequence, we'll assume that server A is the primary server, server B is the existing secondary server, and server C is the new secondary in the third region. To make the transition, follow these steps:

  1. Create additional secondaries of each database on server A to server C using active geo-replication. Each database on server A will have two secondaries, one on server B and one on server C. This guarantees that the primary databases remain protected during the transition.
  2. Delete the failover group. At this point sign in attempts using failover group endpoints start to fail.
  3. Re-create the failover group with the same name between servers A and C.
  4. Add all primary databases on server A to the new failover group. At this point sign in attempts stop failing.
  5. Delete server B. All databases on B will be deleted automatically.

Change the primary region

To illustrate the change sequence, we'll assume server A is the primary server, server B is the existing secondary server, and server C is the new primary in the third region. To make the transition, follow these steps:

  1. Perform a planned geo-failover to switch the primary server to B. Server A becomes the new secondary server. The failover might result in several minutes of downtime. The actual time depends on the size of failover group.
  2. Create additional secondaries of each database on server B to server C using active geo-replication. Each database on server B will have two secondaries, one on server A and one on server C. This guarantees that the primary databases remain protected during the transition.
  3. Delete the failover group. At this point sign in attempts using failover group endpoints start to fail.
  4. Re-create the failover group with the same name between servers B and C.
  5. Add all primary databases on B to the new failover group. At this point login attempts stop failing.
  6. Perform a planned geo-failover of the failover group to switch B and C. Now server C becomes the primary and B the secondary. All secondary databases on server A will be automatically linked to the primaries on C. As in step 1, the failover might result in several minutes of downtime.
  7. Delete server A. All databases on A will be deleted automatically.

Important

When the failover group is deleted, the DNS records for the listener endpoints are also deleted. At that point, there is a non-zero probability of somebody else creating a failover group or a server DNS alias with the same name. Because failover group names and DNS aliases must be globally unique, this will prevent you from using the same name again. To minimize this risk, don't use generic failover group names.

Failover groups and network security

For some applications, the security rules require that the network access to the data tier is restricted to a specific component or components such as a VM, web service, etc. This requirement presents some challenges for business continuity design and the use of failover groups. Consider the following options when implementing such restricted access.

Use failover groups and virtual network service endpoints

If you're using Virtual Network service endpoints and rules to restrict access to your database, be aware that each virtual network service endpoint applies to only one Azure region. The endpoint doesn't enable other regions to accept communication from the subnet. Therefore, only the client applications deployed in the same region can connect to the primary database. Since a geo-failover results in the SQL Database client sessions being rerouted to a server in a different (secondary) region, these sessions might fail if originated from a client outside of that region. For that reason, the Azure managed failover policy can't be enabled if the participating servers are included in the Virtual Network rules. To support manual failover policy, follow these steps:

  1. Provision redundant copies of the frontend components of your application (web service, virtual machines etc.) in the secondary region.
  2. Configure virtual network rules individually for the primary and secondary server.
  3. Enable frontend failover using a Traffic manager configuration.
  4. Initiate a manual geo-failover when the outage is detected. This option is optimized for applications that require consistent latency between the frontend and the data tier and supports recovery when either frontend, data tier, or both are affected by the outage.

Note

If you're using the read-only listener to load-balance a read-only workload, make sure this workload is executed in a VM or other resource in the secondary region so it can connect to the secondary database.

Use failover groups and firewall rules

If your business continuity plan requires failover using failover groups, you can restrict access to your SQL Database by using public IP firewall rules. This configuration ensures that a geo-failover won't block connections from frontend components and assumes that the application can tolerate the longer latency between the frontend and the data tier.

To support failover group failover, follow these steps:

  1. Create a public IP.
  2. Create a public load balancer and assign the public IP to it.
  3. Create a virtual network and the virtual machines for your front-end components.
  4. Create network security group and configure inbound connections.
  5. Ensure that the outbound connections are open to Azure SQL Database in a region by using an Sql.<Region> service tag.
  6. Create a SQL Database firewall rule to allow inbound traffic from the public IP address you create in step 1.

For more information on how to configure outbound access and what IP to use in the firewall rules, see Load balancer outbound connections.

Important

To guarantee business continuity during regional outages you must ensure geographic redundancy for both frontend components and databases.

Permissions

Permissions for a failover group are managed via Azure role-based access control (Azure RBAC).

Azure RBAC write access is necessary to create and manage failover groups. The SQL Server Contributor role has all the necessary permissions to manage failover groups.

The following table lists specific permission scopes for Azure SQL Database:

Action Permission Scope
Create failover group Azure RBAC write access Primary server
Secondary server
All databases in failover group
Update failover group Azure RBAC write access Failover group
All databases on the current primary server
Fail over failover group Azure RBAC write access Failover group on new server

Limitations

Be aware of the following limitations:

  • Failover groups can't be created between two servers in the same Azure region.
  • Failover groups support geo-replication of all databases in the group to only one secondary logical server in a different region.
  • Failover groups can't be renamed. You'll need to delete the group and re-create it with a different name.
  • Database rename isn't supported for databases in a failover group. You'll need to temporarily delete the failover group to be able to rename a database, or remove the database, from the failover group.
  • Removing a failover group for a single or pooled database doesn't stop replication, and it doesn't delete the replicated database. You'll need to manually stop geo-replication and delete the database from the secondary server if you wanted to add a single or pooled database back to a failover group after it's been removed. Failing to do either might result in an error similar to The operation cannot be performed due to multiple errors when attempting to add the database to the failover group.
  • Failover group name is subject to naming restrictions.
  • When creating a new failover group, or when adding databases to an existing failover group, you can only designate the databases as standby replicas when using the Azure portal - Azure PowerShell and the Azure CLI are not currently available.

Programmatically manage failover groups

Failover groups can also be managed programmatically by using Azure PowerShell, Azure CLI, and REST API. The following tables describe the set of commands available. Failover groups include a set of Azure Resource Manager APIs for management, including the Azure SQL Database REST API and Azure PowerShell cmdlets. These APIs require the use of resource groups and support Azure role-based access control (Azure RBAC). For more information on how to implement access roles, see Azure role-based access control (Azure RBAC).

Cmdlet Description
New-AzSqlDatabaseFailoverGroup This command creates a failover group and registers it on both primary and secondary servers
Add-AzSqlDatabaseToFailoverGroup Adds one or more databases to a failover group
Remove-AzSqlDatabaseFromFailoverGroup Removes one or more databases from a failover group
Remove-AzSqlDatabaseFailoverGroup Removes a failover group from the server
Get-AzSqlDatabaseFailoverGroup Retrieves a failover group's configuration
Set-AzSqlDatabaseFailoverGroup Modifies configuration of a failover group
Switch-AzSqlDatabaseFailoverGroup Triggers failover of a failover group to the secondary server

Note

It's possible to deploy your failover group across subscriptions by using the -PartnerSubscriptionId parameter in Azure Powershell starting with Az.SQL 3.11.0. To learn more, review the following Example.