Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Applies to:
Azure SQL Managed Instance
This article teaches you how to configure a failover group for Azure SQL Managed Instance by using the Azure portal and Azure PowerShell.
For an end-to-end PowerShell script to create both instances within a failover group, review Add instance to a failover group with PowerShell.
To configure a failover group, you should already have proper permissions and a SQL managed instance you intend to use as the primary. Review Create instance to get started.
Be sure to review the limitations before creating your secondary instance and failover group.
To configure a failover group between a primary and secondary SQL Managed Instance, consider the following requirements:
- The secondary managed instance must be empty, without any user databases.
- The two instances need to be the same service tier, and have the same storage size. While not required, it's strongly recommended that both instances have equal compute sizes to ensure the secondary instance can sustainably process changes replicated from the primary instance, including during periods of peak activity.
- The IP address range for the virtual network of the primary instance must not overlap with the address range of the virtual network for the secondary managed instance, or any other virtual network peered with either the primary or secondary virtual network.
- Both instances must be in the same DNS zone. When you create your secondary managed instance, you must specify the primary instance's DNS zone ID. If you don't, the zone ID is generated as a random string when the first instance is created in each virtual network and the same ID is assigned to all other instances in the same subnet. Once assigned, the DNS zone can't be modified.
- Network Security Groups (NSG) rules for the subnets of both instances must have open inbound and outbound TCP connections for port 5022 and port range 11000-11999 to facilitate communication between the two instances.
- Managed instances should be deployed to paired regions for performance reasons. Managed instances that reside in geo-paired regions benefit from a significantly higher geo-replication speed compared to unpaired regions.
- Both instances must use the same update policy.
When you create the secondary instance, you must use a virtual network that has an IP address space that doesn't overlap with the IP address space range of the primary instance. Additionally, when you configure the new secondary instance, you must specify the zone ID of the primary instance.
You can configure the secondary virtual network, and create the secondary instance by using the Azure portal and PowerShell.
You can create both your virtual network, and your secondary instance by using PowerShell. For a full PowerShell script sample that does both, review Add instance to failover group with PowerShell.
When you create your secondary instance with Azure PowerShell, be sure to specify the DnsZonePartner
value with the zone ID of the primary instance.
You can use the Get-AzSqlInstance command to parameterize the DNS zone ID of the primary instance, and then pass it to the DnsZonePartner
parameter of the New-AzSqlInstance command:
# Get the DNS zone ID of the primary instance
$primaryManagedInstanceId = Get-AzSqlInstance -Name $primaryInstance -ResourceGroupName $resourceGroupName | Select-Object Id
For uninterrupted geo-replication traffic flow, you must establish connectivity between the virtual network subnets that host the primary and secondary instances. There are multiple ways to connect managed instances in different Azure regions, including:
- Global virtual network peering
- Azure ExpressRoute
- VPN gateways
Global virtual network peering is recommended as the most performant and robust way to establish connectivity between instances in a failover group. Global virtual network peering provides a low-latency, high-bandwidth private connection between peered virtual networks using the Microsoft backbone infrastructure. No public internet, gateways, or additional encryption is required in the communication between the peered virtual networks.
Important
Alternative ways of connecting instances that involve additional networking devices can complicate troubleshooting connectivity or replication speed issues, possibly requiring active involvement of network administrators, and potentially significantly prolonging resolution time.
If you use a mechanism to establish connectivity between the instances other than the recommended global virtual network peering, ensure the following:
- Networking device, like firewalls or network virtual appliances (NVAs), don't block traffic on inbound and outbound connections for port 5022 (TCP) and port range 11000-11999.
- Routing is properly configured, and asymmetric routing is avoided.
This article guides you to configure global virtual network peering between the networks of the two instances by using the Azure portal and PowerShell.
Configure global virtual network peering between the virtual networks of the primary and secondary managed instances:
# Create global virtual network peering
$primaryVirtualNetwork = Get-AzVirtualNetwork -Name $primaryVNet -ResourceGroupName $resourceGroupName
$secondaryVirtualNetwork = Get-AzVirtualNetwork -Name $secondaryVNet -ResourceGroupName $resourceGroupName
Write-host "Peering primary VNet to secondary VNet..."
Add-AzVirtualNetworkPeering -Name primaryVnet-secondaryVNet1 -VirtualNetwork $primaryVirtualNetwork -RemoteVirtualNetworkId $secondaryVirtualNetwork.Id
Write-host "Primary VNet peered to secondary VNet successfully."
Write-host "Peering secondary VNet to primary VNet..."
Add-AzVirtualNetworkPeering -Name secondaryVNet-primaryVNet -VirtualNetwork $secondaryVirtualNetwork -RemoteVirtualNetworkId $primaryVirtualNetwork.Id
Write-host "Secondary VNet peered to primary VNet successfully."
Write-host "Checking peering state on the primary virtual network..."
Get-AzVirtualNetworkPeering -ResourceGroupName $resourceGroupName -VirtualNetworkName $primaryVNet | Select PeeringState
Write-host "Checking peering state on the secondary virtual network..."
Get-AzVirtualNetworkPeering -ResourceGroupName $resourceGroupName -VirtualNetworkName $secondaryVNet | Select PeeringState
This code snippet uses the following commands:
Command | Notes |
---|---|
1. Get-AzVirtualNetwork | Gets a virtual network in a resource group. |
2. Add-AzVirtualNetworkPeering | Adds a peering to a virtual network. |
3. Get-AzVirtualNetworkPeering | Gets a peering for a virtual network. |
Regardless of the chosen connectivity mechanism between the two instances, your networks must meet the following requirements for the flow of geo-replication traffic:
- The route table and network security groups assigned to the managed instance subnets aren't shared across the two peered virtual networks.
- The Network Security Group (NSG) rules on both subnets that host each instance allows both inbound and outbound traffic to the other instance on port 5022, and the port range 11000-11999.
You can configure your port communication and NSG rules by using the Azure portal and PowerShell.
You can configure the NSG rules for both virtual networks by using the Add-AzNetworkSecurityRuleConfig PowerShell command to modify the existing network security group. For a full PowerShell script sample that configures both networks, review Add instance to failover group with PowerShell.
Create the failover group for your managed instances by using the Azure portal or PowerShell.
Use New-AzSqlDatabaseInstanceFailoverGroup to create the failover group for your managed instances by using PowerShell.
# Create failover group
Write-host "Creating the failover group..."
$failoverGroup = New-AzSqlDatabaseInstanceFailoverGroup -Name $failoverGroupName -Location $location -ResourceGroupName $resourceGroupName -PrimaryManagedInstanceName $primaryInstance -PartnerRegion $drLocation -PartnerManagedInstanceName $secondaryInstance -FailoverPolicy Manual -GracePeriodWithDataLossHours 1
$failoverGroup
Test failover of your failover group by using the Azure portal or PowerShell.
Note
If the instances are in different subscriptions, or resource groups, initiate the failover from the secondary instance.
Test planned failover by failing over to the secondary replica with PowerShell, and then failing back.
Use the Get-AzSqlDatabaseInstanceFailoverGroup command to confirm the roles of each instance:
# Verify the current primary role
Get-AzSqlDatabaseInstanceFailoverGroup -ResourceGroupName $resourceGroupName -Location $location -Name $failoverGroupName
Use the Switch-AzSqlDatabaseFailoverGroup to fail over to the secondary instance.
# Failover the primary managed instance to the secondary role
Write-host "Failing primary over to the secondary location"
Get-AzSqlDatabaseInstanceFailoverGroup -ResourceGroupName $resourceGroupName -Location $drLocation -Name $failoverGroupName | Switch-AzSqlDatabaseInstanceFailoverGroup
Write-host "Successfully failed failover group to secondary location"
# Verify the current primary role
Get-AzSqlDatabaseInstanceFailoverGroup -ResourceGroupName $resourceGroupName -Location $drLocation -Name $failoverGroupName
Use the Get-AzSqlDatabaseInstanceFailoverGroup command to confirm the roles of each instance:
# Verify the current primary role
Get-AzSqlDatabaseInstanceFailoverGroup -ResourceGroupName $resourceGroupName -Location $location -Name $failoverGroupName
Optionally, use the Switch-AzSqlDatabaseFailoverGroup command to fail back to the primary instance.
# Fail primary managed instance back to primary role
Write-host "Failing primary back to primary role"
Get-AzSqlDatabaseInstanceFailoverGroup -ResourceGroupName $resourceGroupName -Location $location -Name $failoverGroupName | Switch-AzSqlDatabaseInstanceFailoverGroup
Write-host "Successfully failed failover group to primary location"
# Verify the current primary role
Get-AzSqlDatabaseInstanceFailoverGroup -ResourceGroupName $resourceGroupName -Location $location -Name $failoverGroupName
You can modify an existing failover group, such as to change the failover policy, by using the Azure portal, PowerShell, Azure CLI and the REST APIs.
To modify an existing failover group by using the Azure portal, follow these steps:
Go to your SQL managed instance in the Azure portal.
Under Data management, select Failover groups to open the Failover groups pane.
On the Failover groups pane, select Edit configurations from the command bar to open the Edit failover group pane:
After your failover group is configured, update the connection string for your application to point to the Read/write listener endpoint so that your application continues to connect to whichever instance is primary after failover. By using the listener endpoint, you don't have to manually update your connection string every time your failover group fails over since traffic is always routed to the current primary. You can also point read-only workload to the Read-only listener endpoint.
Important
While connecting to an instance in a failover group using the instance-specific connection string is supported, it's strongly discouraged. Use the listener endpoints instead.
To locate the listener endpoint in the Azure portal, go to your SQL managed instance and under Data management, select Failover groups.
Scroll down to find the listener endpoints:
- The Read/write listener endpoint, in the form of
fog-name.dns-zone.database.chinacloudapi.cn
, routes traffic to the primary instance. - The Read-only listener endpoint, in the form of
fog-name.secondary.dns-zone.database.chinacloudapi.cn
, routes traffic to the secondary instance.
You can create a failover group between SQL Managed Instances in two different subscriptions, as long as subscriptions are associated to the same Microsoft Entra tenant.
- When using PowerShell API, you can do it by specifying the
PartnerSubscriptionId
parameter for the secondary SQL Managed Instance. - When using REST API, each instance ID included in the
properties.managedInstancePairs
parameter can have its own Subscription ID. - Azure portal doesn't support creating failover groups across different subscriptions.
Important
Azure portal does not support creation of failover groups across different subscriptions. For failover groups across different subscriptions and/or resource groups, failover can't be initiated manually via the Azure portal from the primary SQL managed instance. Initiate it from the geo-secondary instance instead.
Due to the high latency of wide area networks, geo-replication uses an asynchronous replication mechanism. Asynchronous replication makes the possibility of data loss unavoidable if the primary fails. To protect critical transactions from data loss, an application developer can call the sp_wait_for_database_copy_sync stored procedure immediately after committing the transaction. Calling sp_wait_for_database_copy_sync
blocks the calling thread until the last committed transaction has been transmitted and hardened in the transaction log of the secondary database. However, it doesn't wait for the transmitted transactions to be replayed (redone) on the secondary. sp_wait_for_database_copy_sync
is scoped to a specific geo-replication link. Any user with the connection rights to the primary database can call this procedure.
Note
sp_wait_for_database_copy_sync
prevents data loss after geo-failover for specific transactions, but does not guarantee full synchronization for read access. The delay caused by a sp_wait_for_database_copy_sync
procedure call can be significant and depends on the size of the not yet transmitted transaction log on the primary at the time of the call.
Let's assume that instance A is the primary instance, instance B is the existing secondary instance, and instance C is the new secondary instance in the third region. To make the transition, follow these steps:
- Create instance C with same size as A and in the same DNS zone.
- Delete the failover group between instances A and B. At this point, attempts to sign in start to fail because the SQL aliases for the failover group listeners have been deleted and the gateway won't recognize the failover group name. The secondary databases are disconnected from the primaries and become read-write databases.
- Create a failover group with the same name between instance A and C. Follow the instructions in configure failover group guide. This is a size-of-data operation and completes when all databases from instance A are seeded and synchronized.
- Delete instance B if not needed to avoid unnecessary charges.
Note
After step 2 and until step 3 is completed the databases in instance A will remain unprotected from a catastrophic failure of instance A.
Let's assume instance A is the primary instance, instance B is the existing secondary instance, and instance C is the new primary instance in the third region. To make the transition, follow these steps:
- Create instance C with same size as B and in the same DNS zone.
- Initiate a manual failover from instance B to make it the new primary. Instance A becomes the new secondary instance automatically.
- Delete the failover group between instances A and B. At this point, sign in attempts using failover group endpoints start to fail. The secondary databases on A are disconnected from the primaries and become read-write databases.
- Create a failover group with the same name between instance B and C. This is a size-of-data operation and completes when all databases from instance B are seeded and synchronized with instance C. At this point, sign in attempts stop failing.
- Manually fail over to switch the C instance to the primary role. Instance B becomes the new secondary instance automatically.
- Delete instance A if not needed to avoid unnecessary charges.
Caution
After step 3 and until step 4 is completed, the databases in instance A will remain unprotected from a catastrophic failure of instance A.
Important
When the failover group is deleted, the DNS records for the listener endpoints are also deleted. At that point, there's a non-zero probability of somebody else creating a failover group with the same name. Because failover group names must be globally unique, this will prevent you from using the same name again. To minimize this risk, don't use generic failover group names.
System databases are not replicated to the secondary instance in a failover group. To enable scenarios that depend on objects from the system databases, make sure to create the same objects on the secondary instance and keep them synchronized with the primary instance.
For example, if you plan to use the same logins on the secondary instance, make sure to create them with the identical SID.
-- Code to create login on the secondary instance
CREATE LOGIN foo WITH PASSWORD = '<enterStrongPasswordHere>', SID = <login_sid>;
To learn more, see Replication of logins and agent jobs.
Instances in a failover group remain separate Azure resources, and no changes made to the configuration of the primary instance will be automatically replicated to the secondary instance. Make sure to perform all relevant changes both on primary and secondary instance. For example, if you change backup storage redundancy or long-term backup retention policy on primary instance, make sure to change it on secondary instance as well.
You can scale the primary and secondary instance up or down to a different compute size within the same service tier or to a different service tier. When scaling up within the same service tier, first scale up the geo-secondary first, and then scale up the primary. When scaling down within the same service tier, reverse the order: scale down the primary first, and then scale down the secondary. Follow the same sequence when you scale an instance to a different service tier.
This sequence is recommended to avoid problems from the geo-secondary, at a lower SKU, getting overloaded and having to reseed during an upgrade or downgrade process.
Permissions for a failover group are managed via Azure role-based access control (Azure RBAC).
The SQL Managed Instance Contributor role, scoped to the resource groups of the primary and the secondary managed instance, is sufficient to perform all management operations on failover groups.
The following table provides granular view of minimal required permissions and their respective minimal required scope levels for management operations on failover groups:
Management operation | Permission | Scope |
---|---|---|
Create/Update failover group | Microsoft.Sql/locations/instanceFailoverGroups/write |
Resource groups of primary and secondary managed instance |
Create/Update failover group | Microsoft.Sql/managedInstances/write |
Primary and secondary managed instance |
Failover failover group | Microsoft.Sql/locations/instanceFailoverGroups/failover/action |
Resource groups of primary and secondary managed instance |
Force failover failover group | Microsoft.Sql/locations/instanceFailoverGroups/forceFailoverAllowDataLoss/action |
Resource groups of primary and secondary managed instance |
Delete failover group | Microsoft.Sql/locations/instanceFailoverGroups/delete |
Resource groups of primary and secondary managed instance |
When creating a new failover group, consider the following limitations:
- Failover groups can't be created between two instances in the same Azure region.
- An instance can participate only in one failover group at any moment.
- A failover group can't be created between two instances that belong to different Azure tenants.
- Creating a failover group between two instances in different resource groups or subscriptions is only supported with Azure PowerShell, or the REST API, and not the Azure portal or the Azure CLI. Once the failover group is created, it's visible in the Azure portal, and all operations are supported in the Azure portal or with the Azure CLI. Failover must be initiated from the secondary instance.
- If initial seeding of all databases doesn't complete within 7 days, creating the failover group fails and all successfully replicated databases are deleted from the secondary instance.
- Creating a failover group with an instance configured with a managed instance link is currently unsupported.
- Databases migrated to Azure SQL Managed Instance by using the Log Replay Service (LRS) can't be added to a failover group until the cutover step is executed.
When using failover groups, consider the following limitations:
- Failover groups can't be renamed. You'll need to delete the group and re-create it with a different name.
- A failover group contains exactly two managed instances. Adding additional instances to the failover group is unsupported.
- Database rename isn't supported for databases in failover group. You'll need to temporarily delete the failover group to be able to rename a database.
- System databases aren't replicated to the secondary instance in a failover group. Therefore, scenarios that depend on objects from the system databases such as Server Logins and Agent jobs, require objects to be manually created on the secondary instances and also manually kept in sync after any changes made on primary instance. The only exception is Service master Key (SMK) for SQL Managed Instance that is replicated automatically to secondary instance during creation of failover group. Any subsequent changes of SMK on the primary instance however won't be replicated to secondary instance. To learn more, see how to Enable scenarios dependent on objects from the system databases.
Failover groups can also be managed programmatically using Azure PowerShell, Azure CLI, and REST API. The following tables describe the set of commands available. Failover groups include a set of Azure Resource Manager APIs for management, including the Azure SQL Database REST API and Azure PowerShell cmdlets. These APIs require the use of resource groups and support Azure role-based access control (Azure RBAC). For more information on how to implement access roles, see Azure role-based access control (Azure RBAC).
Cmdlet | Description |
---|---|
New-AzSqlDatabaseInstanceFailoverGroup | This command creates a failover group and registers it on both primary and secondary instances |
Set-AzSqlDatabaseInstanceFailoverGroup | Modifies configuration of a failover group |
Get-AzSqlDatabaseInstanceFailoverGroup | Retrieves a failover group's configuration |
Switch-AzSqlDatabaseInstanceFailoverGroup | Triggers failover of a failover group to the secondary instance |
Remove-AzSqlDatabaseInstanceFailoverGroup | Removes a failover group |