Restore an Azure SQL Database or fail over to a secondary

Azure SQL Database offers the following capabilities for recovering from an outage:

To learn about business continuity scenarios and the features that support them, see Business continuity.

Note

If you are using zone-redundant Premium or Business Critical databases or pools, the recovery process is automated and the rest of this material does not apply.

Note

Both the primary and secondary databases must have the same service tier. It is also strongly recommended that the secondary database be created with the same compute size (DTUs or vCores) as the primary. For more information, see Upgrading or downgrading as primary database.

Note

Use one or several failover groups to manage failover of multiple databases. If you add an existing geo-replication relationship to the failover group, make sure the geo-secondary is configured with the same service tier and compute size as the primary. For more information, see Use auto-failover groups to enable transparent and coordinated failover of multiple databases.

Prepare for the event of an outage

To recover successfully to another data region using either failover groups or geo-redundant backups, you need to prepare a server in another data center to become the new primary server should the need arise, and you need well-defined steps that are documented and tested to ensure a smooth recovery. These preparation steps include:

  • Identify the SQL Database server in another region to become the new primary server. For geo-restore, this is generally a server in the paired region (China East, China East 2, China North, China North 2) for the region in which your database is located. This eliminates the additional traffic cost during geo-restore operations.
  • Identify, and optionally define, the server-level IP firewall rules needed for users to access the new primary database.
  • Determine how you are going to redirect users to the new primary server, such as by changing connection strings or by changing DNS entries.
  • Identify, and optionally create, the logins that must be present in the master database on the new primary server, and ensure these logins have appropriate permissions in the master database, if any. For more information, see SQL Database security after disaster recovery.
  • Identify alert rules that need to be updated to map to the new primary database.
  • Document the auditing configuration on the current primary database.
  • Perform a disaster recovery drill. To simulate an outage for geo-restore, you can delete or rename the source database to cause application connectivity failure. To simulate an outage using failover groups, you can disable the web application or virtual machine connected to the database, or fail over the database, to cause application connectivity failures.

When to initiate recovery

The recovery operation impacts the application. It requires changing the SQL connection string or redirecting with DNS, and it could result in permanent data loss. Therefore, it should be done only when the outage is likely to last longer than your application's recovery time objective. When the application is deployed to production, monitor application health regularly and use the following data points to determine whether recovery is warranted:

  1. Permanent connectivity failure from the application tier to the database.
  2. The Azure portal shows an alert about an incident in the region with broad impact.
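The two data points above, combined with the recovery time objective, can be expressed as a simple decision check. The following is a minimal sketch rather than an official procedure: the `OutageSignals` fields, the `recovery_warranted` name, the choice to require both data points, and the idea of supplying your own outage estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class OutageSignals:
    """Hypothetical snapshot of the monitoring data points (names are illustrative)."""
    connectivity_failed_persistently: bool  # app tier cannot reach the database
    portal_reports_regional_incident: bool  # Azure portal alert with broad impact
    estimated_outage: timedelta             # your own estimate, not a value Azure provides


def recovery_warranted(signals: OutageSignals, rto: timedelta) -> bool:
    """Return True only when the outage is confirmed and likely to outlast the RTO."""
    # Assumption: both data points must hold before the outage counts as confirmed.
    outage_confirmed = (
        signals.connectivity_failed_persistently
        and signals.portal_reports_regional_incident
    )
    return outage_confirmed and signals.estimated_outage > rto
```

For example, with a one-hour recovery time objective, a confirmed regional incident estimated at six hours would return `True`, while the same incident against a twelve-hour objective would return `False`.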

Note

If you are using failover groups and chose automatic failover, the recovery process is automated and transparent to the application.

Depending on your application's tolerance to downtime and possible business liability, you can consider the following recovery options.

Use Get Recoverable Database (LastAvailableBackupDate) to get the latest geo-replicated restore point.
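Once you have the LastAvailableBackupDate value, the age of that restore point gives a rough upper bound on how much recent data a geo-restore could lose. The sketch below only does the date arithmetic; it assumes the value is available as an ISO 8601 string with an explicit UTC offset, and the function name and sample timestamps are hypothetical.

```python
from datetime import datetime, timezone


def backup_age_minutes(last_available_backup_date: str, now: datetime) -> float:
    """Minutes between the latest geo-replicated restore point and `now`."""
    # Assumption: the timestamp carries an explicit offset, e.g. "+00:00".
    restore_point = datetime.fromisoformat(last_available_backup_date)
    return (now - restore_point).total_seconds() / 60


age = backup_age_minutes(
    "2024-01-01T10:00:00+00:00",                          # sample value, not real data
    datetime(2024, 1, 1, 10, 45, tzinfo=timezone.utc),
)
print(age)  # 45.0
```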

Wait for service recovery

The Azure teams work diligently to restore service availability as quickly as possible, but depending on the root cause it can take hours or days. If your application can tolerate significant downtime, you can simply wait for the recovery to complete. In this case, no action on your part is required. You can see the current service status on the Azure Service Health Dashboard. After the region recovers, your application's availability is restored.

Fail over to geo-replicated secondary server in the failover group

If your application's downtime can result in business liability, you should be using failover groups. They enable the application to quickly restore availability in a different region in case of an outage. For a tutorial, see Implement a geo-distributed database.

To restore availability of the database(s), you need to initiate the failover to the secondary server using one of the supported methods.

Use one of the following guides to fail over to a geo-replicated secondary database:

Recover using geo-restore

If your application's downtime does not result in business liability, you can use geo-restore as a method to recover your application database(s). It creates a copy of the database from its latest geo-redundant backup.

Configure your database after recovery

If you are using geo-restore to recover from an outage, you must make sure that connectivity to the new databases is properly configured so that normal application function can resume. The following checklist of tasks gets your recovered database production ready.

Update connection strings

Because your recovered database resides in a different server, you need to update your application's connection string to point to that server.

For more information about changing connection strings, see the documentation for the development language of your connection library.
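As an illustration of pointing a connection string at the recovered server, the sketch below swaps only the `Server` value in a semicolon-delimited string and leaves every other setting untouched. The function name, the server names, and the connection string layout are assumptions for the example; it does not handle quoted values that themselves contain semicolons.

```python
def retarget_connection_string(conn_str: str, new_server: str) -> str:
    """Return conn_str with the value of its Server key replaced by new_server."""
    parts = []
    for part in conn_str.split(";"):
        key, sep, _value = part.partition("=")
        # Match the key case-insensitively and keep its original spelling.
        if sep and key.strip().lower() == "server":
            part = f"{key}={new_server}"
        parts.append(part)
    return ";".join(parts)


# Hypothetical server names used purely for illustration.
old = "Server=tcp:primary.example.net,1433;Database=appdb;Encrypt=yes"
print(retarget_connection_string(old, "tcp:recovered.example.net,1433"))
# Server=tcp:recovered.example.net,1433;Database=appdb;Encrypt=yes
```

Keeping the rewrite in one small function makes it easy to exercise during a disaster recovery drill before it is needed in a real outage.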

Configure firewall rules

You need to make sure that the firewall rules configured on the server and on the database match those that were configured on the primary server and primary database. For more information, see How to: Configure Firewall Settings (Azure SQL Database).
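One way to check that the rules match is to export the rules from both servers and compare them as sets. The following sketch assumes you have already collected each server's rules into plain records; the `FirewallRule` type, its fields, and the sample rules are hypothetical and are not part of any Azure API.

```python
from typing import Iterable, List, NamedTuple, Tuple


class FirewallRule(NamedTuple):
    """Hypothetical record for one IP firewall rule."""
    name: str
    start_ip: str
    end_ip: str


def firewall_rule_diff(
    primary: Iterable[FirewallRule], recovered: Iterable[FirewallRule]
) -> Tuple[List[FirewallRule], List[FirewallRule]]:
    """Return (missing, extra) rules on the recovered server relative to the primary."""
    primary_set, recovered_set = set(primary), set(recovered)
    missing = sorted(primary_set - recovered_set)  # on primary, absent on recovered
    extra = sorted(recovered_set - primary_set)    # on recovered, absent on primary
    return missing, extra


primary_rules = [
    FirewallRule("office", "203.0.113.1", "203.0.113.50"),
    FirewallRule("vpn", "198.51.100.7", "198.51.100.7"),
]
recovered_rules = [FirewallRule("office", "203.0.113.1", "203.0.113.50")]

missing, extra = firewall_rule_diff(primary_rules, recovered_rules)
print([rule.name for rule in missing])  # ['vpn']
```

An empty `missing` list and an empty `extra` list indicate that the two rule sets match.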

Configure logins and database users

You need to make sure that all the logins used by your application exist on the server that hosts your recovered database. For more information, see Security Configuration for geo-replication.

Note

You should configure and test your server firewall rules and logins (and their permissions) during a disaster recovery drill. These server-level objects and their configuration may not be available during the outage.

Set up telemetry alerts

You need to make sure your existing alert rule settings are updated to map to the recovered database and the different server.

For more information about database alert rules, see Receive Alert Notifications and Track Service Health.

Enable auditing

If auditing is required to access your database, you need to enable auditing after the database recovery. For more information, see Database auditing.

Next steps