Server-side geo disaster recovery in Azure Event Grid

Event Grid now provides automatic geo disaster recovery (GeoDR) of metadata, not only for new domains, topics, and event subscriptions but for all existing ones as well. If an entire Azure region goes down, Event Grid will already have all of your event-related infrastructure metadata synced to a paired region. New events begin to flow again with no intervention on your part.

Disaster recovery is measured with two metrics:

  • Recovery Point Objective (RPO): the minutes or hours of data that may be lost.
  • Recovery Time Objective (RTO): the minutes or hours the service may be down.

Event Grid's automatic failover has different RPOs and RTOs for your metadata (event subscriptions and so on) and your data (events). If you need a different specification from the ones below, you can still implement your own client-side failover using the topic health APIs.
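As a rough illustration of client-side failover, the sketch below probes a list of topic endpoints in priority order and publishes to the first one that reports healthy. It is a minimal sketch under stated assumptions, not a definitive implementation: the function names, the `send` callable, and the example URLs are hypothetical, and the probe assumes each topic exposes a health URL at `<endpoint>/api/health` that answers HTTP 200 when healthy.

```python
import urllib.request
from typing import Any, Callable, Iterable, Optional


def http_health_probe(endpoint: str, timeout: float = 5.0) -> bool:
    """Return True if the topic's health URL answers HTTP 200.

    Assumption: the health URL is <endpoint>/api/health. urlopen raises
    on network errors and non-2xx responses; the caller treats any
    exception as "unhealthy".
    """
    url = endpoint.rstrip("/") + "/api/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status == 200


def pick_healthy_endpoint(
    endpoints: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint whose probe succeeds, or None."""
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A probe that raises (timeout, DNS failure, HTTP error)
            # counts as unhealthy; move on to the next endpoint.
            continue
    return None


def publish_with_failover(
    event: Any,
    endpoints: Iterable[str],
    is_healthy: Callable[[str], bool],
    send: Callable[[str, Any], None],
) -> str:
    """Publish to the first healthy endpoint and return the one used.

    `send` is a hypothetical callable that performs the actual publish
    (for example, an authenticated HTTP POST of the event).
    """
    endpoint = pick_healthy_endpoint(endpoints, is_healthy)
    if endpoint is None:
        raise RuntimeError("no healthy topic endpoint available")
    send(endpoint, event)
    return endpoint
```

In use, the primary topic is listed first and a topic in another region second, with `http_health_probe` passed as `is_healthy`. Keeping the probe and the sender injectable makes the selection logic easy to exercise without network access.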

Recovery point objective (RPO)

  • Metadata RPO: zero minutes. Whenever a resource is created in Event Grid, it's instantly replicated across regions. When a failover occurs, no metadata is lost.
  • Data RPO: if your system is healthy and caught up on existing traffic at the time of a regional failover, the RPO for events is about 5 minutes.
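One way a publisher might account for the data RPO is to keep a short rolling buffer of recently sent events so they can be resent after a failover. The sketch below is only an illustration of that idea: the `RecentEventBuffer` class is hypothetical, and the 10-minute retention is an arbitrary margin chosen to exceed the roughly 5-minute RPO, not a documented value. Resending can produce duplicates, so this assumes subscribers tolerate receiving the same event twice.

```python
import time
from collections import deque
from typing import Any, Callable, Deque, List, Tuple


class RecentEventBuffer:
    """Keep events sent within the last `retention_seconds` for replay."""

    def __init__(
        self,
        retention_seconds: float = 600.0,  # assumed margin over the ~5 minute RPO
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._retention = retention_seconds
        self._clock = clock
        # (timestamp, event) pairs, oldest first.
        self._events: Deque[Tuple[float, Any]] = deque()

    def record(self, event: Any) -> None:
        """Remember an event that was just published."""
        now = self._clock()
        self._events.append((now, event))
        self._evict(now)

    def events_to_replay(self) -> List[Any]:
        """Return the events still inside the retention window, oldest first."""
        self._evict(self._clock())
        return [event for _, event in self._events]

    def _evict(self, now: float) -> None:
        """Drop events older than the retention window."""
        cutoff = now - self._retention
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()
```

A publisher would call `record` after each successful send and, once it detects a failover, resend whatever `events_to_replay` returns.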

Recovery time objective (RTO)

  • Metadata RTO: within 60 minutes, Event Grid will begin to accept create/update/delete calls for topics and subscriptions, though it generally happens much more quickly.
  • Data RTO: as with metadata, within 60 minutes of a regional failover, Event Grid will begin accepting new traffic, though it generally happens much more quickly.


The cost for metadata GeoDR on Event Grid is $0.

Next steps

If you want to implement your own client-side failover logic, see Build your own disaster recovery for custom topics in Event Grid.