Reliability guides by service

This article provides links to reliability guidance for many Azure services. Most reliability guides contain the following information:

  • Reliability architecture overview is a synopsis of how the service supports reliability, including information about which components are managed by Azure and which are managed by you, any built-in redundancy features, and how to provision and manage multiple resources, if applicable.
  • Transient fault handling details how the service handles the day-to-day transient faults that can occur in the cloud and how to handle these faults in your application, including retry policies, timeouts, and other best practices.
  • Availability zone support covers zonal and zone-redundant deployment options, traffic routing and data replication between zones, what happens if a zone experiences an outage, failback, and how to configure your resources for availability zone support.
  • Multi-region support covers how to configure multi-region or geo-disaster recovery support, traffic routing and data replication between regions, the region-down experience, failover and failback support, and alternative multi-region approaches.
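
As an illustration of the retry policies mentioned under transient fault handling, the following is a minimal sketch of a retry loop with capped exponential backoff and jitter. It is not taken from any Azure SDK: `TransientError`, `retry_with_backoff`, and all parameter defaults are hypothetical names and values chosen for the example, and each service's reliability guide remains the authority on which faults are retryable and what limits to use.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a fault that is worth retrying."""


def retry_with_backoff(operation, max_attempts=4, base_delay=0.5,
                       max_delay=8.0, sleep=time.sleep):
    """Call operation(), retrying on TransientError.

    The wait before each retry is a random value between 0 and a cap
    that doubles with every attempt, up to max_delay seconds.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the fault to the caller
            # Exponential cap: base_delay, 2x, 4x, ... limited by max_delay
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter spreads out retries from many clients
            sleep(random.uniform(0, cap))
```

A caller would wrap a single remote call, for example `retry_with_backoff(lambda: client.get(item_id))`, where `client` stands in for whatever SDK object the application uses and the wrapped call is assumed to raise `TransientError` for retryable faults. Many Azure SDKs ship their own retry policies, so a hand-written loop like this is mainly useful where no built-in policy exists.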

Some guides also contain information on:

  • Backup support covers who controls backups, where they are stored and replicated, how they can be recovered, and whether they are accessible only within a region or across regions.
  • Service level agreements for availability, including how the expected uptime changes based on the configuration you use.
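
To make the link between an availability percentage and expected uptime concrete, here is a small arithmetic sketch. The function names and the percentages used are illustrative assumptions, not published SLA figures for any Azure service, and the composite calculation assumes the components fail independently and that the workload needs all of them at once.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Maximum downtime, in minutes, that still meets sla_percent over the period."""
    return (1 - sla_percent / 100) * days * 24 * 60


def composite_availability(*sla_percents):
    """Combined availability (percent) of components that must all be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    combined = 1.0
    for percent in sla_percents:
        combined *= percent / 100
    return combined * 100
```

Under these assumptions, a 99.9% target over a 30-day month allows roughly 43.2 minutes of downtime, and chaining a 99.95% component with a 99.99% component gives a combined figure of about 99.94%, lower than either component alone.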

Reliability guides by service

This section provides links to reliability guidance for many Azure services. Each service guide contains information on how the service supports reliability features.

Note

Some services have not yet been updated to the single reliability guide format, or are in the process of being updated. For these services, reliability guidance may be spread across more than one document.

| Product | Reliability Guide | Other Reliability Documentation |
|---|---|---|
| Azure AI Health Insights | Reliability in Azure AI Health Insights | |
| Azure AI Search | Reliability in Azure AI Search | |
| Azure App Configuration | | How does App Configuration ensure high data availability?<br>Resiliency and disaster recovery |
| Azure Application Gateway (V2) | | Autoscaling and High Availability |
| Azure App Service | Reliability in Azure App Service | |
| Azure Backup | Reliability in Azure Backup | |
| Azure Batch | Reliability in Azure Batch | |
| Azure Cache for Redis | | Enable zone redundancy for Azure Cache for Redis<br>Configure passive geo-replication for Premium Azure Cache for Redis instances |
| Azure Container Apps | Reliability in Azure Container Apps | |
| Azure Container Registry | | Enable zone redundancy in Azure Container Registry for resiliency and high availability<br>Geo-replication in Azure Container Registry |
| Azure Cosmos DB for NoSQL | Reliability in Azure Cosmos DB for NoSQL | |
| Azure Databox | | How can I recover my data if an entire region fails? |
| Azure Data Explorer | | Business continuity and disaster recovery overview |
| Azure Database for MySQL | | Overview of business continuity with Azure Database for MySQL - Single Server |
| Azure Database for MySQL - Flexible Server | | Azure Database for MySQL Flexible Server High availability<br>Azure Database for MySQL Flexible Server - Restore to latest restore point |
| Azure Database for PostgreSQL - Flexible Server | Reliability in Azure Database for PostgreSQL - Flexible Server | |
| Azure DevOps | | Data availability |
| Azure Disk Encryption | | Redundancy options for managed disks |
| Azure Disks | | Best practices for achieving high availability with Azure virtual machines and managed disks |
| Azure DNS | Reliability in Azure DNS | |
| Azure Event Hubs | Reliability in Azure Event Hubs | |
| Azure ExpressRoute | | Designing for high availability with ExpressRoute<br>Designing for disaster recovery with ExpressRoute private peering |
| Azure Firewall | | Deploy an Azure Firewall with Availability Zones using Azure PowerShell |
| Azure Files | | Choose the right redundancy option<br>Disaster recovery and failover for Azure Files |
| Azure Functions | Reliability in Azure Functions | |
| Azure IoT Hub | | IoT Hub high availability and disaster recovery |
| Azure Key Vault | | Azure Key Vault availability and redundancy |
| Azure Kubernetes Service (AKS) | Reliability in Azure Kubernetes Service (AKS) | |
| Azure Load Balancer | Reliability in Azure Load Balancer | |
| Azure Logic Apps | Reliability in Azure Logic Apps | |
| Azure Machine Learning Service | | Failover for business continuity and disaster recovery |
| Azure Media Services | | High Availability with Media Services and Video on Demand (VOD) |
| Azure Migrate | | Does Azure Migrate offer Backup and Disaster Recovery? |
| Azure Monitor Logs | | Enhance data and service resilience in Azure Monitor Logs with availability zones |
| Azure Network Watcher | | Azure Network Watcher service availability and redundancy |
| Azure Private Link | | Azure Private Link availability |
| Azure Public IP | | Azure Public IP Availability Zone |
| Azure Route Server | | Azure Route Server frequently asked questions (FAQ) |
| Azure Service Bus | | Best practices for insulating applications against Service Bus outages and disasters |
| Azure Service Fabric | | Deploy an Azure Service Fabric cluster across Availability Zones<br>Disaster recovery in Azure Service Fabric |
| Azure SignalR Service | | Resiliency and disaster recovery in Azure SignalR Service |
| Azure Site Recovery | | Set up disaster recovery for Azure VMs |
| Azure SQL Database | | Azure SQL Database - High availability<br>Disaster recovery guidance - Azure SQL Database |
| Azure SQL Managed Instance | | Failover groups overview & best practices - Azure SQL Managed Instance |
| Azure Storage - Blob Storage | | Choose the right redundancy option<br>Azure storage disaster recovery planning and failover |
| Azure Stream Analytics | | Achieve geo-redundancy for Azure Stream Analytics jobs |
| Azure Traffic Manager | Reliability in Azure Traffic Manager | |
| Azure Virtual Machines | Reliability in Azure Virtual Machines | |
| Azure Virtual Machine Image Builder | Reliability in Azure Virtual Machine Image Builder | |
| Azure Virtual Machine Scale Sets | Reliability in Azure Virtual Machine Scale Sets | |
| Azure Virtual Network | | Virtual networks and availability zones<br>Virtual Network - Business Continuity |
| Azure Virtual WAN | | How are Availability Zones and resiliency handled in Virtual WAN?<br>Disaster recovery design |
| Azure VPN Gateway | | About zone-redundant virtual network gateway in Azure availability zones<br>Highly Available cross-premises and VNet-to-VNet connectivity |
| Azure Web Application Firewall | | Deploy an Azure Firewall with Availability Zones using Azure PowerShell<br>How do I achieve a disaster recovery scenario across datacenters by using Application Gateway? |
| Microsoft Purview | Reliability in Microsoft Purview | |