Azure Managed Grafana service reliability

This article describes availability zone support, disaster recovery, and availability for Azure Managed Grafana workspaces in the Standard plan. The Essential plan (preview) doesn't offer the same reliability options, and we don't recommend it for production.

An Azure Managed Grafana workspace in the Standard tier is hosted on a dedicated set of virtual machines. By default, two virtual machines are deployed to provide redundancy. Each virtual machine runs a Grafana server. A network load balancer distributes browser requests among the Grafana servers. On the backend, the Grafana servers are connected to a common database that stores the configuration and other persistent data for an entire Azure Managed Grafana workspace.

Diagram of the Azure Managed Grafana Standard tier workspace setup.

The load balancer keeps track of which Grafana servers are available. In a dual-server setup, if the load balancer detects that one server is down, it sends all requests to the remaining server. Because session information is saved in the shared database, the remaining server should be able to pick up the browser sessions previously served by the failed one. In the meantime, the Azure Managed Grafana service works to repair the unhealthy server or bring up a new one.
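The failover behavior described above can be illustrated with a small simulation. This is a minimal sketch, not the service's actual implementation: the class names, the round-robin policy, and the in-memory dictionary standing in for the shared database are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class GrafanaServer:
    """Hypothetical stand-in for one virtual machine running a Grafana server."""
    name: str
    healthy: bool = True


@dataclass
class LoadBalancer:
    """Toy load balancer that only routes to servers currently marked healthy.

    Round-robin is an assumption; the article doesn't state the real policy.
    """
    servers: list
    _counter: count = field(default_factory=count, repr=False)

    def route(self) -> GrafanaServer:
        healthy = [s for s in self.servers if s.healthy]
        if not healthy:
            raise RuntimeError("no healthy Grafana server available")
        return healthy[next(self._counter) % len(healthy)]


def handle_request(lb: LoadBalancer, shared_db: dict, session_id: str) -> str:
    """Serve one request and return the name of the server that handled it.

    Session state lives in shared_db (a stand-in for the common database),
    so any healthy server can continue a session started on another one.
    """
    server = lb.route()
    session = shared_db.setdefault(session_id, {"requests": 0})
    session["requests"] += 1
    return server.name


if __name__ == "__main__":
    vm1, vm2 = GrafanaServer("vm-1"), GrafanaServer("vm-2")
    lb = LoadBalancer([vm1, vm2])
    db = {}

    print(handle_request(lb, db, "alice"))  # served by one of the two servers
    vm1.healthy = False                     # simulate one server going down
    print(handle_request(lb, db, "alice"))  # now always the remaining server
    print(db["alice"])                      # session state survived the failover
```

In this sketch the session survives the failure only because its state is kept outside the individual servers, which mirrors the role the shared database plays in the workspace setup.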

Microsoft doesn't provide or set up disaster recovery for this service. If there's a region-level outage, the service experiences downtime. You can set up workspaces in other regions for disaster recovery purposes.
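If you do maintain workspaces in more than one region, something on your side has to decide which workspace to use during an outage. The following is a minimal sketch of that selection logic, assuming you keep an ordered list of workspace endpoints and supply your own health probe. The function name, the example URLs, and the probe are hypothetical and aren't part of Azure Managed Grafana.

```python
from typing import Callable, Optional, Sequence


def pick_workspace(
    endpoints: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint, in priority order, that the probe reports healthy.

    endpoints:  workspace URLs ordered from primary region to fallback regions.
    is_healthy: caller-supplied probe (hypothetical); how it checks health
                is up to you and isn't defined by this sketch.
    Returns None when no workspace is reachable.
    """
    for url in endpoints:
        if is_healthy(url):
            return url
    return None


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    workspaces = [
        "https://primary.example.invalid",
        "https://secondary.example.invalid",
    ]
    # Simulate a region-level outage affecting the primary workspace.
    unavailable = {"https://primary.example.invalid"}

    chosen = pick_workspace(workspaces, lambda url: url not in unavailable)
    print(chosen)
```

This sketch covers only endpoint selection. Keeping dashboards and data source configuration in sync between the workspaces is a separate task that it doesn't address.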

Next step