Azure Managed Grafana service reliability

This article provides information on availability zone support, disaster recovery and availability of Azure Managed Grafana for instances in the Standard plan. The Essential plan (preview) doesn't offer the same reliability and isn't recommended for use in production.

An Azure Managed Grafana instance in the Standard tier is hosted on a dedicated set of virtual machines (VMs). By default, two VMs are deployed to provide redundancy. Each VM runs a Grafana server. A network load balancer distributes browser requests amongst the Grafana servers. On the backend, the Grafana servers are connected to a common database that stores the configuration and other persistent data for an entire Managed Grafana instance.

Diagram of the Managed Grafana Standard tier instance setup.

The load balancer always keeps track of which Grafana servers are available. In a dual-server setup, if it detects that one server is down, the load balancer starts sending all requests to the remaining server. That server should be able to pick up the browser sessions previously served by the other one based on information saved in the shared database. In the meantime, the Managed Grafana service will work to repair the unhealthy server or bring up a new one.

Microsoft is not providing or setting up disaster recovery for this service. In case of a region level outage, service will experience downtime and users can set up additional instances in other regions for disaster recovery purposes.

For a complete list of regions where Managed Grafana is available, see Products available by region - Azure Managed Grafana

Next steps