Migrate from self-hosted Prometheus to Azure Monitor managed service for Prometheus

This article provides guidance for organizations that are planning to migrate from self-managed Prometheus to Azure Monitor managed service for Prometheus. Prometheus is a widely adopted open-source monitoring solution known for its powerful capabilities in collecting, storing, and querying time-series data. You might start with self-managed Prometheus setups, but as your systems scale, the operational overhead of managing Prometheus environments can become significant. Azure Monitor managed service for Prometheus delivers the core benefits of Prometheus along with scalability and reduced maintenance efforts.

Benefits of Azure Monitor managed service for Prometheus

You can use Azure Monitor managed service for Prometheus to get Prometheus functionality while benefiting from Azure's cloud-native, enterprise-grade capabilities. Key advantages include scalability, reduced operational overhead, and integration with Azure Monitor alerts and Azure Managed Grafana.

Key concepts

Metric collection

Use Azure Managed Prometheus in either of the following configurations:

  • Fully managed service or drop-in replacement for self-managed Prometheus: In this case, a managed add-on within your AKS or Azure Arc-enabled Kubernetes cluster collects data. You can use custom resources (pod and service monitors) and the add-on ConfigMaps to configure data collection. The format of the pod and service monitors and of the ConfigMaps is the same as in open-source Prometheus, so you can use your existing configurations directly with Azure Managed Prometheus.
  • Remote-write target: Use Prometheus remote write to send metrics from your existing Prometheus server, running in an Azure or non-Azure environment, to an Azure Monitor workspace. This option lets you gradually migrate from self-hosted Prometheus to the fully managed add-on. An example remote-write configuration follows this list.
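
For the remote-write option, the configuration lives in the prometheus.yml file of your existing Prometheus server. The following sketch shows the general shape of a remote_write entry that targets an Azure Monitor workspace by using a managed identity. The ingestion endpoint URL and client ID are placeholders that you replace with values from your own workspace and identity, and native azuread authentication support requires a recent Prometheus version (see Supported versions).

```yaml
# Sketch of a remote_write entry in prometheus.yml. All values are placeholders.
remote_write:
  - url: "<metrics-ingestion-endpoint-of-your-Azure-Monitor-workspace>"   # copy from your workspace configuration
    azuread:                                   # Microsoft Entra ID authentication (recent Prometheus versions)
      cloud: AzurePublic
      managed_identity:
        client_id: "<client-id-of-your-managed-identity>"
```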

Enabling the Azure Managed Prometheus add-on in an AKS cluster deploys the pod monitor and service monitor custom resource definitions (CRDs) so that you can create your own custom resources. Use pod monitors and service monitors to customize scraping targets, similar to the open-source Prometheus Operator.
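
For example, a pod monitor for Azure Managed Prometheus looks like a standard Prometheus Operator pod monitor, except that it uses the azmonitoring.coreos.com/v1 API group. The application name, label, and port name in this sketch are hypothetical; match them to the workload that you want to scrape.

```yaml
apiVersion: azmonitoring.coreos.com/v1   # monitoring.coreos.com/v1 in the open-source Prometheus Operator
kind: PodMonitor
metadata:
  name: example-app-podmonitor           # hypothetical name
spec:
  selector:
    matchLabels:
      app: example-app                   # hypothetical label that selects the pods to scrape
  podMetricsEndpoints:
    - port: metrics                      # name of the container port that exposes /metrics
      interval: 30s
```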

Note

Currently, the PrometheusRule CRD isn't supported with Azure Managed Prometheus.

Storage

Prometheus metrics are stored in an Azure Monitor workspace, which is a unique environment for data that Azure Monitor collects. Each workspace has its own data repository, configuration, and permissions. Data is stored for 18 months.

Note

Log Analytics workspaces contain logs and metrics data from multiple Azure resources. Azure Monitor workspaces currently contain only metrics related to Prometheus.

Alerting

Azure Managed Prometheus rule groups provide a managed and scalable way to create and update recording rules and alerts. The rule groups follow the Prometheus rules configuration, so you can convert your existing recording rules and alerts to an Azure Managed Prometheus rule group. Prometheus alerts are integrated with other alerts in Azure Monitor.

Visualization

Azure Managed Grafana is a data visualization platform built on top of the Grafana software by Grafana Labs. Azure operates and supports this fully managed service. Whether you're using Azure Managed Grafana or self-hosted Grafana, you can query metrics from an Azure Monitor workspace. When you enable Managed Prometheus for your AKS or Azure Arc-enabled Kubernetes cluster, out-of-the-box dashboards are provided that are the same as the ones used by the open-source Prometheus Operator.

Limitations and differences from open-source Prometheus

The following limitations apply to Azure Monitor managed service for Prometheus:

1. Evaluate your current setup

Before you begin the migration, review the following details of your current self-hosted Prometheus stack.

Capacity requirements

An Azure Monitor workspace is highly scalable and can support a large volume of metrics ingestion. You can increase the default limits as your scale requires.

  • For the managed add-on, the data volume of metrics depends on the size of the AKS cluster and how many workloads you plan to run. You can enable Azure Managed Prometheus on a few clusters to estimate the metrics volume.
  • If you plan to use remote write before you fully migrate to the managed add-on agent, you can determine the metrics ingestion volume based on historical usage. You can also inspect the metric prometheus_remote_storage_samples_in_total to evaluate the metrics volume being sent through remote write.
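
For example, running a query such as sum(rate(prometheus_remote_storage_samples_in_total[1h])) on your existing Prometheus server gives an approximate samples-per-second rate flowing through remote write, which you can use as a baseline when comparing against the Azure Monitor workspace ingestion limits.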

Installed Prometheus version

Note the Prometheus version that you're currently running. A supported version is required if you're using remote write to send data to an Azure Monitor workspace. See Supported versions.

Cost

Pricing is based on metrics ingestion and query volume. For more information on metrics pricing, see Azure Monitor pricing. You can use the Azure pricing calculator to estimate the cost.

More details

Review the following configurations for your self-hosted Prometheus setup. This assessment helps to identify any customizations that require attention during migration.

  • Alerting and recording rules configuration
  • Active data sources and exporters
  • Dashboards

2. Configure Azure Managed Prometheus

There are two methods to configure Azure Managed Prometheus.

To enable the managed Prometheus add-on for your AKS cluster and provision your Azure Monitor workspace, see Enable monitoring for Kubernetes clusters.

3. Configure metrics collection and exporters

There are two methods to configure metrics collection.

  1. Review the default data and metrics collected by the managed add-on at Default Prometheus metrics configuration in Azure Monitor. The predefined targets that you can enable or disable are the same as the targets that are available with the open-source Prometheus Operator. The only difference is that, by default, the add-on collects only the metrics queried by the automatically provisioned dashboards. These default metrics are referred to as the minimal ingestion profile.

  2. To customize the targets that are scraped by using the add-on, configure the data collection by using the add-on ConfigMap or by using custom resources (pod and service monitors).

    • If you're using pod monitors and service monitors to monitor your workloads, migrate them to Azure Managed Prometheus by changing apiVersion to azmonitoring.coreos.com/v1.
    • The Azure Managed Prometheus add-on ConfigMap follows the same format as open-source Prometheus. If you have an existing Prometheus configuration YAML file, convert it into the add-on ConfigMap, as shown in the sketch after this list. See Create and validate custom configuration file for Prometheus metrics in Azure Monitor.
  3. Review the list of commonly used workloads that have curated configurations and instructions to help you set up metrics collection with Azure Managed Prometheus.
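
As a sketch of the ConfigMap conversion mentioned above, the scrape_configs section of an existing prometheus.yml can be carried over into the add-on ConfigMap. The job name and target below are hypothetical, and the ConfigMap name, namespace, and data key shown here should be verified against the linked article on creating and validating the custom configuration file.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ama-metrics-prometheus-config    # name expected by the add-on; confirm in the linked article
  namespace: kube-system
data:
  prometheus-config: |
    scrape_configs:
      - job_name: example-app            # hypothetical job carried over from an existing prometheus.yml
        scrape_interval: 30s
        static_configs:
          - targets: ["example-app.default.svc.cluster.local:8080"]
```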

4. Migrate alerts and dashboards

Alerting rules and recording rules

Azure Managed Prometheus supports Prometheus alerting rules and recording rules with Prometheus rule groups. See Convert your existing rules to a Prometheus rule group Azure Resource Manager template.
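
As a reference point for that conversion, the input is a standard Prometheus rules file. The recording rule and alert in this sketch are hypothetical examples of the kind of rules you might already have; each group of rules becomes a rule group in the Resource Manager template.

```yaml
# Hypothetical open-source Prometheus rules file to convert into an Azure Managed Prometheus rule group.
groups:
  - name: example-node-rules
    rules:
      - record: instance:node_cpu_utilisation:rate5m        # recording rule
        expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))
      - alert: HighCpuUtilisation                           # alerting rule
        expr: instance:node_cpu_utilisation:rate5m > 0.9
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "CPU utilisation has been above 90% for 15 minutes."
```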

With the managed add-on, recommended recording rules are automatically set up when you enable Managed Prometheus for your AKS or Azure Arc-enabled cluster. To review the list of automatically provisioned recording rules, see Default Prometheus metrics configuration in Azure Monitor. Prometheus community recommended alerts are also available and can be created out of the box.

Dashboards

If you're using Grafana, connect Grafana to Azure Monitor Prometheus metrics. You can reuse existing dashboards by importing them into Grafana. If you're using Azure Managed Grafana or Azure Monitor dashboards with Grafana, the default recommended dashboards are automatically provisioned so that you can visualize the metrics. To review the list of automatically provisioned dashboards, see Default Prometheus metrics configuration in Azure Monitor.

5. Test and validate

After your migration is finished, validate that your setup is working as expected:

6. Monitor limits and quotas

Azure Monitor workspaces have default limits and quotas for ingestion. You might experience throttling as you onboard more clusters and reach the ingestion limits. Monitor your ingestion against the workspace limits, and set up alerts so that you can increase the limits before throttling occurs.