As described in Kubernetes monitoring in Azure Monitor, multiple features of Azure Monitor work together to provide complete monitoring of your Azure Kubernetes Service (AKS) clusters. This article describes how to enable the following features for AKS clusters:
- Prometheus metrics
- Managed Grafana
- Container logging
- Control plane logs
Prerequisites
- You need at least Contributor access to the cluster for onboarding.
- You need Monitoring Reader or Monitoring Contributor permission to view data after monitoring is enabled.
Create workspaces
The following table describes the workspaces that are required to support the Azure Monitor features enabled in this article. If you don't already have a workspace of each type, you can create it as part of the onboarding process. See Design a Log Analytics workspace architecture for guidance on how many workspaces to create and where to place them.
Feature | Workspace | Notes |
---|---|---|
Managed Prometheus | Azure Monitor workspace | If you don't specify an existing Azure Monitor workspace when onboarding, the default workspace for the resource group is used. If a default workspace doesn't already exist in the cluster's region, one is created with a name in the format `DefaultAzureMonitorWorkspace-<mapped_region>` in a resource group named `DefaultRG-<cluster_region>`. Contributor permission is enough to enable the add-on to send data to the Azure Monitor workspace. You need Owner-level permission on the Azure Monitor workspace to link it to Azure Managed Grafana, because the user performing the onboarding step must grant the Azure Managed Grafana system identity the Monitoring Reader role on the Azure Monitor workspace so it can query the metrics. |
Container logging<br>Control plane logs | Log Analytics workspace | You can attach a cluster to a Log Analytics workspace in a different Azure subscription in the same Microsoft Entra tenant, but you must use the Azure CLI or an Azure Resource Manager template; you can't currently perform this configuration with the Azure portal. If you're connecting an existing cluster to a Log Analytics workspace in another subscription, the Microsoft.ContainerService resource provider must be registered in the subscription that contains the Log Analytics workspace. For more information, see Register resource provider. If you don't specify an existing Log Analytics workspace, the default workspace for the resource group is used. If a default workspace doesn't already exist in the cluster's region, one is created with a name in the format `DefaultWorkspace-<GUID>-<Region>`. |
Managed Grafana | Azure Managed Grafana workspace | Link your Grafana workspace to your Azure Monitor workspace to make the Prometheus metrics collected from your cluster available to Grafana dashboards. |
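If you need to create any of these workspaces first, you can do so with the Azure CLI. The following is a minimal sketch; the workspace and resource group names are placeholders, and `az grafana create` assumes the Azure Managed Grafana (`amg`) CLI extension is installed.

```azurecli
# Azure Monitor workspace for Prometheus metrics (names are placeholders).
az monitor account create --name my-monitor-workspace --resource-group my-resource-group

# Log Analytics workspace for container and control plane logs.
az monitor log-analytics workspace create --resource-group my-resource-group --workspace-name my-log-workspace

# Azure Managed Grafana workspace (requires the amg CLI extension).
az grafana create --name my-grafana --resource-group my-resource-group
```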
Enable Prometheus metrics and container logging
When you enable Prometheus and container logging on a cluster, a containerized version of the Azure Monitor agent is installed in the cluster. You can configure these features at the same time on a new or existing cluster, or enable each feature separately.
Prerequisites
- The cluster must use managed identity authentication.
- The following resource providers must be registered in the subscription of the cluster and the Azure Monitor workspace:
- Microsoft.ContainerService
- Microsoft.Insights
- Microsoft.AlertsManagement
- Microsoft.Monitor
- The following resource provider must be registered in the subscription of the Grafana workspace:
- Microsoft.Dashboard
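If any of these providers aren't yet registered, you can register them with the Azure CLI. A minimal sketch, assuming the cluster and Azure Monitor workspace share a subscription; the Grafana subscription ID is a placeholder:

```azurecli
# Register the providers in the subscription of the cluster and Azure Monitor workspace.
az provider register --namespace Microsoft.ContainerService
az provider register --namespace Microsoft.Insights
az provider register --namespace Microsoft.AlertsManagement
az provider register --namespace Microsoft.Monitor

# Register the Grafana provider in the subscription of the Grafana workspace.
az provider register --namespace Microsoft.Dashboard --subscription <grafana-subscription-id>

# Registration is asynchronous; check its state before onboarding.
az provider show --namespace Microsoft.Monitor --query registrationState --output tsv
```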
Azure CLI prerequisites
- Managed identity authentication is the default in CLI version 2.49.0 or higher.
- The aks-preview extension must be uninstalled from AKS clusters by using the command `az extension remove --name aks-preview`.
Prometheus metrics
Use the `--enable-azure-monitor-metrics` option with `az aks create` or `az aks update`, depending on whether you're creating a new cluster or updating an existing cluster, to install the metrics add-on that scrapes Prometheus metrics. This uses the configuration described in Default Prometheus metrics configuration in Azure Monitor. To modify this configuration, see Customize scraping of Prometheus metrics in Azure Monitor managed service for Prometheus.
See the following examples.
### Use default Azure Monitor workspace

```azurecli
az aks create/update --enable-azure-monitor-metrics --name <cluster-name> --resource-group <cluster-resource-group>
```

### Use existing Azure Monitor workspace

```azurecli
az aks create/update --enable-azure-monitor-metrics --name <cluster-name> --resource-group <cluster-resource-group> --azure-monitor-workspace-resource-id <workspace-name-resource-id>
```

### Use an existing Azure Monitor workspace and link with an existing Grafana workspace

```azurecli
az aks create/update --enable-azure-monitor-metrics --name <cluster-name> --resource-group <cluster-resource-group> --azure-monitor-workspace-resource-id <azure-monitor-workspace-name-resource-id> --grafana-resource-id <grafana-workspace-name-resource-id>
```

### Use optional parameters

```azurecli
az aks create/update --enable-azure-monitor-metrics --name <cluster-name> --resource-group <cluster-resource-group> --ksm-metric-labels-allow-list "namespaces=[k8s-label-1,k8s-label-n]" --ksm-metric-annotations-allow-list "pods=[k8s-annotation-1,k8s-annotation-n]"
```

Example

```azurecli
az aks create/update --enable-azure-monitor-metrics --name "my-cluster" --resource-group "my-resource-group" --azure-monitor-workspace-resource-id "/subscriptions/aaaa0a0a-bb1b-cc2c-dd3d-eeeeee4e4e4e/resourceGroups/my-resource-group/providers/microsoft.monitor/accounts/my-workspace"
```
Optional parameters
Each of the commands above allows the following optional parameters.

Parameter | Name and Description |
---|---|
Annotation keys | `--ksm-metric-annotations-allow-list` Comma-separated list of Kubernetes annotation keys used in the resource's kube_resource_annotations metric. For example, kube_pod_annotations is the annotations metric for the pods resource. By default, this metric contains only name and namespace labels. To include more annotations, provide a list of resource names in their plural form and the Kubernetes annotation keys that you want to allow for them. A single `*` can be provided for each resource to allow any annotations, but this has severe performance implications. For example, `pods=[kubernetes.io/team,...],namespaces=[kubernetes.io/team],...`. |
Label keys | `--ksm-metric-labels-allow-list` Comma-separated list of more Kubernetes label keys used in the resource's kube_resource_labels metric. For example, kube_pod_labels is the labels metric for the pods resource. By default, this metric contains only name and namespace labels. To include more labels, provide a list of resource names in their plural form and the Kubernetes label keys that you want to allow for them. A single `*` can be provided for each resource to allow any labels, but this has severe performance implications. For example, `pods=[app],namespaces=[k8s-label-1,k8s-label-n,...],...`. |
Recording rules | `--enable-windows-recording-rules` Lets you enable the recording rule groups required for proper functioning of the Windows dashboards. |
Container logs
Use the `--addon monitoring` option with `az aks create` for a new cluster, or `az aks enable-addons` for an existing cluster, to enable collection of container logs. To modify the log collection settings, see Log configuration file below.
See the following examples.
### Use default Log Analytics workspace

```azurecli
az aks enable-addons --addon monitoring --name <cluster-name> --resource-group <cluster-resource-group-name>
```

### Use existing Log Analytics workspace

```azurecli
az aks enable-addons --addon monitoring --name <cluster-name> --resource-group <cluster-resource-group-name> --workspace-resource-id <workspace-resource-id>
```

### Use custom log configuration file

```azurecli
az aks enable-addons --addon monitoring --name <cluster-name> --resource-group <cluster-resource-group-name> --workspace-resource-id <workspace-resource-id> --data-collection-settings dataCollectionSettings.json
```

### Use legacy authentication

```azurecli
az aks enable-addons --addon monitoring --name <cluster-name> --resource-group <cluster-resource-group-name> --workspace-resource-id <workspace-resource-id> --enable-msi-auth-for-monitoring false
```

Example

```azurecli
az aks enable-addons --addon monitoring --name "my-cluster" --resource-group "my-resource-group" --workspace-resource-id "/subscriptions/aaaa0a0a-bb1b-cc2c-dd3d-eeeeee4e4e4e/resourceGroups/my-resource-group/providers/Microsoft.OperationalInsights/workspaces/my-workspace"
```
Log configuration file
To customize log collection settings for the cluster, you can provide the configuration as a JSON file using the following format. If you don't provide a configuration file, the default settings identified in the table below are used.
```json
{
  "interval": "1m",
  "namespaceFilteringMode": "Include",
  "namespaces": ["kube-system"],
  "enableContainerLogV2": true,
  "streams": ["Microsoft-Perf", "Microsoft-ContainerLogV2"]
}
```
Each of the settings in the configuration is described in the following table.
Name | Description |
---|---|
interval | Determines how often the agent collects data. Valid values are 1m to 30m in 1m intervals. If the value is outside the allowed range, it defaults to 1m. Default: 1m |
namespaceFilteringMode | Include: Collects only data from the values in the namespaces field. Exclude: Collects data from all namespaces except for the values in the namespaces field. Off: Ignores any namespace selections and collects data from all namespaces. Default: Off |
namespaces | Array of comma-separated Kubernetes namespaces for which inventory and perf data is collected, based on the namespaceFilteringMode. For example, namespaces = ["kube-system", "default"] with an Include setting collects only these two namespaces. With an Exclude setting, the agent collects data from all other namespaces except for kube-system and default. With an Off setting, the agent collects data from all namespaces including kube-system and default. Invalid and unrecognized namespaces are ignored. Default: none |
enableContainerLogV2 | Boolean flag to enable the ContainerLogV2 schema. If set to true, stdout/stderr logs are ingested to the ContainerLogV2 table. If not, the container logs are ingested to the ContainerLog table, unless otherwise specified in the ConfigMap. When specifying the individual streams, you must include the corresponding table for ContainerLog or ContainerLogV2. Default: true |
streams | An array of table streams. See Stream values for a list of the valid streams and their corresponding tables. Default: ContainerLogV2, KubeEvents, KubePodInventory |
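Putting these settings together, the following is a hedged example of a dataCollectionSettings.json that collects every 5 minutes from all namespaces except two hypothetical dev and test namespaces. Pass it with the `--data-collection-settings` parameter shown earlier.

```json
{
  "interval": "5m",
  "namespaceFilteringMode": "Exclude",
  "namespaces": ["dev", "test"],
  "enableContainerLogV2": true,
  "streams": ["Microsoft-ContainerLogV2", "Microsoft-KubeEvents", "Microsoft-KubePodInventory"]
}
```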
Stream values
When you specify the tables to collect by using the CLI or an ARM template, you specify a stream name that corresponds to a particular table in the Log Analytics workspace. The following table lists the stream name for each table.
Note
If you're familiar with the structure of a data collection rule, the stream names in this table are specified in the Data flows section of the DCR.
Stream | Container insights table |
---|---|
Microsoft-ContainerInventory | ContainerInventory |
Microsoft-ContainerLog | ContainerLog |
Microsoft-ContainerLogV2 | ContainerLogV2 |
Microsoft-ContainerLogV2-HighScale | ContainerLogV2 (high scale mode)¹ |
Microsoft-ContainerNodeInventory | ContainerNodeInventory |
Microsoft-InsightsMetrics | InsightsMetrics |
Microsoft-KubeEvents | KubeEvents |
Microsoft-KubeMonAgentEvents | KubeMonAgentEvents |
Microsoft-KubeNodeInventory | KubeNodeInventory |
Microsoft-KubePodInventory | KubePodInventory |
Microsoft-KubePVInventory | KubePVInventory |
Microsoft-KubeServices | KubeServices |
Microsoft-Perf | Perf |
¹ Don't use both Microsoft-ContainerLogV2 and Microsoft-ContainerLogV2-HighScale together. This will result in duplicate data.
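For illustration, a stream selection like the default one appears in the DCR's data flows roughly as follows. This is a minimal sketch, not the full DCR; the destination name (here `ciworkspace`) is an assumption and is whatever your DCR defines for the Log Analytics workspace destination.

```json
"dataFlows": [
  {
    "streams": ["Microsoft-ContainerLogV2", "Microsoft-KubeEvents", "Microsoft-KubePodInventory"],
    "destinations": ["ciworkspace"]
  }
]
```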
Applicable tables and metrics
The settings for collection frequency and namespace filtering don't apply to all log data. The following tables list the tables in the Log Analytics workspace along with the settings that apply to each.
Table name | Interval? | Namespaces? | Remarks |
---|---|---|---|
ContainerInventory | Yes | Yes | |
ContainerNodeInventory | Yes | No | The namespaces setting isn't applicable because a Kubernetes node isn't a namespace-scoped resource. |
KubeNodeInventory | Yes | No | The namespaces setting isn't applicable because a Kubernetes node isn't a namespace-scoped resource. |
KubePodInventory | Yes | Yes | |
KubePVInventory | Yes | Yes | |
KubeServices | Yes | Yes | |
KubeEvents | No | Yes | The interval setting isn't applicable for Kubernetes events. |
Perf | Yes | Yes | The namespaces setting isn't applicable for Kubernetes node-related metrics because a Kubernetes node isn't a namespace-scoped resource. |
InsightsMetrics | Yes | Yes | Data collection settings are only applicable for the metrics collected in the following namespaces: container.azm.ms/kubestate, container.azm.ms/pv, and container.azm.ms/gpu. |
Note
Namespace filtering doesn't apply to ama-logs agent records. As a result, even if the kube-system namespace is listed among the excluded namespaces, records associated with the ama-logs agent container are still ingested.
Metric namespace | Interval? | Namespaces? | Remarks |
---|---|---|---|
Insights.container/nodes | Yes | No | A node isn't a namespace-scoped resource. |
Insights.container/pods | Yes | Yes | |
Insights.container/containers | Yes | Yes | |
Insights.container/persistentvolumes | Yes | Yes | |
Special scenarios
See the following references for configuration requirements in particular scenarios.
- If you're using private link, see Enable private link for Kubernetes monitoring in Azure Monitor.
- To enable high scale mode, follow the onboarding process at Enable high scale mode for Monitoring add-on. You must also update the ConfigMap as described in Update ConfigMap, and change the DCR stream from `Microsoft-ContainerLogV2` to `Microsoft-ContainerLogV2-HighScale`.
Enable control plane logs
Control plane logs are implemented as resource logs in Azure Monitor. To collect these logs, create a diagnostic setting for the cluster. Send them to the same Log Analytics workspace as your container logs.
Use the `az monitor diagnostic-settings create` command to create a diagnostic setting with the Azure CLI. See the documentation for this command for descriptions of its parameters.
The following example creates a diagnostic setting that sends all Kubernetes categories to a Log Analytics workspace. It uses resource-specific mode to send the logs to the dedicated tables listed in Supported resource logs for Microsoft.ContainerService/managedClusters.
```azurecli
az monitor diagnostic-settings create \
    --name 'Collect control plane logs' \
    --resource /subscriptions/<subscription ID>/resourceGroups/<resource group name>/providers/Microsoft.ContainerService/managedClusters/<cluster-name> \
    --workspace /subscriptions/<subscription ID>/resourcegroups/<resource group name>/providers/microsoft.operationalinsights/workspaces/<log analytics workspace name> \
    --logs '[{"category": "karpenter-events","enabled": true},{"category": "kube-audit","enabled": true},{"category": "kube-apiserver","enabled": true},{"category": "kube-audit-admin","enabled": true},{"category": "kube-controller-manager","enabled": true},{"category": "kube-scheduler","enabled": true},{"category": "cluster-autoscaler","enabled": true},{"category": "cloud-controller-manager","enabled": true},{"category": "guard","enabled": true},{"category": "csi-azuredisk-controller","enabled": true},{"category": "csi-azurefile-controller","enabled": true},{"category": "csi-snapshot-controller","enabled": true},{"category": "fleet-member-agent","enabled": true},{"category": "fleet-member-net-controller-manager","enabled": true},{"category": "fleet-mcs-controller-manager","enabled": true}]' \
    --metrics '[{"category": "AllMetrics","enabled": true}]' \
    --export-to-resource-specific true
```
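To confirm the diagnostic setting was created, you can list the settings on the cluster resource. A minimal sketch using the same placeholder resource ID:

```azurecli
# List the diagnostic settings configured on the cluster resource.
az monitor diagnostic-settings list --resource /subscriptions/<subscription ID>/resourceGroups/<resource group name>/providers/Microsoft.ContainerService/managedClusters/<cluster-name>
```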
Verify deployment
Use the kubectl command line tool to verify that the agent is deployed properly.
Managed Prometheus
Verify that the DaemonSet was deployed properly on the Linux node pools:

```bash
kubectl get ds ama-metrics-node --namespace=kube-system
```

The number of pods should be equal to the number of Linux nodes on the cluster. The output should resemble the following example:

```output
User@aksuser:~$ kubectl get ds ama-metrics-node --namespace=kube-system
NAME               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
ama-metrics-node   1         1         1       1            1           <none>          10h
```
Verify that the DaemonSet was deployed properly on the Windows node pools:

```bash
kubectl get ds ama-metrics-win-node --namespace=kube-system
```

The number of pods should be equal to the number of Windows nodes on the cluster. The output should resemble the following example:

```output
User@aksuser:~$ kubectl get ds ama-metrics-win-node --namespace=kube-system
NAME                   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
ama-metrics-win-node   3         3         3       3            3           <none>          10h
```
Verify that the two ReplicaSets were deployed for Prometheus:

```bash
kubectl get rs --namespace=kube-system
```

The output should resemble the following example:

```output
User@aksuser:~$ kubectl get rs --namespace=kube-system
NAME                         DESIRED   CURRENT   READY   AGE
ama-metrics-5c974985b8       1         1         1       11h
ama-metrics-ksm-5fcf8dffcd   1         1         1       11h
```
Container logging
Verify that the DaemonSet was deployed properly on the Linux node pools:

```bash
kubectl get ds ama-logs --namespace=kube-system
```

The number of pods should be equal to the number of Linux nodes on the cluster. The output should resemble the following example:

```output
User@aksuser:~$ kubectl get ds ama-logs --namespace=kube-system
NAME       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
ama-logs   2         2         2       2            2           <none>          1d
```
Verify that the DaemonSet was deployed properly on the Windows node pools:

```bash
kubectl get ds ama-logs-windows --namespace=kube-system
```

The number of pods should be equal to the number of Windows nodes on the cluster. The output should resemble the following example:

```output
User@aksuser:~$ kubectl get ds ama-logs-windows --namespace=kube-system
NAME               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
ama-logs-windows   2         2         2       2            2           <none>          1d
```
Verify deployment of the container logging solution:

```bash
kubectl get deployment ama-logs-rs --namespace=kube-system
```

The output should resemble the following example:

```output
User@aksuser:~$ kubectl get deployment ama-logs-rs --namespace=kube-system
NAME          READY   UP-TO-DATE   AVAILABLE   AGE
ama-logs-rs   1/1     1            1           24d
```
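As a quick combined check, you can list all of the agent pods at once. A simple sketch using standard kubectl and grep:

```bash
# List every Azure Monitor agent pod (metrics and logs) in kube-system.
kubectl get pods --namespace=kube-system | grep -E 'ama-metrics|ama-logs'
```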
View configuration with CLI
Use the `az aks show` command to find out whether the solution is enabled, the Log Analytics workspace resource ID, and summary information about the cluster.

```azurecli
az aks show --resource-group <resourceGroupofAKSCluster> --name <nameofAksCluster>
```

The command returns JSON-formatted information about the solution. The `addonProfiles` section should include information on the `omsagent` as in the following example:
"addonProfiles": {
"omsagent": {
"config": {
"logAnalyticsWorkspaceResourceID": "/subscriptions/aaaa0a0a-bb1b-cc2c-dd3d-eeeeee4e4e4e/resourcegroups/my-resource-group/providers/microsoft.operationalinsights/workspaces/my-workspace",
"useAADAuth": "true"
},
"enabled": true,
"identity": null
},
}
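To pull a single value out of this output, you can add a JMESPath query. For example, the following returns just the Log Analytics workspace resource ID; it assumes the add-on is enabled so the `omsagent` profile exists:

```azurecli
# Return only the Log Analytics workspace resource ID used by the monitoring add-on.
az aks show --resource-group <resourceGroupofAKSCluster> --name <nameofAksCluster> --query addonProfiles.omsagent.config.logAnalyticsWorkspaceResourceID --output tsv
```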
Next steps
- If you experience issues attempting to onboard, review the Troubleshooting guide.
- Learn how to Analyze Kubernetes monitoring data in the Azure portal with Container insights.