Enable monitoring of an Azure Kubernetes Service (AKS) cluster that's already deployed

This article describes how to set up Azure Monitor for containers to monitor managed Kubernetes clusters that are hosted on Azure Kubernetes Service and have already been deployed in your subscription.

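You can enable monitoring of an AKS cluster that's already deployed by using one of the following supported methods:

  • Azure CLI
  • Terraform
  • Azure Monitor or the AKS cluster page in the Azure portal
  • An Azure Resource Manager template, deployed with Azure PowerShell or Azure CLI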

Sign in to the Azure portal

Sign in to the Azure portal.

Enable using Azure CLI

The following step enables monitoring of your AKS cluster by using Azure CLI. In this example, you don't need to create a workspace beforehand or specify an existing one. If no default workspace exists in the region, the command creates one in the default resource group of the AKS cluster's subscription. The name of the default workspace follows the format DefaultWorkspace-<GUID>-<Region>.

az aks enable-addons -a monitoring -n MyExistingManagedCluster -g MyExistingManagedClusterRG

The output resembles the following:

provisioningState       : Succeeded

Integrate with an existing workspace

To integrate with an existing workspace instead, first identify the full resource ID of your Log Analytics workspace, which the --workspace-resource-id parameter requires, and then run the command to enable the monitoring add-on against that workspace.

  1. List all the subscriptions that you have access to by using the following command:

    az account list --all -o table
    

    The output resembles the following:

    Name                                  CloudName        SubscriptionId                        State    IsDefault
    ------------------------------------  ---------------  ------------------------------------  -------  -----------
    Azure                                 AzureChinaCloud  68627f8c-91fO-4905-z48q-b032a81f8vy0  Enabled  True
    

    Copy the value for SubscriptionId.

  2. Switch to the subscription that hosts the Log Analytics workspace by using the following command:

    az account set -s <subscriptionId of the workspace>
    
  3. The following example displays the list of workspaces in your subscription in the default JSON format:

    az resource list --resource-type Microsoft.OperationalInsights/workspaces -o json
    

    In the output, find the workspace name, and then copy the full resource ID of that Log Analytics workspace from the id field.

  4. Run the following command to enable the monitoring add-on, replacing the value of the --workspace-resource-id parameter. The string value must be enclosed in double quotes:

    az aks enable-addons -a monitoring -n ExistingManagedCluster -g ExistingManagedClusterRG --workspace-resource-id "/subscriptions/<SubscriptionId>/resourceGroups/<ResourceGroupName>/providers/Microsoft.OperationalInsights/workspaces/<WorkspaceName>"
    

    The output resembles the following:

    provisioningState       : Succeeded
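
Steps 3 and 4 can also be scripted. The following is a minimal Python sketch, not part of the Azure tooling, that takes the JSON text printed by the az resource list command in step 3, looks up a workspace by name, and returns its id field. The function name find_workspace_id and the sample data are hypothetical, and the sketch assumes that each entry in the output carries name and id fields as described in step 3.

```python
import json


def find_workspace_id(resource_list_json: str, workspace_name: str) -> str:
    """Return the full resource ID of the workspace with the given name."""
    resources = json.loads(resource_list_json)
    for resource in resources:
        # Assumption: names are matched case-insensitively.
        if resource.get("name", "").lower() == workspace_name.lower():
            return resource["id"]
    raise LookupError(f"workspace {workspace_name!r} not found in the output")


# Hypothetical sample shaped like the JSON output of step 3.
sample_output = json.dumps([
    {
        "id": "/subscriptions/<SubscriptionId>/resourceGroups/<ResourceGroupName>"
              "/providers/Microsoft.OperationalInsights/workspaces/MyWorkspace",
        "name": "MyWorkspace",
        "type": "Microsoft.OperationalInsights/workspaces",
    }
])

workspace_id = find_workspace_id(sample_output, "MyWorkspace")
print(workspace_id)
```

The returned string is the value to pass, in double quotes, to the --workspace-resource-id parameter in step 4.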
    

Enable using Terraform

  1. Add the oms_agent add-on profile to the existing azurerm_kubernetes_cluster resource:

    addon_profile {
      oms_agent {
        enabled                    = true
        log_analytics_workspace_id = "${azurerm_log_analytics_workspace.test.id}"
      }
    }
    
  2. Add the azurerm_log_analytics_solution resource by following the steps in the Terraform documentation.

Enable from Azure Monitor in the portal

To enable monitoring of your AKS cluster in the Azure portal from Azure Monitor, do the following:

  1. In the Azure portal, select Monitor.

  2. Select Containers from the list.

  3. On the Monitor - containers page, select Unmonitored clusters.

  4. In the list of unmonitored clusters, find the cluster and select Enable.

  5. On the Onboarding to Azure Monitor for containers page, if you have an existing Log Analytics workspace in the same subscription as the cluster, select it from the drop-down list. The list preselects the default workspace and location that the AKS cluster is deployed to in the subscription.

    Note

    If you want to create a new Log Analytics workspace for storing the monitoring data from the cluster, follow the instructions in Create a Log Analytics workspace. Be sure to create the workspace in the same subscription that the AKS cluster is deployed to.

After you've enabled monitoring, it might take about 15 minutes before you can view health metrics for the cluster.

Enable directly from AKS cluster in the portal

To enable monitoring directly from one of your AKS clusters in the Azure portal, do the following:

  1. In the Azure portal, select All services.

  2. In the list of resources, begin typing Containers. The list filters based on your input.

  3. Select Kubernetes services.

  4. In the list of Kubernetes services, select a service.

  5. On the Kubernetes service overview page, select Monitoring - Insights.

  6. On the Onboarding to Azure Monitor for containers page, if you have an existing Log Analytics workspace in the same subscription as the cluster, select it from the drop-down list. The list preselects the default workspace and location that the AKS cluster is deployed to in the subscription.

    Note

    If you want to create a new Log Analytics workspace for storing the monitoring data from the cluster, follow the instructions in Create a Log Analytics workspace. Be sure to create the workspace in the same subscription that the AKS cluster is deployed to.

After you've enabled monitoring, it might take about 15 minutes before you can view operational data for the cluster.

Enable using an Azure Resource Manager template

This method includes two JSON templates. One template specifies the configuration to enable monitoring, and the other contains parameter values that you configure to specify the following:

  • The AKS cluster resource ID.
  • The resource group that the cluster is deployed in.

Note

The template needs to be deployed in the same resource group as the cluster.

The Log Analytics workspace has to be created before you enable monitoring by using Azure PowerShell or CLI. You can create the workspace through Azure Resource Manager, through PowerShell, or in the Azure portal.

If you are unfamiliar with the concept of deploying resources by using a template, see the Azure Resource Manager documentation on deploying templates with Azure PowerShell and with the Azure CLI.

If you choose to use the Azure CLI, you first need to install and use the CLI locally. You must be running Azure CLI version 2.0.59 or later. To identify your version, run az --version. If you need to install or upgrade the Azure CLI, see Install the Azure CLI.
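
The version requirement can be checked in a script. The following is a minimal Python sketch under stated assumptions: the version string (for example 2.0.59) has already been read from the az --version output by hand, and it consists only of dot-separated numbers. The names parse_version and meets_minimum are hypothetical helpers, not part of the Azure CLI.

```python
# Minimum Azure CLI version stated in this article.
MINIMUM_CLI_VERSION = (2, 0, 59)


def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '2.0.59' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(version_text: str) -> bool:
    """Return True if the given version is 2.0.59 or later."""
    return parse_version(version_text) >= MINIMUM_CLI_VERSION


print(meets_minimum("2.0.58"))   # False
print(meets_minimum("2.0.100"))  # True
```

Comparing integer tuples rather than raw strings matters here: as plain text, "2.0.100" would sort before "2.0.59".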

Create and execute a template

  1. Copy and paste the following JSON syntax into a new file:

    {
      "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
      "contentVersion": "1.0.0.0",
      "parameters": {
        "aksResourceId": {
          "type": "string",
          "metadata": {
            "description": "AKS Cluster Resource ID"
          }
        },
        "aksResourceLocation": {
          "type": "string",
          "metadata": {
            "description": "Location of the AKS resource e.g. \"China East2\""
          }
        },
        "aksResourceTagValues": {
          "type": "object",
          "metadata": {
            "description": "Existing all tags on AKS Cluster Resource"
          }
        },
        "workspaceResourceId": {
          "type": "string",
          "metadata": {
            "description": "Azure Monitor Log Analytics Resource ID"
          }
        }
      },
      "resources": [
        {
          "name": "[split(parameters('aksResourceId'),'/')[8]]",
          "type": "Microsoft.ContainerService/managedClusters",
          "location": "[parameters('aksResourceLocation')]",
          "tags": "[parameters('aksResourceTagValues')]",
          "apiVersion": "2018-03-31",
          "properties": {
            "mode": "Incremental",
            "id": "[parameters('aksResourceId')]",
            "addonProfiles": {
              "omsagent": {
                "enabled": true,
                "config": {
                  "logAnalyticsWorkspaceResourceID": "[parameters('workspaceResourceId')]"
                }
              }
            }
          }
        }
      ]
    }
    
  2. Save this file as existingClusterOnboarding.json to a local folder.

  3. Paste the following JSON syntax into a second file:

    {
      "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
      "contentVersion": "1.0.0.0",
      "parameters": {
        "aksResourceId": {
          "value": "/subscriptions/<SubscriptionId>/resourcegroups/<ResourceGroup>/providers/Microsoft.ContainerService/managedClusters/<ResourceName>"
        },
        "aksResourceLocation": {
          "value": "<aksClusterLocation>"
        },
        "workspaceResourceId": {
          "value": "/subscriptions/<SubscriptionId>/resourceGroups/<ResourceGroup>/providers/Microsoft.OperationalInsights/workspaces/<workspaceName>"
        },
        "aksResourceTagValues": {
          "value": {
            "<existing-tag-name1>": "<existing-tag-value1>",
            "<existing-tag-name2>": "<existing-tag-value2>",
            "<existing-tag-nameN>": "<existing-tag-valueN>"
          }
        }
      }
    }
    
  4. Edit the values for aksResourceId and aksResourceLocation by using the values on the AKS Overview page for the AKS cluster. The value for workspaceResourceId is the full resource ID of your Log Analytics workspace, which includes the workspace name.

    Edit the values for aksResourceTagValues to match the existing tag values specified for the AKS cluster.

  5. Save this file as existingClusterParam.json to a local folder.

  6. You are ready to deploy this template.

    • To deploy with Azure PowerShell, use the following command in the folder that contains the template:

      New-AzResourceGroupDeployment -Name OnboardCluster -ResourceGroupName <ResourceGroupName> -TemplateFile .\existingClusterOnboarding.json -TemplateParameterFile .\existingClusterParam.json
      

      The configuration change can take a few minutes to complete. When it's completed, a message similar to the following is displayed and includes the result:

      provisioningState       : Succeeded
      
    • To deploy with Azure CLI, run the following commands:

      az cloud set --name AzureChinaCloud
      az login
      az account set --subscription "Subscription Name"
      az group deployment create --resource-group <ResourceGroupName> --template-file ./existingClusterOnboarding.json --parameters @./existingClusterParam.json
      

      The configuration change can take a few minutes to complete. When it's completed, a message similar to the following is displayed and includes the result:

      provisioningState       : Succeeded
      

      After you've enabled monitoring, it might take about 15 minutes before you can view health metrics for the cluster.
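
The parameter file from steps 3 to 5 can also be generated instead of edited by hand. The following is a minimal Python sketch, not an official tool: the function names cluster_name_from_id and build_parameters, and the sample resource group, cluster, workspace, and tag values, are hypothetical. It also illustrates how the template expression split(parameters('aksResourceId'),'/')[8] arrives at the cluster name.

```python
import json

PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2015-01-01/"
    "deploymentParameters.json#"
)


def cluster_name_from_id(aks_resource_id: str) -> str:
    """Mirror split(parameters('aksResourceId'),'/')[8] from the template.

    The ID starts with '/', so element 0 is an empty string and the
    cluster name is the ninth element (index 8).
    """
    return aks_resource_id.split("/")[8]


def build_parameters(aks_resource_id, aks_location, workspace_resource_id, tags):
    """Build the content of existingClusterParam.json as a dictionary."""
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {
            "aksResourceId": {"value": aks_resource_id},
            "aksResourceLocation": {"value": aks_location},
            "workspaceResourceId": {"value": workspace_resource_id},
            "aksResourceTagValues": {"value": tags},
        },
    }


# Hypothetical values; replace them with the ones from your own cluster.
aks_id = (
    "/subscriptions/<SubscriptionId>/resourcegroups/MyRG"
    "/providers/Microsoft.ContainerService/managedClusters/MyCluster"
)
workspace_id = (
    "/subscriptions/<SubscriptionId>/resourceGroups/MyRG"
    "/providers/Microsoft.OperationalInsights/workspaces/MyWorkspace"
)
params = build_parameters(aks_id, "China East2", workspace_id, {"env": "test"})

print(cluster_name_from_id(aks_id))
print(json.dumps(params, indent=2))
```

Writing the printed JSON to existingClusterParam.json yields a file with the same structure as the one shown in step 3.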

Verify agent and solution deployment

With agent version 06072018 or later, you can verify that both the agent and the solution were deployed successfully. With earlier versions of the agent, you can verify only agent deployment.

Agent version 06072018 or later

Run the following command to verify that the agent is deployed successfully.

kubectl get ds omsagent --namespace=kube-system

The output should resemble the following, which indicates that it was deployed properly:

User@aksuser:~$ kubectl get ds omsagent --namespace=kube-system
NAME       DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE
omsagent   2         2         2         2            2           beta.kubernetes.io/os=linux   1d

If there are Windows Server nodes on the cluster, you can run the following command to verify that the agent is deployed successfully.

kubectl get ds omsagent-win --namespace=kube-system

The output should resemble the following, which indicates that it was deployed properly:

User@aksuser:~$ kubectl get ds omsagent-win --namespace=kube-system
NAME                   DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                   AGE
omsagent-win           2         2         2         2            2           beta.kubernetes.io/os=windows   1d
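
The DaemonSet checks above can be automated by reading the table that kubectl prints. The following is a minimal Python sketch, assuming that a DaemonSet counts as deployed when its DESIRED, READY, and AVAILABLE columns are equal and greater than zero, as in the sample output. The function name daemonset_is_ready is hypothetical, and the sample text is copied from the output shown earlier in this section.

```python
def daemonset_is_ready(kubectl_output: str, name: str) -> bool:
    """Check one row of 'kubectl get ds' table output for readiness."""
    lines = [line for line in kubectl_output.splitlines() if line.strip()]
    header = lines[0].split()
    for line in lines[1:]:
        # Only the leading columns (NAME through AVAILABLE) are used; the
        # 'NODE SELECTOR' header contains a space, so later columns do not
        # line up after a whitespace split and are ignored.
        row = dict(zip(header, line.split()))
        if row.get("NAME") == name:
            desired = int(row["DESIRED"])
            ready = int(row["READY"])
            available = int(row["AVAILABLE"])
            return desired > 0 and desired == ready == available
    return False


sample = """\
NAME       DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE
omsagent   2         2         2         2            2           beta.kubernetes.io/os=linux   1d
"""

print(daemonset_is_ready(sample, "omsagent"))
```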

To verify deployment of the solution, run the following command:

kubectl get deployment omsagent-rs -n=kube-system

The output should resemble the following, which indicates that it was deployed properly:

User@aksuser:~$ kubectl get deployment omsagent-rs -n=kube-system
NAME          DESIRED   CURRENT   UP-TO-DATE   AVAILABLE    AGE
omsagent-rs   1         1         1            1            3h

Agent version earlier than 06072018

To verify that a Log Analytics agent version released before 06072018 is deployed properly, run the following command:

kubectl get ds omsagent --namespace=kube-system

The output should resemble the following, which indicates that it was deployed properly:

User@aksuser:~$ kubectl get ds omsagent --namespace=kube-system
NAME       DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE
omsagent   2         2         2         2            2           beta.kubernetes.io/os=linux   1d

View configuration with CLI

Use the aks show command to get details such as whether the solution is enabled, the resource ID of the Log Analytics workspace, and summary details about the cluster.

az aks show -g <resourceGroupofAKSCluster> -n <nameofAksCluster>

After a few minutes, the command completes and returns JSON-formatted information about the solution. The results of the command should show the monitoring add-on profile and resemble the following example output:

"addonProfiles": {
    "omsagent": {
      "config": {
        "logAnalyticsWorkspaceResourceID": "/subscriptions/<WorkspaceSubscription>/resourceGroups/<DefaultWorkspaceRG>/providers/Microsoft.OperationalInsights/workspaces/<defaultWorkspaceName>"
      },
      "enabled": true
    }
  }
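
To check the add-on profile in a script, the JSON returned by az aks show can be parsed directly. The following is a minimal Python sketch, assuming the output contains the addonProfiles, omsagent, config, and logAnalyticsWorkspaceResourceID keys exactly as spelled in the example output above. The function name monitoring_addon_status and the sample document are hypothetical.

```python
import json


def monitoring_addon_status(aks_show_json: str):
    """Return (enabled, workspace_resource_id) for the monitoring add-on."""
    cluster = json.loads(aks_show_json)
    # Missing or null sections are treated as "add-on not enabled".
    profile = (cluster.get("addonProfiles") or {}).get("omsagent") or {}
    enabled = bool(profile.get("enabled", False))
    config = profile.get("config") or {}
    return enabled, config.get("logAnalyticsWorkspaceResourceID")


# Hypothetical sample that keeps only the part of the output shown above.
sample = json.dumps({
    "addonProfiles": {
        "omsagent": {
            "config": {
                "logAnalyticsWorkspaceResourceID": "/subscriptions/<WorkspaceSubscription>"
                "/resourceGroups/<DefaultWorkspaceRG>/providers"
                "/Microsoft.OperationalInsights/workspaces/<defaultWorkspaceName>"
            },
            "enabled": True,
        }
    }
})

enabled, workspace = monitoring_addon_status(sample)
print(enabled, workspace)
```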

Next steps

  • If you experience issues while attempting to onboard the solution, review the troubleshooting guide.

  • With monitoring enabled to collect health and resource utilization of your AKS cluster and the workloads running on it, learn how to use Azure Monitor for containers.