Configure hybrid Kubernetes clusters with Azure Monitor for containers

Azure Monitor for containers provides a rich monitoring experience for the Azure Kubernetes Service (AKS) and for AKS Engine on Azure, which is a self-managed Kubernetes cluster hosted on Azure. This article describes how to enable monitoring of Kubernetes clusters hosted outside of Azure and achieve a similar monitoring experience.

Supported configurations

The following configurations are officially supported with Azure Monitor for containers.

  • Environments:

    • Kubernetes on-premises

    • AKS Engine on Azure and Azure Stack. For more information, see AKS Engine on Azure Stack.

    • OpenShift version 4 and higher, on-premises or in other cloud environments.

  • Supported Kubernetes versions and the support policy are the same as those for AKS.

  • The following container runtimes are supported: Docker, Moby, and CRI-compatible runtimes such as CRI-O and ContainerD.

  • Supported Linux OS releases for master and worker nodes: Ubuntu (18.04 LTS and 16.04 LTS) and Red Hat Enterprise Linux CoreOS 43.81.

  • Supported access control: Kubernetes RBAC and non-RBAC.

Prerequisites

Before you start, make sure that you have the following:

  • A Log Analytics workspace.

    Azure Monitor for containers supports a Log Analytics workspace in the regions listed in Azure Products by region. You can create your own workspace through Azure Resource Manager, through PowerShell, or in the Azure portal.

    Note

    Enabling monitoring of multiple clusters with the same cluster name in the same Log Analytics workspace is not supported. Cluster names must be unique.

  • Membership in the Log Analytics contributor role, which is required to enable container monitoring. For more information about how to control access to a Log Analytics workspace, see Manage access to workspace and log data.

  • The Log Analytics reader role in the Log Analytics workspace configured with Azure Monitor for containers, which is required to view the monitoring data.

  • The HELM client, to onboard the Azure Monitor for containers chart for the specified Kubernetes cluster.

  • The following proxy and firewall configuration, which the containerized version of the Log Analytics agent for Linux requires to communicate with Azure Monitor:

    Agent resource                    Port
    *.ods.opinsights.azure.com        443
    *.oms.opinsights.azure.com        443
    *.dc.services.visualstudio.com    443

  • Kubelet's cAdvisor secure port 10250 or unsecure port 10255 opened on all nodes in the cluster, so that the containerized agent can collect performance metrics. We recommend that you configure secure port 10250 on the Kubelet's cAdvisor if it's not configured already.

  • The environment variables KUBERNETES_SERVICE_HOST and KUBERNETES_PORT_443_TCP_PORT specified on the container, so that the containerized agent can communicate with the Kubernetes API service within the cluster to collect inventory data.
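The environment variable prerequisite above can be checked from inside a container with a few lines of Python. This is a minimal sketch, not part of the agent: the function names are hypothetical, and only the two variable names come from this article.

```python
import os

# Variable names taken from the prerequisite list in this article.
REQUIRED_ENV_VARS = ("KUBERNETES_SERVICE_HOST", "KUBERNETES_PORT_443_TCP_PORT")


def missing_api_env_vars(env=None):
    """Return the required variable names that are unset or empty in env."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


def api_server_address(env=None):
    """Build host:port for the Kubernetes API service, or raise if a variable is missing."""
    if env is None:
        env = os.environ
    missing = missing_api_env_vars(env)
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return "{}:{}".format(
        env["KUBERNETES_SERVICE_HOST"], env["KUBERNETES_PORT_443_TCP_PORT"]
    )
```

Calling missing_api_env_vars() with no argument inspects the current process environment. Passing a dictionary instead makes the check easy to exercise outside a cluster.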

Important

The minimum agent version supported for monitoring hybrid Kubernetes clusters is ciprod10182019.
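If you track agent image tags in your own tooling, a date-based comparison is safer than a string comparison. The sketch below assumes that a tag is the prefix ciprod followed by a date in MMDDYYYY form, which matches ciprod10182019 but is not confirmed by this article. Tags in any other layout are rejected, and the function names are hypothetical.

```python
import re
from datetime import date

# Minimum supported tag, as stated in this article.
MINIMUM_AGENT_TAG = "ciprod10182019"


def agent_tag_date(tag):
    """Parse 'ciprod' + MMDDYYYY into a date (assumed tag layout)."""
    match = re.fullmatch(r"ciprod(\d{2})(\d{2})(\d{4})", tag)
    if match is None:
        raise ValueError("unrecognized agent tag: " + tag)
    month, day, year = (int(part) for part in match.groups())
    return date(year, month, day)


def is_supported_agent(tag):
    """Return True when the tag's date is on or after the minimum tag's date."""
    return agent_tag_date(tag) >= agent_tag_date(MINIMUM_AGENT_TAG)
```

Comparing the tags as plain strings would sort ciprod01072020 before ciprod10182019 even though it carries a later date, which is why the sketch converts to dates first.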

Enable monitoring

To enable Azure Monitor for containers for the hybrid Kubernetes cluster, perform the following steps in order.

  1. Configure your Log Analytics workspace with the Container Insights solution.

  2. Enable the Azure Monitor for containers HELM chart with the Log Analytics workspace.

How to add the Azure Monitor Containers solution

You can deploy the solution with the provided Azure Resource Manager template by using the Azure PowerShell cmdlet New-AzResourceGroupDeployment or the Azure CLI.

If you are unfamiliar with the concept of deploying resources by using a template, see the Azure Resource Manager documentation on deploying templates with Azure PowerShell and with the Azure CLI.

If you choose to use the Azure CLI, you first need to install and use the CLI locally. You must be running the Azure CLI version 2.0.59 or later. To identify your version, run az --version. If you need to install or upgrade the Azure CLI, see Install the Azure CLI.

This method includes two JSON templates. One template specifies the configuration to enable monitoring, and the other contains parameter values that you configure to specify the following:

  • workspaceResourceId - the full resource ID of your Log Analytics workspace.
  • workspaceRegion - the region the workspace is created in, which is also referred to as Location in the workspace properties when viewing from the Azure portal.
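The deployment template in this article splits workspaceResourceId on "/" and reads the segments at index 2, 4, and 8 as the subscription ID, resource group, and workspace name. The following sketch mirrors that logic so you can validate a resource ID before you deploy. The function name is hypothetical, and the segment checks assume the usual layout of a Log Analytics workspace resource ID.

```python
def parse_workspace_resource_id(resource_id):
    """Split a workspace resource ID the same way the template's split() calls do."""
    parts = resource_id.split("/")
    # Expected segments:
    # ['', 'subscriptions', <sub>, 'resourceGroups', <rg>, 'providers',
    #  'Microsoft.OperationalInsights', 'workspaces', <name>]
    layout_ok = (
        len(parts) == 9
        and parts[0] == ""
        and parts[1].lower() == "subscriptions"
        and parts[3].lower() == "resourcegroups"
        and parts[5].lower() == "providers"
        and parts[6].lower() == "microsoft.operationalinsights"
        and parts[7].lower() == "workspaces"
        and all(parts[index] for index in (2, 4, 8))
    )
    if not layout_ok:
        raise ValueError("not a full Log Analytics workspace resource ID: " + resource_id)
    return {
        "subscription_id": parts[2],   # template: split(...)[2]
        "resource_group": parts[4],    # template: split(...)[4]
        "workspace_name": parts[8],    # template: split(...)[8]
    }
```

A value that is only the workspace name, or that has a trailing slash, fails the check. With such a value the template's index lookups would otherwise pick up the wrong segments.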

To add the solution, perform the following steps. They first identify the full resource ID of your Log Analytics workspace, which the workspaceResourceId parameter in the containerSolutionParams.json file requires, and then run the PowerShell cmdlet or Azure CLI command that deploys the solution.

  1. List all the subscriptions that you have access to by using the following command:

    az account list --all -o table
    

    The output will resemble the following:

    Name                                  CloudName    SubscriptionId                        State    IsDefault
    ------------------------------------  -----------  ------------------------------------  -------  -----------
    Azure                       AzureChinaCloud   0fb60ef2-03cc-4290-b595-e71108e8f4ce  Enabled  True
    

    Copy the value for SubscriptionId.

  2. Switch to the subscription hosting the Log Analytics workspace by using the following command:

    az account set -s <subscriptionId of the workspace>
    
  3. List the workspaces in your subscription in the default JSON format by using the following command:

    az resource list --resource-type Microsoft.OperationalInsights/workspaces -o json
    

    In the output, find the workspace name, and then copy the full resource ID of that Log Analytics workspace from the id field.

  4. Copy and paste the following JSON syntax into your file:

    {
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "workspaceResourceId": {
            "type": "string",
            "metadata": {
                "description": "Azure Monitor Log Analytics Workspace Resource ID"
            }
        },
        "workspaceRegion": {
            "type": "string",
            "metadata": {
                "description": "Azure Monitor Log Analytics Workspace region"
            }
        }
    },
    "resources": [
        {
            "type": "Microsoft.Resources/deployments",
            "name": "[Concat('ContainerInsights', '-',  uniqueString(parameters('workspaceResourceId')))]",
            "apiVersion": "2017-05-10",
            "subscriptionId": "[split(parameters('workspaceResourceId'),'/')[2]]",
            "resourceGroup": "[split(parameters('workspaceResourceId'),'/')[4]]",
            "properties": {
                "mode": "Incremental",
                "template": {
                    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
                    "contentVersion": "1.0.0.0",
                    "parameters": {},
                    "variables": {},
                    "resources": [
                        {
                            "apiVersion": "2015-11-01-preview",
                            "type": "Microsoft.OperationsManagement/solutions",
                            "location": "[parameters('workspaceRegion')]",
                            "name": "[Concat('ContainerInsights', '(', split(parameters('workspaceResourceId'),'/')[8], ')')]",
                            "properties": {
                                "workspaceResourceId": "[parameters('workspaceResourceId')]"
                            },
                            "plan": {
                                "name": "[Concat('ContainerInsights', '(', split(parameters('workspaceResourceId'),'/')[8], ')')]",
                                "product": "[Concat('OMSGallery/', 'ContainerInsights')]",
                                "promotionCode": "",
                                "publisher": "Microsoft"
                            }
                        }
                    ]
                },
                "parameters": {}
            }
         }
      ]
    }
    
  5. Save this file as containerSolution.json to a local folder.

  6. Paste the following JSON syntax into a second file:

    {
      "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
      "contentVersion": "1.0.0.0",
      "parameters": {
        "workspaceResourceId": {
          "value": "<workspaceResourceId>"
        },
        "workspaceRegion": {
          "value": "<workspaceRegion>"
        }
      }
    }
    
  7. Set workspaceResourceId to the value you copied in step 3. For workspaceRegion, copy the Region value after running the Azure CLI command az monitor log-analytics workspace show.

  8. Save this file as containerSolutionParams.json to a local folder.

  9. You are ready to deploy this template.

    • To deploy with Azure PowerShell, use the following commands in the folder that contains the template:

      # Sign in to the cloud that hosts the Log Analytics workspace. Specify the cloud environment of your workspace.
      Connect-AzAccount -Environment AzureChinaCloud

      # Set the context to the subscription of the Log Analytics workspace.
      Set-AzContext -SubscriptionId <subscription Id of Log Analytics workspace>

      # Deploy the template to add the Container Insights solution to the specified Log Analytics workspace.
      New-AzResourceGroupDeployment -Name OnboardCluster -ResourceGroupName <resource group of Log Analytics workspace> -TemplateFile .\containerSolution.json -TemplateParameterFile .\containerSolutionParams.json
      

      The configuration change can take a few minutes to complete. When it's completed, a message similar to the following is displayed and includes the result:

      provisioningState       : Succeeded
      
    • To deploy with Azure CLI, run the following commands:

      # select the cloud that hosts the Log Analytics workspace, then sign in
      az cloud set --name AzureChinaCloud
      az login
      az account set --subscription "Subscription Name"
      # execute deployment command to add container insights solution to the specified Log Analytics workspace
      az deployment group create --resource-group <resource group of log analytics workspace> --name <deployment name> --template-file ./containerSolution.json --parameters @./containerSolutionParams.json
      

      The configuration change can take a few minutes to complete. When it's completed, a message similar to the following is displayed and includes the result:

      provisioningState       : Succeeded
      

      After you've enabled monitoring, it might take about 15 minutes before you can view health metrics for the cluster.
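If you onboard several workspaces, you might prefer to generate containerSolutionParams.json rather than edit it by hand. The following sketch builds the same parameter document shown in the steps above. The function names are hypothetical, and the region value in the usage note is only a placeholder.

```python
import json

# Schema URL copied from the parameter file shown in this article.
PARAMS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#"
)


def build_solution_params(workspace_resource_id, workspace_region):
    """Return the parameter document as a dictionary."""
    return {
        "$schema": PARAMS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {
            "workspaceResourceId": {"value": workspace_resource_id},
            "workspaceRegion": {"value": workspace_region},
        },
    }


def render_solution_params(workspace_resource_id, workspace_region):
    """Return the parameter document as indented JSON text, ready to save."""
    document = build_solution_params(workspace_resource_id, workspace_region)
    return json.dumps(document, indent=2) + "\n"
```

For example, writing the result of render_solution_params("<workspaceResourceId>", "<workspaceRegion>") to containerSolutionParams.json produces a file that can be passed to either deployment command.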

Install the HELM chart

In this section, you install the containerized agent for Azure Monitor for containers. Before proceeding, you need to identify the workspace ID required for the omsagent.secret.wsid parameter and the primary key required for the omsagent.secret.key parameter. Perform the following steps to identify this information, and then run the commands to install the agent by using the HELM chart.

  1. Run the following command to identify the workspace ID:

    az monitor log-analytics workspace list --resource-group <resourceGroupName>

    In the output, find the workspace name under the field name, and then copy the workspace ID of that Log Analytics workspace from the field customerId.

  2. Run the following command to identify the primary key for the workspace:

    az monitor log-analytics workspace get-shared-keys --resource-group <resourceGroupName> --workspace-name <logAnalyticsWorkspaceName>

    In the output, find the primary key under the field primarySharedKey, and then copy the value.
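The two lookups above can be scripted against the JSON that the Azure CLI prints. This sketch only parses text that you pass in and does not call the CLI itself. It assumes the workspace list output is a JSON array whose entries carry name and customerId fields, and that the shared keys output is an object with a primarySharedKey field. The function names are hypothetical.

```python
import json


def find_workspace_id(list_output, workspace_name):
    """Return the workspace ID for the named workspace from 'workspace list' JSON output."""
    for workspace in json.loads(list_output):
        if workspace.get("name") == workspace_name:
            return workspace["customerId"]
    raise LookupError("workspace not found: " + workspace_name)


def find_primary_key(shared_keys_output):
    """Return the primary key from 'workspace get-shared-keys' JSON output."""
    return json.loads(shared_keys_output)["primarySharedKey"]
```

The returned values are the ones to supply for omsagent.secret.wsid and omsagent.secret.key when you install the chart.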

Note

The following commands apply only to Helm version 2. Helm version 3 does not accept the --name parameter.

Note

If your Kubernetes cluster communicates through a proxy server, configure the parameter omsagent.proxy with the URL of the proxy server. If the cluster does not communicate through a proxy server, you don't need to specify this parameter. For more information, see Configure proxy endpoint later in this article.

  1. Add the Azure charts repository to your local list by running the following command:

    helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com/
    
  2. Install the chart by running the following command:

    $ helm install --name myrelease-1 \
    --set omsagent.secret.wsid=<logAnalyticsWorkspaceId>,omsagent.secret.key=<logAnalyticsWorkspaceKey>,omsagent.env.clusterName=<my_prod_cluster> incubator/azuremonitor-containers
    

    If the Log Analytics workspace is in Azure China, which uses the opinsights.azure.cn domain, run the following command instead:

    $ helm install --name myrelease-1 \
     --set omsagent.domain=opinsights.azure.cn,omsagent.secret.wsid=<logAnalyticsWorkspaceId>,omsagent.secret.key=<logAnalyticsWorkspaceKey>,omsagent.env.clusterName=<your_cluster_name> incubator/azuremonitor-containers
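The two helm install commands above differ only in the omsagent.domain setting, and a Helm version 3 client takes the release name as a positional argument instead of --name. The following sketch assembles the argument list for either case so that it can be passed to a process launcher such as subprocess.run. The function name and its parameters are hypothetical. The chart name and the omsagent settings come from this article.

```python
CHART = "incubator/azuremonitor-containers"


def build_helm_install_args(release, workspace_id, workspace_key, cluster_name,
                            domain=None, proxy=None, helm_major=2):
    """Return the argv list for 'helm install' with the omsagent settings."""
    settings = []
    if domain:
        settings.append(("omsagent.domain", domain))
    settings.append(("omsagent.secret.wsid", workspace_id))
    settings.append(("omsagent.secret.key", workspace_key))
    settings.append(("omsagent.env.clusterName", cluster_name))
    if proxy:
        settings.append(("omsagent.proxy", proxy))
    # Helm separates --set entries with commas, so commas inside a value are escaped.
    joined = ",".join(
        "{}={}".format(key, value.replace(",", "\\,")) for key, value in settings
    )
    if helm_major == 2:
        args = ["helm", "install", "--name", release]
    else:
        # Helm 3 takes the release name as a positional argument.
        args = ["helm", "install", release]
    return args + ["--set", joined, CHART]
```

Because the result is a list rather than a shell string, the workspace key does not need shell quoting.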
    

Enable the Helm chart using the API Model

You can specify an addon in the AKS Engine cluster specification json file, also referred to as the API Model. In this addon, provide the base64 encoded version of WorkspaceGUID and WorkspaceKey of the Log Analytics workspace where the collected monitoring data is stored. You can find the WorkspaceGUID and WorkspaceKey by using steps 1 and 2 in the previous section.

Supported API definitions for the Azure Stack Hub cluster can be found in this example - kubernetes-container-monitoring_existing_workspace_id_and_key.json. Specifically, find the addons property in kubernetesConfig:

"orchestratorType": "Kubernetes",
       "kubernetesConfig": {
         "addons": [
           {
             "name": "container-monitoring",
             "enabled": true,
             "config": {
               "workspaceGuid": "<Azure Log Analytics Workspace Id in Base-64 encoded>",
               "workspaceKey": "<Azure Log Analytics Workspace Key in Base-64 encoded>"
             }
           }
         ]
       }
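The addon expects both values in base64 form. The following sketch encodes the workspace ID and key and builds the addon entry shown above. The function names are hypothetical, and the sketch assumes that both values are encoded as UTF-8 text, including the key, which is treated as an opaque string.

```python
import base64


def encode_for_api_model(value):
    """Return the base64 encoding of a text value as an ASCII string."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def container_monitoring_addon(workspace_guid, workspace_key):
    """Build the container-monitoring addon entry for kubernetesConfig.addons."""
    return {
        "name": "container-monitoring",
        "enabled": True,
        "config": {
            "workspaceGuid": encode_for_api_model(workspace_guid),
            "workspaceKey": encode_for_api_model(workspace_key),
        },
    }
```

Serializing the returned dictionary with json.dumps gives an entry that can be placed in the addons array of the API Model.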

Configure agent data collection

Starting with chart version 1.0.0, the agent data collection settings are controlled from the ConfigMap. For details, see the documentation about agent data collection settings.

After you have successfully deployed the chart, you can review the data for your hybrid Kubernetes cluster in Azure Monitor for containers from the Azure portal.

Note

Ingestion latency is around five to ten minutes from agent to commit in the Azure Log Analytics workspace. The status of the cluster shows the value No data or Unknown until all the required monitoring data is available in Azure Monitor.

Configure proxy endpoint

Starting with chart version 2.7.1, the chart supports specifying the proxy endpoint with the omsagent.proxy chart parameter. This allows the agent to communicate through your proxy server. Communication between the Azure Monitor for containers agent and Azure Monitor can go through an HTTP or HTTPS proxy server, and both anonymous and basic authentication (username/password) are supported.

The proxy configuration value has the following syntax: [protocol://][user:password@]proxyhost[:port]

Note

If your proxy server does not require authentication, you still need to specify a pseudo username/password. This can be any username or password.

Property     Description
protocol     http or https
user         Optional username for proxy authentication
password     Optional password for proxy authentication
proxyhost    Address or FQDN of the proxy server
port         Optional port number for the proxy server

For example: omsagent.proxy=http://user01:password@proxy01.contoso.com:8080
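To catch a malformed value before you install the chart, you can check it against the syntax above. The following sketch translates [protocol://][user:password@]proxyhost[:port] into a regular expression. It is an illustration of the grammar as written in this article, not the parser that the agent uses, and the function name is hypothetical.

```python
import re

# One optional group per bracketed part of the documented syntax.
_PROXY_PATTERN = re.compile(
    r"(?:(?P<protocol>https?)://)?"
    r"(?:(?P<user>[^:@/]+):(?P<password>[^@/]*)@)?"
    r"(?P<proxyhost>[^:@/]+)"
    r"(?::(?P<port>\d+))?"
)


def parse_proxy_value(value):
    """Split an omsagent.proxy value into its parts, or raise ValueError."""
    match = _PROXY_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError("value does not match the proxy syntax: " + value)
    parts = match.groupdict()
    if parts["port"] is not None:
        parts["port"] = int(parts["port"])
    return parts
```

Parts that are omitted from the value come back as None, so a caller can decide whether to require the protocol or the credentials.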

If you specify the protocol as http, the HTTP requests are created using an SSL/TLS secure connection. Your proxy server must support SSL/TLS protocols.

Troubleshooting

If you encounter an error while attempting to enable monitoring for your hybrid Kubernetes cluster, copy the PowerShell script TroubleshootError_nonAzureK8s.ps1 and save it to a folder on your computer. This script is provided to help detect and fix the issues encountered. It checks the following and attempts to correct any problems it finds:

  • The specified Log Analytics workspace is valid.
  • The Log Analytics workspace is configured with the Azure Monitor for Containers solution. If not, the script configures the workspace.
  • OmsAgent replicaset pods are running.
  • OmsAgent daemonset pods are running.
  • The OmsAgent Health service is running.
  • The Log Analytics workspace ID and key configured on the containerized agent match the workspace that the Insight is configured with.
  • All the Linux worker nodes have the kubernetes.io/role=agent label, which is needed to schedule the rs pod. If the label doesn't exist, the script adds it.
  • cAdvisor secure port 10250 or unsecure port 10255 is opened on all nodes in the cluster.
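The node label check in the list above can also be performed by hand against the output of kubectl get nodes -o json. The following sketch is not the troubleshooting script. It assumes that Linux nodes carry a kubernetes.io/os or beta.kubernetes.io/os label with the value linux, and that master nodes can be recognized by a master role label. The function name is hypothetical.

```python
import json

ROLE_LABEL = "kubernetes.io/role"
OS_LABELS = ("kubernetes.io/os", "beta.kubernetes.io/os")


def linux_workers_missing_agent_label(nodes_json):
    """Return names of Linux worker nodes that lack kubernetes.io/role=agent."""
    missing = []
    for node in json.loads(nodes_json).get("items", []):
        metadata = node.get("metadata", {})
        labels = metadata.get("labels") or {}
        os_name = next((labels[key] for key in OS_LABELS if key in labels), None)
        if os_name != "linux":
            continue  # only Linux nodes are checked
        is_master = (
            labels.get(ROLE_LABEL) == "master"
            or "node-role.kubernetes.io/master" in labels
        )
        if is_master:
            continue  # assumption: master nodes are not worker nodes
        if labels.get(ROLE_LABEL) != "agent":
            missing.append(metadata.get("name"))
    return missing
```

Each name that the function returns is a node where you would add the label, for example with kubectl label node <nodeName> kubernetes.io/role=agent.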

To run the script with Azure PowerShell, use the following command in the folder that contains the script:

.\TroubleshootError_nonAzureK8s.ps1 -azureLogAnalyticsWorkspaceResourceId /subscriptions/<subscriptionId>/resourceGroups/<resourcegroupName>/providers/Microsoft.OperationalInsights/workspaces/<workspaceName> -kubeConfig <kubeConfigFile> -clusterContextInKubeconfig <clusterContext>

Next steps

With monitoring enabled to collect health and resource utilization of your hybrid Kubernetes cluster and the workloads running on it, learn how to use Azure Monitor for containers.