Understand monitoring costs for Container insights

This article provides pricing guidance for Container insights to help you understand the following:

  • How to estimate costs up front before you enable this insight

  • How to measure costs after Container insights has been enabled for one or more containers

  • How to control data collection and reduce costs

Azure Monitor Logs collects, indexes, and stores data generated by your Kubernetes cluster.

The Azure Monitor pricing model is primarily based on the amount of data ingested into your Log Analytics workspace, in gigabytes per day. The cost of a Log Analytics workspace isn't based only on the volume of data collected; it also depends on the plan selected and on how long you choose to store the data generated from your clusters.

Note

All sizes and pricing are for sample estimation only. Refer to the Azure Monitor pricing page for the most recent pricing based on your Azure Monitor Log Analytics pricing model and Azure region.

The following is a summary of the types of data collected from a Kubernetes cluster with Container insights that influence cost and can be customized based on your usage:

  • Stdout and stderr container logs from every monitored container in every Kubernetes namespace in the cluster

  • Container environment variables from every monitored container in the cluster

  • Completed Kubernetes jobs/pods in the cluster that don't require monitoring

  • Active scraping of Prometheus metrics

  • Diagnostic log collection of Kubernetes master node logs in your AKS cluster, to analyze log data generated by master components such as kube-apiserver and kube-controller-manager

What is collected from Kubernetes clusters

Container insights includes a predefined set of metrics and inventory items that are collected and written as log data in your Log Analytics workspace. All of the metrics listed below are collected every minute by default.

Node metrics collected

The following 24 metrics are collected per node:

  • cpuUsageNanoCores
  • cpuCapacityNanoCores
  • cpuAllocatableNanoCores
  • memoryRssBytes
  • memoryWorkingSetBytes
  • memoryCapacityBytes
  • memoryAllocatableBytes
  • restartTimeEpoch
  • used (disk)
  • free (disk)
  • used_percent (disk)
  • io_time (diskio)
  • writes (diskio)
  • reads (diskio)
  • write_bytes (diskio)
  • write_time (diskio)
  • iops_in_progress (diskio)
  • read_bytes (diskio)
  • read_time (diskio)
  • err_in (net)
  • err_out (net)
  • bytes_recv (net)
  • bytes_sent (net)
  • Kubelet_docker_operations (kubelet)

Container metrics

The following eight metrics are collected per container:

  • cpuUsageNanoCores
  • cpuRequestNanoCores
  • cpuLimitNanoCores
  • memoryRssBytes
  • memoryWorkingSetBytes
  • memoryRequestBytes
  • memoryLimitBytes
  • restartTimeEpoch

Cluster inventory

The following cluster inventory data is collected by default:

  • KubePodInventory - 1 per container per minute
  • KubeNodeInventory - 1 per node per minute
  • KubeServices - 1 per service per minute
  • ContainerInventory - 1 per container per minute

Estimating costs to monitor your AKS cluster

The estimate below is based on an Azure Kubernetes Service (AKS) cluster with the following example sizing. The estimate applies only to the metrics and inventory data collected. Container logs (stdout, stderr, and environment variables) vary with the log sizes generated by the workload, so they are excluded from this estimate.

If you enabled monitoring of an AKS cluster configured as follows:

  • Three nodes
  • Two disks per node
  • One network interface per node
  • 20 pods (one container in each pod = 20 containers in total)
  • Two Kubernetes namespaces
  • Five Kubernetes services (includes the kube-system pods, services, and namespace)
  • Collection frequency = 60 seconds (default)

You can see the tables and volume of data generated per hour in the assigned Log Analytics workspace. For more information about each of these tables, see Container records.

Table                 Size estimate (MB/hour)
Perf                  12.9
InsightsMetrics       11.3
KubePodInventory      1.5
KubeNodeInventory     0.75
KubeServices          0.13
ContainerInventory    3.6
KubeHealth            0.1
KubeMonAgentEvents    0.005

Total = 31 MB/hour ≈ 23.1 GB/month (31 MB × 24 hours × 31 days ≈ 23,064 MB, assuming a 31-day month).

Using the default pricing for Log Analytics, which is a pay-as-you-go model, you can estimate the Azure Monitor cost per month. If a capacity reservation is included, the price per month would be higher, depending on the reservation selected.

Controlling ingestion to reduce cost

Consider a scenario where your organization's different business units share Kubernetes infrastructure and a Log Analytics workspace, with each business unit separated by a Kubernetes namespace. You can visualize how much data is ingested in each workspace by using the Data Usage workbook, which is available from the View Workbooks dropdown.

View workbooks dropdown

This workbook helps you visualize the source of your data without having to build your own library of queries from what we share in our documentation. In this workbook, there are charts that let you view billable data from perspectives such as:

  • Total billable data ingested in GB by solution
  • Billable data ingested by container logs (application logs)
  • Billable container log data ingested per Kubernetes namespace
  • Billable container log data ingested, segregated by cluster name
  • Billable container log data ingested by log source entry
  • Billable diagnostic data ingested by diagnostic master node logs

Data usage workbook

To learn about managing rights and permissions to the workbook, review Access control.

After completing your analysis to determine which source or sources are generating the most data, or data that exceeds your requirements, you can reconfigure data collection. Details on configuring collection of stdout, stderr, and environment variables are described in the Configure agent data collection settings article.

The following are examples of changes you can apply to your cluster by modifying the ConfigMap file to help control cost.

  1. Disable stdout logs across all namespaces in the cluster by modifying the following in the ConfigMap file:

    [log_collection_settings]       
       [log_collection_settings.stdout]          
          enabled = false
    
  2. Disable collecting stderr logs from your development namespace (for example, dev-test), and continue collecting stderr logs from other namespaces (for example, prod and default), by modifying the following in the ConfigMap file:

    Note

    The kube-system log collection is disabled by default. That default setting is retained; adding the dev-test namespace to the list of excluded namespaces applies only to stderr log collection.

    [log_collection_settings.stderr]
       enabled = true
       exclude_namespaces = ["kube-system", "dev-test"]
    
  3. Disable environment variable collection across the cluster by modifying the following in the ConfigMap file. This applies to all containers in every Kubernetes namespace. (To opt out a single container instead, see the sketch after this list.)

    [log_collection_settings.env_var]
        enabled = false
    
  4. To clean up completed jobs, specify the cleanup policy in your job definition, as shown below:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: pi-with-ttl
    spec:
      ttlSecondsAfterFinished: 100
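
Item 3 above turns off environment variable collection for the entire cluster. If you only need to exclude a single container, the Container insights agent also recognizes the AZMON_COLLECT_ENV environment variable on individual containers. The fragment below is a minimal, hypothetical sketch of that approach (the pod, container, and image names are placeholders); it goes in your workload manifest, not in the monitoring ConfigMap.

    # Hypothetical pod spec: disables environment variable collection for this
    # one container only, leaving the cluster-wide ConfigMap setting untouched.
    apiVersion: v1
    kind: Pod
    metadata:
      name: sample-app                # placeholder name
    spec:
      containers:
      - name: sample-container        # placeholder container
        image: nginx:stable           # placeholder image
        env:
        - name: AZMON_COLLECT_ENV     # read by the Container insights agent
          value: "False"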
    

After applying one or more of these changes to your ConfigMaps, see Applying updated ConfigMap to apply it to your cluster.
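
As a reminder of what that step typically looks like, here is a minimal sketch. It assumes your edited agent configuration is saved locally as container-azm-ms-agentconfig.yaml, the file name used in the Container insights configuration examples; substitute your own file name if it differs.

    # Apply the edited monitoring ConfigMap to the cluster.
    # The agent picks up the new data collection settings shortly afterward.
    kubectl apply -f container-azm-ms-agentconfig.yaml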

Prometheus metrics scraping

If you are using Prometheus metric scraping, consider the following to limit the number of metrics that you collect from your cluster:

  • Ensure that the scraping frequency is set optimally (the default is 60 seconds). While you can increase the frequency to 15 seconds, you need to ensure that the metrics you are scraping are actually published at that frequency. Otherwise, many duplicate metrics will be scraped and sent to your Log Analytics workspace at intervals, adding to data ingestion and retention costs while providing little value.

  • Container insights supports exclusion and inclusion lists by metric name. For example, if you are scraping kubedns metrics in your cluster, hundreds of them might be scraped by default, but you are most likely only interested in a subset. Confirm that you specified a list of metrics to scrape, or exclude all but the few you need, to save on data ingestion volume (see the sketch after this list). It is easy to enable scraping and then not use many of those metrics, which only adds charges to your Log Analytics bill.

  • When scraping through pod annotations, ensure that you filter by namespace so that you exclude scraping of pod metrics from namespaces that you don't use (for example, the dev-test namespace).
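
All three of these controls live in the same agent ConfigMap as the log collection settings shown earlier. The fragment below is a minimal sketch of what that section can look like; the section and key names follow the container-azm-ms-agentconfig template used by Container insights, the metric names and namespace values are placeholders, and you should verify the exact schema against the current data collection configuration documentation.

    [prometheus_data_collection_settings.cluster]
        # Scrape interval; 1m is the default. Only shorten it if the metrics
        # you scrape are actually published that often.
        interval = "1m"

        # Keep only the metrics you need (inclusion list by metric name) ...
        fieldpass = ["example_requests_total"]
        # ... or drop the ones you don't (exclusion list by metric name).
        fielddrop = ["example_debug_metric_total"]

        # When scraping through pod annotations, restrict scraping to the
        # namespaces you actually use so that dev-test pods are excluded.
        monitor_kubernetes_pods = true
        monitor_kubernetes_pods_namespaces = ["prod", "default"]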

Next steps

To understand what your costs are likely to be based on recent usage patterns from the data collected with Container insights, see Manage your usage and estimate costs.