Scaling options for applications in Azure Kubernetes Service (AKS)

As you run applications in Azure Kubernetes Service (AKS), you may need to increase or decrease the amount of compute resources. As the number of application instances you need changes, the number of underlying Kubernetes nodes may also need to change. You also might need to quickly provision a large number of additional application instances.

This article introduces the core concepts that help you scale applications in AKS:

Manually scale pods or nodes

You can manually scale replicas (pods) and nodes to test how your application responds to a change in available resources and state. Manually scaling resources also lets you define a set amount of resources to use to maintain a fixed cost, such as the number of nodes. To manually scale, you define the replica or node count. The Kubernetes API then schedules creating additional pods or draining nodes based on that replica or node count.
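For example, a minimal sketch of manual scaling with kubectl and the Azure CLI, assuming a deployment named my-app and a cluster named myAKSCluster in the resource group myResourceGroup (all hypothetical names):

```bash
# Manually scale the deployment's replicas (pods) to 3
kubectl scale deployment my-app --replicas=3

# Manually scale the AKS cluster to 3 nodes
az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 3
```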

To get started with manually scaling pods and nodes, see Scale applications in AKS.

Horizontal pod autoscaler

Kubernetes uses the horizontal pod autoscaler (HPA) to monitor resource demand and automatically scale the number of replicas. By default, the horizontal pod autoscaler checks the Metrics API every 30 seconds for any required changes in replica count. When changes are required, the number of replicas is increased or decreased accordingly. The horizontal pod autoscaler works with AKS clusters that have deployed the Metrics Server for Kubernetes 1.8+.

Diagram: Kubernetes horizontal pod autoscaling

When you configure the horizontal pod autoscaler for a given deployment, you define the minimum and maximum number of replicas that can run. You also define the metric to monitor and base scaling decisions on, such as CPU usage.
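As a sketch, assuming the same hypothetical my-app deployment, you could create a horizontal pod autoscaler imperatively with kubectl:

```bash
# Autoscale between 3 and 10 replicas, targeting 50% average CPU utilization;
# the deployment's pods should define CPU requests for this to work
kubectl autoscale deployment my-app --cpu-percent=50 --min=3 --max=10

# Review the autoscaler's current and target metrics
kubectl get hpa my-app
```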

To get started with the horizontal pod autoscaler in AKS, see Autoscale pods in AKS.

Cooldown of scaling events

Because the horizontal pod autoscaler checks the Metrics API every 30 seconds, previous scale events may not have successfully completed before another check is made. This behavior could cause the horizontal pod autoscaler to change the number of replicas before the previous scale event has had a chance to receive application workload and for resource demands to adjust accordingly.

To minimize these race events, cooldown or delay values are set. These values define how long the horizontal pod autoscaler must wait after a scale event before another scale event can be triggered. This behavior allows the new replica count to take effect and the Metrics API to reflect the distributed workload. By default, the delay on scale-up events is 3 minutes, and the delay on scale-down events is 5 minutes.
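To observe how these scale events and delays play out for a given autoscaler, you can inspect its recent events and current metrics, assuming an HPA object named my-app (a hypothetical name):

```bash
# Show current/target metrics and the recent scaling events recorded by the HPA
kubectl describe hpa my-app
```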

Currently, you can't tune these cooldown values from the default.

Scale up events

If a node doesn't have sufficient compute resources to run a requested pod, that pod can't progress through the scheduling process. The pod can't start unless additional compute resources are available within the node pool.

When the cluster autoscaler notices pods that can't be scheduled because of node pool resource constraints, the number of nodes within the node pool is increased to provide the additional compute resources. When those additional nodes are successfully deployed and available for use within the node pool, the pods are then scheduled to run on them.
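A minimal sketch of enabling the cluster autoscaler on an existing cluster, again using the hypothetical myResourceGroup and myAKSCluster names:

```bash
# Enable the cluster autoscaler and let the node count vary between 1 and 5
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 5
```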

If your application needs to scale rapidly, some pods may remain in a state waiting to be scheduled until the additional nodes deployed by the cluster autoscaler can accept them. For applications that have high burst demands, you can scale with virtual nodes and Azure Container Instances.
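As a sketch, the virtual nodes add-on can be enabled on an existing cluster; this assumes the cluster uses Azure CNI (advanced) networking and that a dedicated subnet named myVirtualNodeSubnet (a hypothetical name) exists for the virtual nodes:

```bash
# Enable the virtual nodes add-on so burst workloads can run on Azure Container Instances
az aks enable-addons \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --addons virtual-node \
  --subnet-name myVirtualNodeSubnet
```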

Scale down events

The cluster autoscaler also monitors the pod scheduling status for nodes that haven't recently received new scheduling requests. This scenario indicates the node pool has more compute resources than are required, and the number of nodes can be decreased.

By default, a node that passes the threshold for no longer being needed for 10 minutes is scheduled for deletion. When this situation occurs, pods are scheduled to run on other nodes within the node pool, and the cluster autoscaler decreases the number of nodes.
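On newer AKS and Azure CLI versions, a cluster autoscaler profile can be tuned; a hedged sketch of adjusting how long a node must be unneeded before it is considered for removal (the 10-minute default mentioned above), assuming the flag is available in your CLI version:

```bash
# Adjust the scale-down threshold for unneeded nodes (default 10m)
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --cluster-autoscaler-profile scale-down-unneeded-time=15m
```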

Your applications may experience some disruption as pods are scheduled on different nodes when the cluster autoscaler decreases the number of nodes. To minimize disruption, avoid applications that use a single pod instance.

Next steps

To get started with scaling applications, first follow the quickstart to create an AKS cluster with the Azure CLI. You can then start to manually or automatically scale applications in your AKS cluster:

For more information on core Kubernetes and AKS concepts, see the following articles: