Create and manage multiple node pools for a cluster in Azure Kubernetes Service (AKS)

In Azure Kubernetes Service (AKS), nodes of the same configuration are grouped together into node pools. These node pools contain the underlying VMs that run your applications. The initial number of nodes and their size (SKU) is defined when you create an AKS cluster, which creates a system node pool. To support applications that have different compute or storage demands, you can create additional user node pools. System node pools serve the primary purpose of hosting critical system pods such as CoreDNS and tunnelfront. User node pools serve the primary purpose of hosting your application-specific pods, though application pods can be scheduled on system node pools if you wish to only have one pool in your AKS cluster. For example, use these additional user node pools to provide GPUs for compute-intensive applications, or access to high-performance SSD storage.

Note

This feature enables more control over how to create and manage multiple node pools. As a result, separate commands are required for create/update/delete operations. Previously, cluster operations through az aks create or az aks update used the managedCluster API and were the only option to change your control plane and a single node pool. This feature exposes a separate operation set for agent pools through the agentPool API and requires use of the az aks nodepool command set to execute operations on an individual node pool.

This article shows you how to create and manage multiple node pools in an AKS cluster.

Before you begin

You need the Azure CLI version 2.2.0 or later installed and configured. Run az --version to find the version. If you need to install or upgrade, see Install Azure CLI.

Limitations

The following limitations apply when you create and manage AKS clusters that support multiple node pools:

  • See Quotas, virtual machine size restrictions, and region availability in Azure Kubernetes Service (AKS).
  • You can delete system node pools, provided you have another system node pool to take its place in the AKS cluster.
  • System pools must contain at least one node, and user node pools may contain zero or more nodes.
  • The AKS cluster must use the Standard SKU load balancer to use multiple node pools; the feature is not supported with Basic SKU load balancers.
  • The AKS cluster must use virtual machine scale sets for the nodes.
  • The name of a node pool may only contain lowercase alphanumeric characters and must begin with a lowercase letter. For Linux node pools, the length must be between 1 and 12 characters; for Windows node pools, between 1 and 6 characters.
  • All node pools must reside in the same virtual network.
  • When creating multiple node pools at cluster create time, all Kubernetes versions used by node pools must match the version set for the control plane. This can be updated after the cluster has been provisioned by using per-node-pool operations.
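The naming rules above can be checked locally before calling the CLI. A minimal sketch in shell (the helper names are hypothetical, not part of the Azure CLI):

```shell
# Hypothetical local validators for AKS node pool names:
# lowercase alphanumeric, starting with a lowercase letter,
# 1-12 characters for Linux pools, 1-6 for Windows pools.
valid_linux_pool_name()   { printf '%s' "$1" | grep -Eq '^[a-z][a-z0-9]{0,11}$'; }
valid_windows_pool_name() { printf '%s' "$1" | grep -Eq '^[a-z][a-z0-9]{0,5}$'; }
```

For example, mynodepool passes the Linux check, while MyNodePool (uppercase) and names longer than the limits do not.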

Create an AKS cluster

Important

If you run a single system node pool for your AKS cluster in a production environment, we recommend you use at least three nodes for the node pool.

To get started, create an AKS cluster with a single node pool. The following example uses the az group create command to create a resource group named myResourceGroup in the chinaeast2 region. An AKS cluster named myAKSCluster is then created using the az aks create command.

Note

The Basic load balancer SKU is not supported when using multiple node pools. By default, AKS clusters are created with the Standard load balancer SKU from the Azure CLI and Azure portal.

# Create a resource group in China East 2
az group create --name myResourceGroup --location chinaeast2

# Create a basic AKS cluster with two nodes
az aks create \
    --resource-group myResourceGroup \
    --name myAKSCluster \
    --vm-set-type VirtualMachineScaleSets \
    --node-count 2 \
    --generate-ssh-keys \
    --load-balancer-sku standard

It takes a few minutes to create the cluster.

Note

To ensure your cluster operates reliably, you should run at least two nodes in the default node pool, as essential system services run across this node pool.

When the cluster is ready, use the az aks get-credentials command to get the cluster credentials for use with kubectl:

az aks get-credentials --resource-group myResourceGroup --name myAKSCluster

Add a node pool

The cluster created in the previous step has a single node pool. Let's add a second node pool using the az aks nodepool add command. The following example creates a node pool named mynodepool that runs 3 nodes:

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mynodepool \
    --node-count 3

Note

The name of a node pool must start with a lowercase letter and can only contain alphanumeric characters. For Linux node pools, the length must be between 1 and 12 characters; for Windows node pools, between 1 and 6 characters.

To see the status of your node pools, use the az aks nodepool list command and specify your resource group and cluster name:

az aks nodepool list --resource-group myResourceGroup --cluster-name myAKSCluster

The following example output shows that mynodepool has been successfully created with three nodes in the node pool. When the AKS cluster was created in the previous step, a default nodepool1 was created with a node count of 2.

[
  {
    ...
    "count": 3,
    ...
    "name": "mynodepool",
    "orchestratorVersion": "1.15.7",
    ...
    "vmSize": "Standard_DS2_v2",
    ...
  },
  {
    ...
    "count": 2,
    ...
    "name": "nodepool1",
    "orchestratorVersion": "1.15.7",
    ...
    "vmSize": "Standard_DS2_v2",
    ...
  }
]

Tip

If no VmSize is specified when you add a node pool, the default size is Standard_D2s_v3 for Windows node pools and Standard_DS2_v2 for Linux node pools. If no OrchestratorVersion is specified, it defaults to the same version as the control plane.
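The defaults in the tip above can be expressed as a small lookup; a sketch with a hypothetical helper name (the sizes mirror the documented defaults):

```shell
# Hypothetical helper returning the documented default VM size per OS type.
default_vm_size() {
  case "$1" in
    Windows) echo "Standard_D2s_v3" ;;  # default for Windows node pools
    Linux)   echo "Standard_DS2_v2" ;;  # default for Linux node pools
    *)       return 1 ;;                # unknown OS type
  esac
}
```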

Add a node pool with a unique subnet (preview)

A workload may require splitting a cluster's nodes into separate pools for logical isolation. This isolation can be supported with separate subnets dedicated to each node pool in the cluster. This can address requirements such as splitting non-contiguous virtual network address space across node pools.

Limitations

  • All subnets assigned to node pools must belong to the same virtual network.
  • System pods must have access to all nodes in the cluster to provide critical functionality such as DNS resolution via coreDNS.
  • Assignment of a unique subnet per node pool is limited to Azure CNI during preview.
  • Using network policies with a unique subnet per node pool is not supported during preview.

To create a node pool with a dedicated subnet, pass the subnet resource ID as an additional parameter when creating the node pool.

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mynodepool \
    --node-count 3 \
    --vnet-subnet-id <YOUR_SUBNET_RESOURCE_ID>

Upgrade a node pool

Note

Upgrade and scale operations on a cluster or node pool cannot occur simultaneously; if attempted, an error is returned. Instead, each operation type must complete on the target resource before the next request on that same resource. Read more about this in our troubleshooting guide.

The commands in this section explain how to upgrade a single specific node pool. The relationship between upgrading the Kubernetes version of the control plane and the node pools is explained in the section below.

Note

The node pool OS image version is tied to the Kubernetes version of the cluster. You only get OS image upgrades following a cluster upgrade.

Since there are two node pools in this example, we must use az aks nodepool upgrade to upgrade a node pool. To see the available upgrades, use az aks get-upgrades:

az aks get-upgrades --resource-group myResourceGroup --name myAKSCluster

Let's upgrade mynodepool. Use the az aks nodepool upgrade command to upgrade the node pool, as shown in the following example:

az aks nodepool upgrade \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mynodepool \
    --kubernetes-version KUBERNETES_VERSION \
    --no-wait

List the status of your node pools again using the az aks nodepool list command. The following example shows that mynodepool is in the Upgrading state to KUBERNETES_VERSION:

az aks nodepool list -g myResourceGroup --cluster-name myAKSCluster
[
  {
    ...
    "count": 3,
    ...
    "name": "mynodepool",
    "orchestratorVersion": "KUBERNETES_VERSION",
    ...
    "provisioningState": "Upgrading",
    ...
    "vmSize": "Standard_DS2_v2",
    ...
  },
  {
    ...
    "count": 2,
    ...
    "name": "nodepool1",
    "orchestratorVersion": "1.15.7",
    ...
    "provisioningState": "Succeeded",
    ...
    "vmSize": "Standard_DS2_v2",
    ...
  }
]

It takes a few minutes to upgrade the nodes to the specified version.

As a best practice, you should upgrade all node pools in an AKS cluster to the same Kubernetes version. The default behavior of az aks upgrade is to upgrade all node pools together with the control plane to achieve this alignment. The ability to upgrade individual node pools lets you perform a rolling upgrade and schedule pods between node pools to maintain application uptime within the constraints mentioned above.

Upgrade a cluster control plane with multiple node pools

Note

Kubernetes uses the standard Semantic Versioning scheme. The version number is expressed as x.y.z, where x is the major version, y is the minor version, and z is the patch version. For example, in version 1.12.6, 1 is the major version, 12 is the minor version, and 6 is the patch version. The Kubernetes version of the control plane and the initial node pool are set during cluster creation. All additional node pools have their Kubernetes version set when they are added to the cluster. The Kubernetes versions may differ between node pools as well as between a node pool and the control plane.

An AKS cluster has two cluster resource objects with associated Kubernetes versions:

  1. The cluster control plane Kubernetes version.
  2. A node pool with a Kubernetes version.

A control plane maps to one or many node pools. The behavior of an upgrade operation depends on which Azure CLI command is used.

Upgrading an AKS control plane requires using az aks upgrade. This command upgrades the control plane version and all node pools in the cluster.

Issuing the az aks upgrade command with the --control-plane-only flag upgrades only the cluster control plane; none of the associated node pools in the cluster are changed.

Upgrading individual node pools requires using az aks nodepool upgrade. This command upgrades only the target node pool with the specified Kubernetes version.

Validation rules for upgrades

The valid Kubernetes upgrades for a cluster's control plane and node pools are validated by the following sets of rules.

  • Rules for valid versions to upgrade node pools:

    • The node pool version must have the same major version as the control plane.
    • The node pool minor version must be within two minor versions of the control plane version.
    • The node pool version cannot be greater than the control plane major.minor.patch version.
  • Rules for submitting an upgrade operation:

    • You cannot downgrade the control plane or a node pool Kubernetes version.
    • If a node pool Kubernetes version is not specified, behavior depends on the client being used. Declaration in Resource Manager templates falls back to the existing version defined for the node pool, if one is set; if none is set, the control plane version is used.
    • You can either upgrade or scale a control plane or a node pool at a given time; you cannot submit multiple operations on a single control plane or node pool resource simultaneously.
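The version rules above can be sketched as a local pre-check. The helper below is hypothetical; it compares only the major and minor components and omits the patch-level comparison:

```shell
# Hypothetical check: is node pool version $2 valid under control
# plane version $1? (x.y.z versions; patch comparison omitted.)
can_use_nodepool_version() {
  cp_major=${1%%.*}; cp_rest=${1#*.}; cp_minor=${cp_rest%%.*}
  np_major=${2%%.*}; np_rest=${2#*.}; np_minor=${np_rest%%.*}
  # Same major version as the control plane.
  [ "$np_major" -eq "$cp_major" ] || return 1
  # Not newer than the control plane...
  [ "$np_minor" -le "$cp_minor" ] || return 1
  # ...and within two minor versions of it.
  [ $((cp_minor - np_minor)) -le 2 ]
}
```

With a 1.15.7 control plane, a 1.13.x node pool passes, while 1.12.x (more than two minors behind) and 1.16.x (newer than the control plane) fail.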

Scale a node pool manually

As your application workload demands change, you may need to scale the number of nodes in a node pool. The number of nodes can be scaled up or down.

To scale the number of nodes in a node pool, use the az aks nodepool scale command. The following example scales the number of nodes in mynodepool to 5:

az aks nodepool scale \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name mynodepool \
    --node-count 5 \
    --no-wait

List the status of your node pools again using the az aks nodepool list command. The following example shows that mynodepool is in the Scaling state with a new count of 5 nodes:

az aks nodepool list -g myResourceGroup --cluster-name myAKSCluster
[
  {
    ...
    "count": 5,
    ...
    "name": "mynodepool",
    "orchestratorVersion": "1.15.7",
    ...
    "provisioningState": "Scaling",
    ...
    "vmSize": "Standard_DS2_v2",
    ...
  },
  {
    ...
    "count": 2,
    ...
    "name": "nodepool1",
    "orchestratorVersion": "1.15.7",
    ...
    "provisioningState": "Succeeded",
    ...
    "vmSize": "Standard_DS2_v2",
    ...
  }
]

It takes a few minutes for the scale operation to complete.

Scale a specific node pool automatically by enabling the cluster autoscaler

AKS offers a separate feature to automatically scale node pools, called the cluster autoscaler. This feature can be enabled per node pool, with unique minimum and maximum scale counts per node pool. Learn how to use the cluster autoscaler per node pool.

Delete a node pool

If you no longer need a pool, you can delete it and remove the underlying VM nodes. To delete a node pool, use the az aks nodepool delete command and specify the node pool name. The following example deletes the mynodepool created in the previous steps:

Caution

There are no recovery options for data loss that may occur when you delete a node pool. If pods can't be scheduled on other node pools, those applications become unavailable. Make sure you don't delete a node pool when in-use applications don't have data backups or the ability to run on other node pools in your cluster.

az aks nodepool delete -g myResourceGroup --cluster-name myAKSCluster --name mynodepool --no-wait

The following example output from the az aks nodepool list command shows that mynodepool is in the Deleting state:

az aks nodepool list -g myResourceGroup --cluster-name myAKSCluster
[
  {
    ...
    "count": 5,
    ...
    "name": "mynodepool",
    "orchestratorVersion": "1.15.7",
    ...
    "provisioningState": "Deleting",
    ...
    "vmSize": "Standard_DS2_v2",
    ...
  },
  {
    ...
    "count": 2,
    ...
    "name": "nodepool1",
    "orchestratorVersion": "1.15.7",
    ...
    "provisioningState": "Succeeded",
    ...
    "vmSize": "Standard_DS2_v2",
    ...
  }
]

It takes a few minutes to delete the nodes and the node pool.

Specify a VM size for a node pool

In the previous examples to create a node pool, a default VM size was used for the nodes created in the cluster. A more common scenario is for you to create node pools with different VM sizes and capabilities. For example, you may create a node pool that contains nodes with large amounts of CPU or memory, or a node pool that provides GPU support. In the next step, you use taints and tolerations to tell the Kubernetes scheduler how to limit access to pods that can run on these nodes.

In the following example, create a GPU-based node pool that uses the Standard_NC6s_v3 VM size. These VMs are powered by the NVIDIA Tesla V100 card. For information on available VM sizes, see Sizes for Linux virtual machines in Azure.

Create a node pool using the az aks nodepool add command again. This time, specify the name gpunodepool, and use the --node-vm-size parameter to specify the Standard_NC6s_v3 size:

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name gpunodepool \
    --node-count 1 \
    --node-vm-size Standard_NC6s_v3 \
    --no-wait

The following example output from the az aks nodepool list command shows that gpunodepool is Creating nodes with the specified VmSize:

az aks nodepool list -g myResourceGroup --cluster-name myAKSCluster
[
  {
    ...
    "count": 1,
    ...
    "name": "gpunodepool",
    "orchestratorVersion": "1.15.7",
    ...
    "provisioningState": "Creating",
    ...
    "vmSize": "Standard_NC6s_v3",
    ...
  },
  {
    ...
    "count": 2,
    ...
    "name": "nodepool1",
    "orchestratorVersion": "1.15.7",
    ...
    "provisioningState": "Succeeded",
    ...
    "vmSize": "Standard_DS2_v2",
    ...
  }
]

It takes a few minutes for the gpunodepool to be successfully created.

Schedule pods using taints and tolerations

You now have two node pools in your cluster - the default node pool initially created, and the GPU-based node pool. Use the kubectl get nodes command to view the nodes in your cluster. The following example output shows the nodes:

kubectl get nodes
NAME                                 STATUS   ROLES   AGE     VERSION
aks-gpunodepool-28993262-vmss000000  Ready    agent   4m22s   v1.15.7
aks-nodepool1-28993262-vmss000000    Ready    agent   115m    v1.15.7

The Kubernetes scheduler can use taints and tolerations to restrict what workloads can run on nodes.

  • A taint is applied to a node to indicate that only specific pods can be scheduled on it.
  • A toleration is then applied to a pod that allows it to tolerate a node's taint.

For more information on how to use advanced Kubernetes scheduling features, see Best practices for advanced scheduler features in AKS.

In this example, apply a taint to your GPU-based node using the kubectl taint command. Specify the name of your GPU-based node from the output of the previous kubectl get nodes command. The taint is applied as a key=value pair, followed by a scheduling effect. The following example uses the sku=gpu pair with the NoSchedule effect:

kubectl taint nodes aks-gpunodepool-28993262-vmss000000 sku=gpu:NoSchedule

The following basic example YAML manifest uses a toleration to allow the Kubernetes scheduler to run an NGINX pod on the GPU-based node. For a more appropriate, but time-intensive example to run a TensorFlow job against the MNIST dataset, see Use GPUs for compute-intensive workloads on AKS.

Create a file named gpu-toleration.yaml and copy in the following example YAML:

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - image: dockerhub.azk8s.cn/library/nginx:1.15.9
    name: mypod
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 1
        memory: 2G
  tolerations:
  - key: "sku"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"

Schedule the pod using the kubectl apply -f gpu-toleration.yaml command:

kubectl apply -f gpu-toleration.yaml

It takes a few seconds to schedule the pod and pull the NGINX image. Use the kubectl describe pod command to view the pod status. The following condensed example output shows that the sku=gpu:NoSchedule toleration is applied. In the events section, the scheduler has assigned the pod to the aks-gpunodepool-28993262-vmss000000 GPU-based node:

kubectl describe pod mypod
[...]
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
                 sku=gpu:NoSchedule
Events:
  Type    Reason     Age    From                                          Message
  ----    ------     ----   ----                                          -------
  Normal  Scheduled  4m48s  default-scheduler                             Successfully assigned default/mypod to aks-gpunodepool-28993262-vmss000000
  Normal  Pulling    4m47s  kubelet, aks-gpunodepool-28993262-vmss000000  pulling image "nginx:1.15.9"
  Normal  Pulled     4m43s  kubelet, aks-gpunodepool-28993262-vmss000000  Successfully pulled image "nginx:1.15.9"
  Normal  Created    4m40s  kubelet, aks-gpunodepool-28993262-vmss000000  Created container
  Normal  Started    4m40s  kubelet, aks-gpunodepool-28993262-vmss000000  Started container

Only pods that have this toleration applied can be scheduled on nodes in gpunodepool. Any other pod would be scheduled in the nodepool1 node pool. If you create additional node pools, you can use additional taints and tolerations to limit what pods can be scheduled on those node resources.

Specify a taint, label, or tag for a node pool

When creating a node pool, you can add taints, labels, or tags to it. When you add a taint, label, or tag, all nodes within that node pool also get that taint, label, or tag.

To create a node pool with a taint, use az aks nodepool add. Specify the name taintnp and use the --node-taints parameter to specify sku=gpu:NoSchedule for the taint.

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name taintnp \
    --node-count 1 \
    --node-taints sku=gpu:NoSchedule \
    --no-wait

Note

Taints can only be set for node pools during node pool creation.

The following example output from the az aks nodepool list command shows that taintnp is Creating nodes with the specified nodeTaints:

az aks nodepool list -g myResourceGroup --cluster-name myAKSCluster

[
  {
    ...
    "count": 1,
    ...
    "name": "taintnp",
    "orchestratorVersion": "1.15.7",
    ...
    "provisioningState": "Creating",
    ...
    "nodeTaints":  [
      "sku=gpu:NoSchedule"
    ],
    ...
  },
 ...
]

The taint information is visible in Kubernetes for handling scheduling rules for nodes.

You can also add labels to a node pool during node pool creation. Labels set at the node pool are added to each node in the node pool. These labels are visible in Kubernetes for handling scheduling rules for nodes.

To create a node pool with a label, use az aks nodepool add. Specify the name labelnp and use the --labels parameter to specify dept=IT and costcenter=9999 for labels.

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name labelnp \
    --node-count 1 \
    --labels dept=IT costcenter=9999 \
    --no-wait

Note

Labels can only be set for node pools during node pool creation. Labels must also be a key/value pair and have a valid syntax.

The following example output from the az aks nodepool list command shows that labelnp is Creating nodes with the specified nodeLabels:

az aks nodepool list -g myResourceGroup --cluster-name myAKSCluster

[
  {
    ...
    "count": 1,
    ...
    "name": "labelnp",
    "orchestratorVersion": "1.15.7",
    ...
    "provisioningState": "Creating",
    ...
    "nodeLabels":  {
      "dept": "IT",
      "costcenter": "9999"
    },
    ...
  },
 ...
]

Setting nodepool Azure tags

You can apply an Azure tag to node pools in your AKS cluster. Tags applied to a node pool are applied to each node within the node pool and are persisted through upgrades. Tags are also applied to new nodes added to a node pool during scale-out operations. Adding a tag can help with tasks such as policy tracking or cost estimation.

Azure tag keys are case-insensitive for operations, such as when retrieving a tag by searching for the key. In this case, a tag with the given key is updated or retrieved regardless of casing. Tag values are case-sensitive.

In AKS, if multiple tags are set with identical keys but different casing, the tag used is the first in alphabetical order. For example, {"Key1": "val1", "kEy1": "val2", "key1": "val3"} results in Key1 and val1 being set.

Create a node pool using the az aks nodepool add command. Specify the name tagnodepool and use the --tags parameter to specify dept=IT and costcenter=9999 for tags.
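One way to reproduce the documented tie-break example locally is a byte-order sort, where uppercase letters sort before lowercase (this demonstrates the example above, not the AKS implementation itself):

```shell
# Among keys differing only in casing, the first in alphabetical
# (C-locale byte) order wins: 'K' (0x4B) sorts before 'k' (0x6B).
winning_key=$(printf 'Key1\nkEy1\nkey1\n' | LC_ALL=C sort | head -n 1)
echo "$winning_key"
```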

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name tagnodepool \
    --node-count 1 \
    --tags dept=IT costcenter=9999 \
    --no-wait

Note

You can also use the --tags parameter with the az aks nodepool update command, as well as during cluster creation. During cluster creation, the --tags parameter applies the tags to the initial node pool created with the cluster. All tag names must adhere to the limitations in Use tags to organize your Azure resources. Updating a node pool with the --tags parameter updates any existing tag values and appends any new tags. For example, if your node pool had dept=IT and costcenter=9999 for tags and you updated it with team=dev and costcenter=111, your node pool would have dept=IT, costcenter=111, and team=dev for tags.

The following example output from the az aks nodepool list command shows that tagnodepool is Creating nodes with the specified tags:
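The merge behavior described in the note (existing keys overwritten, new keys appended) can be simulated locally in bash; this only models the merge, it does not call the Azure CLI:

```shell
#!/usr/bin/env bash
# Simulate how --tags on an update merges with a node pool's existing tags:
# existing keys are updated in place, new keys are appended.
declare -A tags=( [dept]=IT [costcenter]=9999 )   # existing node pool tags
for kv in team=dev costcenter=111; do             # tags passed on the update
  tags[${kv%%=*}]=${kv#*=}
done
```

After the loop, the simulated tag set is dept=IT, costcenter=111, and team=dev, matching the example in the note.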

az aks nodepool list -g myResourceGroup --cluster-name myAKSCluster
[
  {
    ...
    "count": 1,
    ...
    "name": "tagnodepool",
    "orchestratorVersion": "1.15.7",
    ...
    "provisioningState": "Creating",
    ...
    "tags": {
      "dept": "IT",
      "costcenter": "9999"
    },
    ...
  },
 ...
]
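完整的 JSON 输出较长时,可以用标准文本工具只提取感兴趣的字段。When the full JSON output is long, standard text tools can pull out just the fields of interest. A minimal sketch against a saved copy of the output (the file name sample.json and its trimmed content are assumptions for illustration):

```shell
# A trimmed, saved copy of the "az aks nodepool list" output shown above.
cat > sample.json <<'EOF'
[
  {
    "count": 1,
    "name": "tagnodepool",
    "provisioningState": "Creating",
    "tags": {
      "dept": "IT",
      "costcenter": "9999"
    }
  }
]
EOF
# Show only the pool name and its provisioning state.
grep -E '"(name|provisioningState)"' sample.json
```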

使用资源管理器模板管理节点池Manage node pools using a Resource Manager template

使用 Azure 资源管理器模板创建和管理资源时,通常可以更新该模板中的设置,然后重新部署以更新资源。When you use an Azure Resource Manager template to create and manage resources, you can typically update the settings in your template and redeploy to update the resource. 对于 AKS 中的节点池,一旦创建 AKS 群集,就无法更新初始节点池的配置文件。With node pools in AKS, the initial node pool profile can't be updated once the AKS cluster has been created. 此行为意味着无法更新现有的资源管理器模板,更改节点池,然后重新部署。This behavior means that you can't update an existing Resource Manager template, make a change to the node pools, and redeploy. 必须改为创建单独的资源管理器模板,仅更新现有 AKS 群集的节点池。Instead, you must create a separate Resource Manager template that updates only the node pools for an existing AKS cluster.

创建一个模板(例如 aks-agentpools.json)并粘贴以下示例清单。Create a template such as aks-agentpools.json and paste the following example manifest. 此示例模板配置以下设置:This example template configures the following settings:

  • 将名为 myagentpool 的 Linux 节点池更新为运行三个节点。Updates the Linux node pool named myagentpool to run three nodes.
  • 将节点池中的节点设置为运行 Kubernetes 版本 1.15.7。Sets the nodes in the node pool to run Kubernetes version 1.15.7.
  • 将节点大小定义为 Standard_DS2_v2Defines the node size as Standard_DS2_v2.

根据需要编辑这些值,以更新、添加或删除节点池:Edit these values as needed to update, add, or delete node pools:

{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "clusterName": {
            "type": "string",
            "metadata": {
                "description": "The name of your existing AKS cluster."
            }
        },
        "location": {
            "type": "string",
            "metadata": {
                "description": "The location of your existing AKS cluster."
            }
        },
        "agentPoolName": {
            "type": "string",
            "defaultValue": "myagentpool",
            "metadata": {
                "description": "The name of the agent pool to create or update."
            }
        },
        "vnetSubnetId": {
            "type": "string",
            "defaultValue": "",
            "metadata": {
                "description": "The Vnet subnet resource ID for your existing AKS cluster."
            }
        }
    },
    "variables": {
        "apiVersion": {
            "aks": "2020-01-01"
        },
        "agentPoolProfiles": {
            "maxPods": 30,
            "osDiskSizeGB": 0,
            "agentCount": 3,
            "agentVmSize": "Standard_DS2_v2",
            "osType": "Linux",
            "vnetSubnetId": "[parameters('vnetSubnetId')]"
        }
    },
    "resources": [
        {
            "apiVersion": "2020-01-01",
            "type": "Microsoft.ContainerService/managedClusters/agentPools",
            "name": "[concat(parameters('clusterName'),'/', parameters('agentPoolName'))]",
            "location": "[parameters('location')]",
            "properties": {
                "maxPods": "[variables('agentPoolProfiles').maxPods]",
                "osDiskSizeGB": "[variables('agentPoolProfiles').osDiskSizeGB]",
                "count": "[variables('agentPoolProfiles').agentCount]",
                "vmSize": "[variables('agentPoolProfiles').agentVmSize]",
                "osType": "[variables('agentPoolProfiles').osType]",
                "storageProfile": "ManagedDisks",
                "type": "VirtualMachineScaleSets",
                "vnetSubnetID": "[variables('agentPoolProfiles').vnetSubnetId]",
                "orchestratorVersion": "1.15.7"
            }
        }
    ]
}

如以下示例中所示,使用 az group deployment create 命令部署此模板。Deploy this template using the az group deployment create command, as shown in the following example. 系统将提示输入现有的 AKS 群集名称和位置:You are prompted for the existing AKS cluster name and location:

az group deployment create \
    --resource-group myResourceGroup \
    --template-file aks-agentpools.json
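上述命令会提示输入 clusterName 和 location;使用参数文件可以避免提示。The command above prompts for clusterName and location; a parameters file avoids the prompts. A minimal sketch (the file name and values, such as the chinaeast2 location, are hypothetical placeholders):

```shell
# Write a hypothetical parameters file so the deployment no longer prompts.
cat > aks-agentpools.parameters.json <<'EOF'
{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "clusterName": { "value": "myAKSCluster" },
        "location": { "value": "chinaeast2" }
    }
}
EOF
# Sanity-check that the file is valid JSON before deploying.
python3 -m json.tool aks-agentpools.parameters.json > /dev/null && echo "parameters file OK"
```

然后将其传递给部署命令:--parameters @aks-agentpools.parameters.json。Then pass it to the deployment command with --parameters @aks-agentpools.parameters.json.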

提示

可以通过在模板中添加 tags 属性,将标记添加到节点池,如以下示例所示。You can add tags to your node pool by adding the tags property in the template, as shown in the following example.

...
"resources": [
{
  ...
  "properties": {
    ...
    "tags": {
      "name1": "val1"
    },
    ...
  }
}
...

更新 AKS 群集可能需要花费几分钟时间,具体取决于资源管理器模板中定义的节点池设置和操作。It may take a few minutes to update your AKS cluster depending on the node pool settings and operations you define in your Resource Manager template.

清理资源Clean up resources

在本文中,你已创建包含基于 GPU 的节点的 AKS 群集。In this article, you created an AKS cluster that includes GPU-based nodes. 为了减少不必要的费用,我们建议删除 gpunodepool 或整个 AKS 群集。To reduce unnecessary cost, you may want to delete the gpunodepool, or the whole AKS cluster.

若要删除基于 GPU 的节点池,请使用 az aks nodepool delete 命令,如以下示例所示:To delete the GPU-based node pool, use the az aks nodepool delete command as shown in the following example:

az aks nodepool delete -g myResourceGroup --cluster-name myAKSCluster --name gpunodepool

若要删除群集本身,请使用 az group delete 命令删除 AKS 资源组:To delete the cluster itself, use the az group delete command to delete the AKS resource group:

az group delete --name myResourceGroup --yes --no-wait

后续步骤Next steps

详细了解系统节点池Learn more about system node pools.

本文已介绍如何在 AKS 群集中创建和管理多个节点池。In this article, you learned how to create and manage multiple node pools in an AKS cluster. 有关如何跨节点池控制 pod 的详细信息,请参阅有关 AKS 中的高级计划程序功能的最佳做法For more information about how to control pods across node pools, see Best practices for advanced scheduler features in AKS.

要创建和使用 Windows Server 容器节点池,请参阅在 AKS 中创建 Windows Server 容器To create and use Windows Server container node pools, see Create a Windows Server container in AKS.