Connect with SSH to Azure Kubernetes Service (AKS) cluster nodes for maintenance or troubleshooting

Throughout the lifecycle of your Azure Kubernetes Service (AKS) cluster, you may need to access an AKS node. This access could be for maintenance, log collection, or other troubleshooting operations. You can access AKS nodes using SSH. For security purposes, the AKS nodes aren't exposed to the internet. To SSH to an AKS node, you use its private IP address.

This article shows you how to create an SSH connection with an AKS node using its private IP address.

Before you begin

This article assumes that you have an existing AKS cluster. If you need an AKS cluster, see the AKS quickstart using the Azure CLI or using the Azure portal.

By default, SSH keys are obtained or generated, then added to the nodes, when you create an AKS cluster. This article shows you how to specify SSH keys other than the ones used when you created your AKS cluster. It also shows you how to determine the private IP address of your node and connect to it using SSH. If you don't need to specify a different SSH key, you can skip the step for adding the SSH public key to the node.

This article also assumes you have an SSH key. You can create an SSH key using macOS or Linux. If you use PuTTYgen to create the key pair, save the key pair in OpenSSH format rather than the default PuTTY private key format (.ppk file).
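
As a rough sanity check on the key format, the sketch below tests whether a public key file looks like a one-line OpenSSH public key. The function name looks_like_openssh_pubkey and the list of accepted key types are assumptions made for illustration. They are not part of any Azure or OpenSSH tooling, and the check is not a full validation of the key.

```shell
# Hypothetical helper: succeeds when the first line of the file starts with
# a common OpenSSH key type followed by base64 data.
looks_like_openssh_pubkey() {
    # $1: path to the public key file
    head -n 1 "$1" 2>/dev/null |
        grep -Eq '^(ssh-(rsa|ed25519|dss)|ecdsa-sha2-[a-z0-9-]+) [A-Za-z0-9+/=]+'
}
```

For example, looks_like_openssh_pubkey ~/.ssh/id_rsa.pub should succeed for a key created by ssh-keygen, while a public key exported in PuTTY's multi-line format should fail the check.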

You also need the Azure CLI version 2.0.64 or later installed and configured. Run az --version to find the version. If you need to install or upgrade, see Install Azure CLI.
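
If you want to script the version check, one possible approach is sketched below. The helper name version_ge is hypothetical, and the sketch assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal
# to version $2. Relies on `sort -V`, which is assumed to be available.
version_ge() {
    # After a version sort, the smaller version comes first. $1 >= $2
    # exactly when $2 is the first line.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

Pass the version number reported by az --version as the first argument and 2.0.64 as the second.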

Configure virtual machine scale set-based AKS clusters for SSH access

To configure your virtual machine scale set-based AKS cluster for SSH access, find the name of your cluster's virtual machine scale set and add your SSH public key to that scale set.

Use the az aks show command to get the resource group name of your AKS cluster, then use the az vmss list command to get the name of your scale set.

CLUSTER_RESOURCE_GROUP=$(az aks show --resource-group myResourceGroup --name myAKSCluster --query nodeResourceGroup -o tsv)
SCALE_SET_NAME=$(az vmss list --resource-group "$CLUSTER_RESOURCE_GROUP" --query '[0].name' -o tsv)

The above example assigns the name of the cluster resource group for myAKSCluster in myResourceGroup to CLUSTER_RESOURCE_GROUP. The example then uses CLUSTER_RESOURCE_GROUP to list the scale set name and assign it to SCALE_SET_NAME.
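
Either variable may end up empty, for example when the cluster name is mistyped or the cluster has no scale set. The sketch below shows one way to stop early in that case. The helper name require_value is hypothetical and is not part of the Azure CLI.

```shell
# Hypothetical helper: fails with a message when a value is empty.
require_value() {
    # $1: label used in the error message
    # $2: the value to check
    if [ -z "$2" ]; then
        echo "error: $1 is empty" >&2
        return 1
    fi
}
```

A possible use is require_value CLUSTER_RESOURCE_GROUP "$CLUSTER_RESOURCE_GROUP" && require_value SCALE_SET_NAME "$SCALE_SET_NAME" before running any command that depends on these variables.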

Note

SSH keys can currently only be added to Linux nodes using the Azure CLI.

To add your SSH keys to the nodes in a virtual machine scale set, use the az vmss extension set and az vmss update-instances commands.

az vmss extension set  \
    --resource-group $CLUSTER_RESOURCE_GROUP \
    --vmss-name $SCALE_SET_NAME \
    --name VMAccessForLinux \
    --publisher Microsoft.OSTCExtensions \
    --version 1.4 \
    --protected-settings "{\"username\":\"azureuser\", \"ssh_key\":\"$(cat ~/.ssh/id_rsa.pub)\"}"

az vmss update-instances --instance-ids '*' \
    --resource-group $CLUSTER_RESOURCE_GROUP \
    --name $SCALE_SET_NAME

The above example uses the CLUSTER_RESOURCE_GROUP and SCALE_SET_NAME variables from the previous commands. It also uses ~/.ssh/id_rsa.pub as the location of your SSH public key.
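
The escaped JSON passed to --protected-settings is easy to mistype. As a sketch, the string can be built by a small function instead. The name build_protected_settings is hypothetical, and the function assumes that neither the username nor the key contains double quotes or backslashes, which should hold for a typical OpenSSH public key.

```shell
# Hypothetical helper: prints the JSON expected by --protected-settings.
build_protected_settings() {
    # $1: username on the node
    # $2: path to the SSH public key file
    if [ ! -r "$2" ]; then
        echo "error: cannot read public key file: $2" >&2
        return 1
    fi
    # Command substitution strips the trailing newline from the key file.
    printf '{"username":"%s", "ssh_key":"%s"}' "$1" "$(cat "$2")"
}
```

The result could then be passed as --protected-settings "$(build_protected_settings azureuser ~/.ssh/id_rsa.pub)".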

Note

By default, the username for the AKS nodes is azureuser.

After you add your SSH public key to the scale set, you can SSH into a node virtual machine in that scale set using its IP address. View the private IP addresses of the AKS cluster nodes using the kubectl get command.

kubectl get nodes -o wide

The following example output shows the internal IP addresses of all the nodes in the cluster.

$ kubectl get nodes -o wide

NAME                                STATUS   ROLES   AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                    KERNEL-VERSION      CONTAINER-RUNTIME
aks-nodepool1-42485177-vmss000000   Ready    agent   18h   v1.12.7   10.240.0.4    <none>        Ubuntu 16.04.6 LTS          4.15.0-1040-azure   docker://3.0.4

Record the internal IP address of the node you wish to troubleshoot.
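
On a cluster with many nodes, it can help to reduce that output to node names and internal IP addresses. The sketch below does this with awk. The function name node_internal_ips is hypothetical, and the sketch assumes the header row is the first line of input and that no column before INTERNAL-IP contains spaces.

```shell
# Hypothetical helper: reads `kubectl get nodes -o wide` output on stdin
# and prints "<node name> <internal IP>" for each node.
node_internal_ips() {
    awk '
        # Locate the INTERNAL-IP column in the header row.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "INTERNAL-IP") col = i; next }
        # Print the name and internal IP for every non-empty data row.
        col && NF { print $1, $col }
    '
}
```

A possible use is kubectl get nodes -o wide | node_internal_ips.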

To access your node using SSH, follow the steps in Create the SSH connection.

Configure virtual machine availability set-based AKS clusters for SSH access

To configure your virtual machine availability set-based AKS cluster for SSH access, find the name of your cluster's Linux node, and add your SSH public key to that node.

Use the az aks show command to get the resource group name of your AKS cluster, then use the az vm list command to list the virtual machine name of your cluster's Linux node.

CLUSTER_RESOURCE_GROUP=$(az aks show --resource-group myResourceGroup --name myAKSCluster --query nodeResourceGroup -o tsv)
az vm list --resource-group $CLUSTER_RESOURCE_GROUP -o table

The above example assigns the name of the cluster resource group for myAKSCluster in myResourceGroup to CLUSTER_RESOURCE_GROUP. The example then uses CLUSTER_RESOURCE_GROUP to list the virtual machine name. The example output shows the name of the virtual machine:

Name                      ResourceGroup                                      Location
------------------------  -------------------------------------------------  ----------
aks-nodepool1-79590246-0  MC_myResourceGroupAKS_myAKSClusterRBAC_chinaeast2  chinaeast2

To add your SSH keys to the node, use the az vm user update command.

az vm user update \
    --resource-group $CLUSTER_RESOURCE_GROUP \
    --name aks-nodepool1-79590246-0 \
    --username azureuser \
    --ssh-key-value ~/.ssh/id_rsa.pub

The above example uses the CLUSTER_RESOURCE_GROUP variable and the node virtual machine name from the previous commands. It also uses ~/.ssh/id_rsa.pub as the location of your SSH public key. You could also pass the contents of your SSH public key instead of specifying a path.

Note

By default, the username for the AKS nodes is azureuser.

After you add your SSH public key to the node virtual machine, you can SSH into that virtual machine using its IP address. View the private IP address of an AKS cluster node using the az vm list-ip-addresses command.

az vm list-ip-addresses --resource-group $CLUSTER_RESOURCE_GROUP -o table

The above example uses the CLUSTER_RESOURCE_GROUP variable set in the previous commands. The following example output shows the private IP addresses of the AKS nodes:

VirtualMachine            PrivateIPAddresses
------------------------  --------------------
aks-nodepool1-79590246-0  10.240.0.4

Create the SSH connection

To create an SSH connection to an AKS node, you run a helper pod in your AKS cluster. This helper pod gives you a shell inside the cluster, from which you can then SSH to the nodes. To create and use this helper pod, complete the following steps:

  1. Run a debian container image and attach a terminal session to it. This container can be used to create an SSH session with any node in the AKS cluster:

    kubectl run -it --rm aks-ssh --image=debian
    
  2. Once the terminal session is connected to the container, install an SSH client using apt-get:

    apt-get update && apt-get install openssh-client -y
    
  3. Open a new terminal window, not connected to your container, and list the pods on your AKS cluster using the kubectl get pods command. The name of the pod created in the previous step starts with aks-ssh, as shown in the following example:

    $ kubectl get pods
    
    NAME                       READY     STATUS    RESTARTS   AGE
    aks-ssh-554b746bcf-kbwvf   1/1       Running   0          1m
    
  4. In an earlier step, you added your public SSH key to the AKS node you wanted to troubleshoot. Now, copy your private SSH key into the helper pod. This private key is used to create the SSH connection to the AKS node.

    Provide your own aks-ssh pod name obtained in the previous step. If needed, change ~/.ssh/id_rsa to the location of your private SSH key:

    kubectl cp ~/.ssh/id_rsa aks-ssh-554b746bcf-kbwvf:/id_rsa
    
  5. Return to the terminal session in your container and update the permissions on the copied id_rsa private SSH key so that only the owner can read and write it:

    chmod 0600 id_rsa
    
  6. Create an SSH connection to your AKS node. Again, the default username for AKS nodes is azureuser. Accept the prompt to continue connecting, because this is the first time the node's host key is being trusted. You are then provided with the bash prompt of your AKS node:

    $ ssh -i id_rsa azureuser@10.240.0.4
    
    ECDSA key fingerprint is SHA256:A6rnRkfpG21TaZ8XmQCCgdi9G/MYIMc+gFAuY9RUY70.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added '10.240.0.4' (ECDSA) to the list of known hosts.
    
    Welcome to Ubuntu 16.04.5 LTS (GNU/Linux 4.15.0-1018-azure x86_64)
    
     * Documentation:  https://help.ubuntu.com
     * Management:     https://landscape.canonical.com
     * Support:        https://ubuntu.com/advantage
    
      Get cloud support with Ubuntu Advantage Cloud Guest:
        https://www.ubuntu.com/business/services/cloud
    
    [...]
    
    azureuser@aks-nodepool1-79590246-0:~$
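
OpenSSH generally refuses to use a private key file that other users can access, which is why the steps above restrict the key's permissions before connecting. As a sketch, the check below confirms the mode before you run ssh. The function name key_mode_is_private is hypothetical, and the sketch assumes GNU stat (the -c option), as found in the debian image.

```shell
# Hypothetical helper: succeeds when the file's permission bits are
# exactly 600 or 400, so only the owner can access the key.
key_mode_is_private() {
    # $1: path to the private key file
    mode=$(stat -c '%a' "$1") || return 1
    [ "$mode" = "600" ] || [ "$mode" = "400" ]
}
```

Inside the helper pod, a possible use is key_mode_is_private id_rsa || chmod 0600 id_rsa.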
    

Remove SSH access

When done, run exit to leave the SSH session, and then run exit again to leave the interactive container session. When this container session closes, the pod used for SSH access from the AKS cluster is deleted.

Next steps

If you need additional troubleshooting data, you can view the kubelet logs.