Apply security updates to Azure Kubernetes Service (AKS) nodes automatically using GitHub Actions

Security updates are a key part of maintaining your AKS cluster's security and compliance with the latest fixes for the underlying OS. These updates include OS security fixes or kernel updates. Some updates require a node reboot to complete the process.

Running az aks upgrade gives you a zero-downtime way to apply updates. The command handles applying the latest updates to all your cluster's nodes, cordoning and draining the nodes, restarting them, and then allowing traffic to the updated nodes. If you update your nodes using a different method, AKS will not automatically restart your nodes.

Note

The main difference between running az aks upgrade with and without the --node-image-only flag is that, with the flag, only the node images are upgraded. If the flag is omitted, both the node images and the Kubernetes control plane version are upgraded. For more in-depth information, see the docs for managed upgrades on nodes and the docs for cluster upgrades.

All Kubernetes nodes run in a standard Azure virtual machine (VM). These VMs can be Windows or Linux-based. The Linux-based VMs use an Ubuntu image, with the OS configured to automatically check for updates every night.

When you use the az aks upgrade command, Azure CLI creates a surge of new nodes with the latest security and kernel updates. These nodes are initially cordoned to prevent any apps from being scheduled to them until the update is finished. After completion, Azure cordons the older nodes (makes them unavailable for scheduling of new workloads), drains them (moves the existing workloads to other nodes), and uncordons the new ones, effectively transferring all the scheduled applications to the new nodes.

This process is better than updating Linux-based kernels manually because Linux requires a reboot when a new kernel update is installed. If you update the OS manually, you also need to reboot the VM and manually cordon and drain all the apps.

This article shows you how to automate the update process of AKS nodes. You'll use GitHub Actions and Azure CLI to create an update task that runs automatically on a cron schedule.

Before you begin

This article assumes that you have an existing AKS cluster. If you need an AKS cluster, see the AKS quickstart using the Azure CLI or using the Azure portal.

You also need the Azure CLI version 2.0.59 or later installed and configured. Run az --version to find the version. If you need to install or upgrade, see Install Azure CLI.

This article also assumes you have a GitHub account to create your actions in.

Create a timed GitHub Action

cron is a utility that allows you to run a set of commands, or job, on an automated schedule. To create a job that updates your AKS nodes on an automated schedule, you'll need a repository to host your actions. Usually, GitHub actions are configured in the same repository as your application, but you can use any repository. This article uses your profile repository. If you don't have one, create a new repository with the same name as your GitHub username.

  1. Navigate to your repository on GitHub.

  2. Click the Actions tab at the top of the page.

  3. If you already set up a workflow in this repository, you'll be directed to the list of completed runs; in this case, click the New Workflow button. If this is your first workflow in the repository, GitHub will present you with some project templates; click the Set up a workflow yourself link below the description text.

  4. Change the workflow name and on tags similar to the below. GitHub Actions use the same POSIX cron syntax as any Linux-based system. In this schedule, we're telling the workflow to run at 3 AM UTC on days 1, 16, and 31 of each month, which is roughly every 15 days.

    name: Upgrade cluster node images
    on:
      schedule:
        - cron: '0 3 */15 * *'
    
  5. Create a new job using the below. This job is named upgrade-node, runs on an Ubuntu agent, and will connect to your Azure CLI account to execute the steps needed to upgrade the nodes.

    name: Upgrade cluster node images
    
    on:
      schedule:
        - cron: '0 3 */15 * *'
    
    jobs:
      upgrade-node:
        runs-on: ubuntu-latest
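
To make the schedule easier to adapt, the fragment below is a minimal annotated sketch of the same on block. The field names follow standard cron syntax, and GitHub Actions evaluates schedules in UTC. The commented-out second entry is only an illustrative alternative and is not part of the original workflow.

```yaml
on:
  schedule:
    # Fields: minute  hour  day-of-month  month  day-of-week (UTC)
    # '*/15' in the day-of-month field steps through the month from day 1,
    # so this entry fires at 03:00 UTC on days 1, 16, and 31.
    - cron: '0 3 */15 * *'
    # Illustrative alternative (assumption, not from the original workflow):
    # fire at 03:00 UTC on the 1st and 15th of every month.
    # - cron: '0 3 1,15 * *'
```

Because the step restarts at the beginning of every month, the gap between the last run of one month and the first run of the next is shorter than 15 days.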
    

Set up the Azure CLI in the workflow

In the steps key, you'll define all the work the workflow will execute to upgrade the nodes.

Download and sign in to the Azure CLI.

  1. On the right-hand side of the GitHub Actions screen, find the marketplace search bar and type "Azure Login".

  2. The results include an action called Azure Login, published by Azure:

    Search results showing two rows; the first action is called Azure Login and the second is called Azure Container Registry Login

  3. Click Azure Login. On the next screen, click the copy icon in the top right of the code sample.

    Azure Login action result pane with a code sample below; a red square around the copy icon highlights where to click

  4. Paste the following under the steps key:

    name: Upgrade cluster node images
    
    on:
      schedule:
        - cron: '0 3 */15 * *'
    
    jobs:
      upgrade-node:
        runs-on: ubuntu-latest
    
        steps:
          - name: Azure Login
            uses: Azure/login@v1.1
            with:
              creds: ${{ secrets.AZURE_CREDENTIALS }}
    
  5. From the Azure CLI, run the following command to create a service principal, which generates a new username and password.

    az ad sp create-for-rbac -o json
    

    The output should be similar to the following JSON:

    {
      "appId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
      "displayName": "azure-cli-xxxx-xx-xx-xx-xx-xx",
      "name": "http://azure-cli-xxxx-xx-xx-xx-xx-xx",
      "password": "xXxXxXxXx",
      "tenant": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    }
    
  6. In a new browser window, navigate to your GitHub repository and open the Settings tab of the repository. Click Secrets, then click New Repository Secret.

  7. For Name, use AZURE_CREDENTIALS.

  8. For Value, add the entire contents of the output from the earlier step where you created a new username and password.

    Form showing AZURE_CREDENTIALS as the secret title, with the output of the executed command pasted as JSON

  9. Click Add Secret.

The CLI used by your action will be logged in to your Azure account and ready to run commands.

To create the steps to execute Azure CLI commands:

  1. Navigate to the search page on GitHub marketplace on the right-hand side of the screen and search for Azure CLI Action. Choose Azure CLI Action by Azure.

    Search results for Azure CLI Action, with the first result shown as published by Azure

  2. Click the copy button on the GitHub marketplace result and paste the contents of the action in the main editor, below the Azure Login step, similar to the following:

    name: Upgrade cluster node images
    
    on:
      schedule:
        - cron: '0 3 */15 * *'
    
    jobs:
      upgrade-node:
        runs-on: ubuntu-latest
    
        steps:
          - name: Azure Login
            uses: Azure/login@v1.1
            with:
              creds: ${{ secrets.AZURE_CREDENTIALS }}
              environment: 'AzureChinaCloud'
          - name: Upgrade node images
            uses: Azure/cli@v1.0.0
            with:
              inlineScript: az aks upgrade -g {resourceGroupName} -n {aksClusterName} --node-image-only --yes
    

    Note

    You can use this Azure Login action with the Azure China cloud and Azure Stack Hub by setting the environment parameter.

    • Azure China Cloud: environment: 'AzureChinaCloud'

    Tip

    You can decouple the -g and -n parameters from the command by adding them to secrets, similar to the previous steps. Replace the {resourceGroupName} and {aksClusterName} placeholders with their secret counterparts, for example ${{secrets.RESOURCE_GROUP_NAME}} and ${{secrets.AKS_CLUSTER_NAME}}.

  3. Rename the file to upgrade-node-images.

  4. Click Start Commit, add a message title, and save the workflow.

Once you create the commit, the workflow will be saved and ready for execution.
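
As the tip in step 2 suggests, the resource group and cluster names can be read from repository secrets instead of being written into the command. The step below is a sketch of that variant. RESOURCE_GROUP_NAME and AKS_CLUSTER_NAME are the example secret names from the tip, and the sketch assumes they were created in the repository in the same way as AZURE_CREDENTIALS.

```yaml
steps:
  - name: Upgrade node images
    uses: Azure/cli@v1.0.0
    with:
      # Secret names are examples; create them under Settings > Secrets first.
      inlineScript: az aks upgrade -g ${{ secrets.RESOURCE_GROUP_NAME }} -n ${{ secrets.AKS_CLUSTER_NAME }} --node-image-only --yes
```

Keeping the names in secrets lets the same workflow file be reused across repositories that target different clusters.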

Note

To upgrade a single node pool instead of all node pools on the cluster, use the az aks nodepool upgrade command and add the --name parameter to specify the node pool name. For example:

az aks nodepool upgrade -g {resourceGroupName} --cluster-name {aksClusterName} --name {nodePoolName} --node-image-only
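
In the workflow, this command would take the place of the az aks upgrade call in the Upgrade node images step. The fragment below is a sketch under that assumption. The step name is hypothetical, and the placeholders in braces must be replaced with real values or secrets before committing.

```yaml
steps:
  - name: Upgrade a single node pool   # hypothetical step name
    uses: Azure/cli@v1.0.0
    with:
      # Replace the {placeholders} with your own values or secrets.
      inlineScript: az aks nodepool upgrade -g {resourceGroupName} --cluster-name {aksClusterName} --name {nodePoolName} --node-image-only
```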

Run the GitHub Action manually

You can run the workflow manually, in addition to the scheduled run, by adding a new on trigger called workflow_dispatch. Once the file is committed to the repository's default branch, the workflow's page under the Actions tab offers a Run workflow button. The finished file should look like the YAML below:

name: Upgrade cluster node images

on:
  schedule:
    - cron: '0 3 */15 * *'
  workflow_dispatch:

jobs:
  upgrade-node:
    runs-on: ubuntu-latest

    steps:
      - name: Azure Login
        uses: Azure/login@v1.1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
          environment: 'AzureChinaCloud'

      # Code for upgrading one or more node pools

Next steps