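Control the order of resource placement across clusters

Applies to: ✔️ Fleet Manager with hub cluster

Azure Kubernetes Fleet Manager placement staged update runs provide a controlled way to deploy Kubernetes workloads across multiple member clusters through a stage-by-stage process. To minimize risk, this approach deploys to target clusters sequentially, with optional wait times and approval gates between stages.

This article shows you how to create and execute staged update runs to roll out workloads progressively and, when needed, roll back to a previous version.

Azure Kubernetes Fleet Manager supports two scopes for staged updates:

  • Cluster-scoped: Fleet administrators use ClusterStagedUpdateRun with ClusterResourcePlacement to manage infrastructure-level changes.
  • Namespace-scoped (preview): Application teams use StagedUpdateRun with ResourcePlacement to manage rollouts within their own namespaces.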

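The examples in this article demonstrate both approaches. The cluster-scoped walkthrough comes first, followed by the namespace-scoped walkthrough. Follow the one that matches your deployment scope.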

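Important

ResourcePlacement uses the placement.kubernetes-fleet.io/v1beta1 API version and is currently in preview. Some features demonstrated in this article, such as StagedUpdateStrategy, are also part of the v1beta1 API and aren't available in the v1 API.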

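Before you begin

Prerequisites

  • You need an Azure account with an active subscription. Create an account.

  • To understand the concepts and terminology used in this article, read the conceptual overview of staged rollout strategies.

  • You need Azure CLI version 2.58.0 or later to complete this article. To install or upgrade, see Install the Azure CLI.

  • If you don't have the Kubernetes CLI (kubectl) yet, you can install it with the following command:

    az aks install-cli

  • You need the fleet Azure CLI extension. You can install it by running the following command:

    az extension add --name fleet

    Run the az extension update command to update to the latest version of the extension:

    az extension update --name fleet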
    

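Configure the demo environment

This demo runs on a Fleet Manager with a hub cluster and three member clusters. If you don't have one, follow the quickstart to create a Fleet Manager with a hub cluster. Then join Azure Kubernetes Service (AKS) clusters as members.

This tutorial demonstrates staged update runs in a demo fleet environment with three member clusters that have the following labels:

Cluster name   Labels
member1        environment=canary, order=2
member2        environment=staging
member3        environment=canary, order=1

These labels let us create stages that group clusters by environment and control the deployment order within each stage.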
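
If your member clusters don't carry these labels yet, one way to add them is to label the MemberCluster objects on the hub cluster. The following is a minimal sketch, assuming that your kubectl context points at the hub cluster and that the member clusters are registered as member1, member2, and member3. The helper name label_demo_members is hypothetical.

```shell
# Hypothetical helper: applies the demo labels to the three MemberCluster
# objects on the hub cluster. Assumes the current kubectl context is the hub
# and that the member names match the table above.
label_demo_members() {
  kubectl label membercluster member1 environment=canary order=2 --overwrite &&
  kubectl label membercluster member2 environment=staging --overwrite &&
  kubectl label membercluster member3 environment=canary order=1 --overwrite
}
```

After calling label_demo_members, running kubectl get membercluster --show-labels should let you confirm the result.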

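Prepare Kubernetes workloads for placement

Publish Kubernetes workloads to the hub cluster so that they can be placed on member clusters.

Create a namespace and a ConfigMap for the workload on the hub cluster: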

kubectl create namespace test-namespace
kubectl create configmap test-cm --from-literal=key=value1 -n test-namespace

To deploy the resources, create a ClusterResourcePlacement:

Note

Setting spec.strategy.type to External allows the rollout to be triggered by a ClusterStagedUpdateRun.

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
  name: example-placement
spec:
  resourceSelectors:
    - group: ""
      kind: Namespace
      name: test-namespace
      version: v1
  policy:
    placementType: PickAll
  strategy:
    type: External

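All three clusters should be scheduled because we use the PickAll policy, but no resources should be deployed on the member clusters yet because we haven't created a ClusterStagedUpdateRun.

Verify that the placement is scheduled:

kubectl get clusterresourceplacement example-placement

Your output should look similar to the following example: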

NAME                GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
example-placement   1     True        1                                           51s

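Work with resource snapshots

Fleet Manager creates a resource snapshot when resources change. Each snapshot has a unique index that you can use to reference a specific version of the resources.

Tip

For more information about resource snapshots and how they work, see Understand resource snapshots.

Check the current resource snapshots

To check the current resource snapshots, run:

kubectl get clusterresourcesnapshots --show-labels

Your output should look similar to the following example: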

NAME                           GEN   AGE   LABELS
example-placement-0-snapshot   1     60s   kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0

We only have one version of the snapshot. It's the latest one (kubernetes-fleet.io/is-latest-snapshot=true) and has resource index 0 (kubernetes-fleet.io/resource-index=0).
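
Because a staged update run references a snapshot by its index, it can be handy to read the latest index from the labels rather than copying it by hand. The following is a minimal sketch, assuming the label names shown in the output above. The helper name latest_snapshot_index is hypothetical.

```shell
# Hypothetical helper: prints the resource index of the latest snapshot for a
# placement. It filters on the labels shown in the output above and extracts
# the numeric value of the kubernetes-fleet.io/resource-index label.
latest_snapshot_index() {
  placement="$1"
  kubectl get clusterresourcesnapshots --show-labels --no-headers \
    -l "kubernetes-fleet.io/parent-CRP=${placement},kubernetes-fleet.io/is-latest-snapshot=true" |
    sed -n 's/.*kubernetes-fleet\.io\/resource-index=\([0-9][0-9]*\).*/\1/p'
}
```

For example, latest_snapshot_index example-placement reads the index for the placement used in this article.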

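Create a new resource snapshot

Now modify the ConfigMap with a new value:

kubectl edit configmap test-cm -n test-namespace

Update the value from value1 to value2, then confirm the change:

kubectl get configmap test-cm -n test-namespace -o yaml

Your output should look similar to the following example: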

apiVersion: v1
data:
  key: value2 # value updated here, old value: value1
kind: ConfigMap
metadata:
  creationTimestamp: ...
  name: test-cm
  namespace: test-namespace
  resourceVersion: ...
  uid: ...

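You should now see two versions of resource snapshots, with indexes 0 and 1. Index 1 is the latest:

kubectl get clusterresourcesnapshots --show-labels

Your output should look similar to the following example: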

NAME                           GEN   AGE    LABELS
example-placement-0-snapshot   1     2m6s   kubernetes-fleet.io/is-latest-snapshot=false,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0
example-placement-1-snapshot   1     10s    kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=1

The latest label is set on example-placement-1-snapshot, which contains the latest ConfigMap data:

kubectl get clusterresourcesnapshots example-placement-1-snapshot -o yaml

Your output should look similar to the following example:

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceSnapshot
metadata:
  annotations:
    kubernetes-fleet.io/number-of-enveloped-object: "0"
    kubernetes-fleet.io/number-of-resource-snapshots: "1"
    kubernetes-fleet.io/resource-hash: 10dd7a3d1e5f9849afe956cfbac080a60671ad771e9bda7dd34415f867c75648
  creationTimestamp: "2025-07-22T21:26:54Z"
  generation: 1
  labels:
    kubernetes-fleet.io/is-latest-snapshot: "true"
    kubernetes-fleet.io/parent-CRP: example-placement
    kubernetes-fleet.io/resource-index: "1"
  name: example-placement-1-snapshot
  ownerReferences:
  - apiVersion: placement.kubernetes-fleet.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ClusterResourcePlacement
    name: example-placement
    uid: e7d59513-b3b6-4904-864a-c70678fd6f65
  resourceVersion: "19994"
  uid: 79ca0bdc-0b0a-4c40-b136-7f701e85cdb6
spec:
  selectedResources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      labels:
        kubernetes.io/metadata.name: test-namespace
      name: test-namespace
    spec:
      finalizers:
      - kubernetes
  - apiVersion: v1
    data:
      key: value2 # latest value: value2, old value: value1
    kind: ConfigMap
    metadata:
      name: test-cm
      namespace: test-namespace

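Create a staged update strategy

A ClusterStagedUpdateStrategy defines the orchestration pattern that groups clusters into stages and specifies the rollout sequence. It selects member clusters by labels. For our demo, we create a strategy with two stages, staging and canary: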

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateStrategy
metadata:
  name: example-strategy
spec:
  stages:
    - name: staging
      labelSelector:
        matchLabels:
          environment: staging
      afterStageTasks:
        - type: TimedWait
          waitTime: 1m
      maxConcurrency: 1
    - name: canary
      labelSelector:
        matchLabels:
          environment: canary
      sortingLabelKey: order
      beforeStageTasks:
        - type: Approval
      maxConcurrency: 50%
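
To make the effect of sortingLabelKey concrete, the following sketch reproduces the expected canary ordering locally. It assumes that clusters in a stage are updated in ascending order of the integer value of the order label, with ties broken by cluster name. That rule is an assumption here, not something this article states. The helper name canary_update_order is hypothetical, and nothing in the sketch talks to a cluster.

```shell
# Illustrative only: reproduces the assumed canary-stage ordering locally.
# Input lines are "<cluster> <order label value>". Output is the cluster names
# sorted by ascending numeric order value, with ties broken by cluster name.
canary_update_order() {
  sort -k2,2n -k1,1 | awk '{ print $1 }'
}

# The two canary clusters from the demo environment and their order labels.
printf 'member1 2\nmember3 1\n' | canary_update_order
```

With the demo labels, this prints member3 and then member1. Since the canary stage holds two clusters, a maxConcurrency of 50% works out to one cluster at a time.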

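Prepare a staged update run to roll out changes

A ClusterStagedUpdateRun executes the rollout of a ClusterResourcePlacement following a ClusterStagedUpdateStrategy. To trigger a staged update run for our ClusterResourcePlacement (CRP), we create a ClusterStagedUpdateRun that specifies the CRP name, the updateRun strategy name, the latest resource snapshot index ("1"), and the state "Initialize":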

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  name: example-run
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
  state: Initialize

The staged update run is initialized but not running:

kubectl get clusterstagedupdaterun example-run

Your output should look similar to the following example:

NAME          PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run   example-placement   1                         0                       True                      7s

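Start the staged update run

To start executing the staged update run, patch the state field in the spec to Run:

kubectl patch clusterstagedupdaterun example-run --type merge -p '{"spec":{"state":"Run"}}'

Note

You can also create an update run with the state field initially set to Run, which initializes and starts the update run in a single step.

The staged update run is initialized and running:

kubectl get clusterstagedupdaterun example-run

Your output should look similar to the following example: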

NAME          PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run   example-placement   1                         0                       True                      7s

After the one-minute TimedWait elapses, look at the status in more detail:

kubectl get clusterstagedupdaterun example-run -o yaml

Your output should look similar to the following example:

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  ...
  name: example-run
  ...
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
  state: Run
status:
  conditions:
  - lastTransitionTime: "2025-07-22T21:28:08Z"
    message: ClusterStagedUpdateRun initialized successfully
    observedGeneration: 2
    reason: UpdateRunInitializedSuccessfully
    status: "True" # the updateRun is initialized successfully
    type: Initialized
  - lastTransitionTime: "2025-07-22T21:29:53Z"
    message: The updateRun is waiting for after-stage tasks in stage canary to complete
    observedGeneration: 2
    reason: UpdateRunWaiting
    status: "False" # the updateRun is still progressing and waiting for approval
    type: Progressing
  deletionStageStatus:
    clusters: [] # no clusters need to be cleaned up
    stageName: kubernetes-fleet.io/deleteStage
  policyObservedClusterCount: 3 # number of clusters to be updated
  policySnapshotIndexUsed: "0"
  resourceSnapshotIndexUsed: "1"
  stagedUpdateStrategySnapshot: # snapshot of the strategy used for this update run
    stages:
    - afterStageTasks:
      - type: TimedWait
        waitTime: 1m0s
      labelSelector:
        matchLabels:
          environment: staging
      maxConcurrency: 1
      name: staging
    - beforeStageTasks:
      - type: Approval
      labelSelector:
        matchLabels:
          environment: canary
      maxConcurrency: 50%
      name: canary
      sortingLabelKey: order
  stagesStatus: # detailed status for each stage
  - afterStageTaskStatus:
    - conditions:
      - lastTransitionTime: "2025-07-22T21:29:23Z"
        message: Wait time elapsed
        observedGeneration: 2
        reason: StageTaskWaitTimeElapsed
        status: "True" # the wait after-stage task has completed
        type: WaitTimeElapsed
      type: TimedWait
    clusters:
    - clusterName: member2 # stage staging contains member2 cluster only
      conditions:
      - lastTransitionTime: "2025-07-22T21:28:08Z"
        message: Cluster update started
        observedGeneration: 2
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-07-22T21:28:23Z"
        message: Cluster update completed successfully
        observedGeneration: 2
        reason: ClusterUpdatingSucceeded
        status: "True" # member2 is updated successfully
        type: Succeeded
    conditions:
    - lastTransitionTime: "2025-07-22T21:28:23Z"
      message: All clusters in the stage are updated and after-stage tasks are completed
      observedGeneration: 2
      reason: StageUpdatingSucceeded
      status: "False"
      type: Progressing
    - lastTransitionTime: "2025-07-22T21:29:23Z"
      message: Stage update completed successfully
      observedGeneration: 2
      reason: StageUpdatingSucceeded
      status: "True" # stage staging has completed successfully
      type: Succeeded
    endTime: "2025-07-22T21:29:23Z"
    stageName: staging
    startTime: "2025-07-22T21:28:08Z"
  - beforeStageTaskStatus:
    - approvalRequestName: example-run-before-canary # ClusterApprovalRequest name for this stage for before stage task
      conditions:
      - lastTransitionTime: "2025-07-22T21:29:53Z"
        message: ClusterApprovalRequest is created
        observedGeneration: 2
        reason: StageTaskApprovalRequestCreated
        status: "True"
        type: ApprovalRequestCreated
      type: Approval
    conditions:
    - lastTransitionTime: "2025-07-22T21:29:53Z"
      message: Not all before-stage tasks are completed, waiting for approval
      observedGeneration: 2
      reason: StageUpdatingWaiting
      status: "False" # stage canary is waiting for approval task completion
      type: Progressing
    stageName: canary
    startTime: "2025-07-22T21:29:23Z"

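We can see that the TimedWait period for the staging stage has elapsed, and that the ClusterApprovalRequest object for the approval task in the canary stage has been created. We can check the generated ClusterApprovalRequest and confirm that nobody has approved it yet:

kubectl get clusterapprovalrequest

Your output should look similar to the following example: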

NAME                        UPDATE-RUN    STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-before-canary   example-run   canary                                 2m39s

Approve the staged update run

You can approve the ClusterApprovalRequest by creating a JSON patch file and applying it:

cat << EOF > approval.json
{
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
                "message": "lgtm",
                "observedGeneration": 1,
                "reason": "testPassed",
                "status": "True",
                "type": "Approved"
            }
        ]
    }
}
EOF

Submit a patch request to approve, using the JSON file you created:

kubectl patch clusterapprovalrequests example-run-before-canary --type='merge' --subresource=status --patch-file approval.json
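
If you'd rather not keep a patch file around, the same status patch can be built inline. The following is a minimal sketch, assuming the same condition fields as the approval.json file. The helper name approve_cluster_request is hypothetical.

```shell
# Hypothetical helper: approves a ClusterApprovalRequest by patching its
# status subresource with an "Approved" condition built inline. The condition
# fields mirror the ones written to approval.json.
approve_cluster_request() {
  request="$1"
  now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  patch=$(printf '{"status":{"conditions":[{"lastTransitionTime":"%s","message":"lgtm","observedGeneration":1,"reason":"testPassed","status":"True","type":"Approved"}]}}' "$now")
  kubectl patch clusterapprovalrequests "$request" --type=merge --subresource=status -p "$patch"
}
```

For example, approve_cluster_request example-run-before-canary targets the request shown in this article.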

Then verify that the request has been approved:

kubectl get clusterapprovalrequest

Your output should look similar to the following example:

NAME                        UPDATE-RUN    STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-before-canary   example-run   canary   True       True               3m35s

The ClusterStagedUpdateRun can now proceed and complete:

kubectl get clusterstagedupdaterun example-run

Your output should look similar to the following example:

NAME          PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run   example-placement   1                         0                       True          True        5m28s
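
Rather than re-running kubectl get until the SUCCEEDED column shows True, you can block until the run finishes. The following is a minimal sketch, assuming the update run publishes a status condition of type Succeeded, as the SUCCEEDED column suggests. The helper name wait_for_run and the 10m default timeout are illustrative choices.

```shell
# Hypothetical helper: blocks until the named ClusterStagedUpdateRun reports a
# Succeeded condition, or until the timeout expires (default: 10m).
# Assumes the run exposes a status condition of type "Succeeded".
wait_for_run() {
  run="$1"
  timeout="${2:-10m}"
  kubectl wait --for=condition=Succeeded "clusterstagedupdaterun/${run}" --timeout="${timeout}"
}
```

Keep in mind that a run waiting on an approval gate won't succeed until someone approves the request, so choose the timeout accordingly.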

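Verify the rollout is complete

The ClusterResourcePlacement also shows that the rollout is complete and the resources are available on all member clusters:

kubectl get clusterresourceplacement example-placement

Your output should look similar to the following example: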

NAME                GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
example-placement   1     True        1               True        1               8m55s

The ConfigMap test-cm should be deployed on all three member clusters, with the latest data:

apiVersion: v1
data:
  key: value2
kind: ConfigMap
metadata:
  ...
  name: test-cm
  namespace: test-namespace
  ...

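Stop a staged update run

To stop a running cluster staged update run, patch the state field in the spec to Stop. This gracefully pauses the update run, letting clusters that are mid-update finish before the rollout halts.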

kubectl patch clusterstagedupdaterun example-run --type merge -p '{"spec":{"state":"Stop"}}'

The staged update run is initialized and no longer running:

kubectl get clusterstagedupdaterun example-run

Your output should look similar to the following example:

NAME          PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run   example-placement   1                         0                       True                      7s

After the update run stops, you can look at the status in more detail:

kubectl get clusterstagedupdaterun example-run -o yaml

Your output should look similar to the following example:

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  ...
  name: example-run
  ...
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
  state: Stop
status:
  conditions:
  - lastTransitionTime: "2025-07-22T21:28:08Z"
    message: ClusterStagedUpdateRun initialized successfully
    observedGeneration: 3
    reason: UpdateRunInitializedSuccessfully
    status: "True" # the updateRun is initialized successfully
    type: Initialized
  - lastTransitionTime: "2025-07-22T21:28:23Z"
    message: The update run has been stopped
    observedGeneration: 3
    reason: UpdateRunStopped
    status: "False" # the updateRun has stopped progressing
    type: Progressing
  deletionStageStatus:
    clusters: [] # no clusters need to be cleaned up
    stageName: kubernetes-fleet.io/deleteStage
  policyObservedClusterCount: 3 # number of clusters to be updated
  policySnapshotIndexUsed: "0"
  resourceSnapshotIndexUsed: "1"
  stagedUpdateStrategySnapshot: # snapshot of the strategy used for this update run
    stages:
    - afterStageTasks:
      - type: TimedWait
        waitTime: 1m0s
      labelSelector:
        matchLabels:
          environment: staging
      maxConcurrency: 1
      name: staging
    - beforeStageTasks:
      - type: Approval
      labelSelector:
        matchLabels:
          environment: canary
      maxConcurrency: 50%
      name: canary
      sortingLabelKey: order
  stagesStatus: # detailed status for each stage
  - clusters:
    - clusterName: member2 # stage staging contains member2 cluster only
      conditions:
      - lastTransitionTime: "2025-07-22T21:28:08Z"
        message: Cluster update started
        observedGeneration: 3
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-07-22T21:28:23Z"
        message: Cluster update completed successfully
        observedGeneration: 3
        reason: ClusterUpdatingSucceeded
        status: "True" # member2 is updated successfully
        type: Succeeded
    conditions:
    - lastTransitionTime: "2025-07-22T21:28:23Z"
      message: All the updating clusters have finished updating, the stage is now stopped, waiting to be resumed
      observedGeneration: 3
      reason: StageUpdatingStopped
      status: "False"
      type: Progressing
    stageName: staging
    startTime: "2025-07-22T21:28:08Z"
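
The stopped stage reports that it's waiting to be resumed. The following is a minimal sketch of a helper that patches spec.state, assuming that setting the state back to Run resumes a stopped run. The helper name set_run_state is hypothetical, and the accepted values are limited to the three states used in this article.

```shell
# Hypothetical helper: patches spec.state on a ClusterStagedUpdateRun.
# Only the states used in this article are accepted. Whether a particular
# transition (for example, Stop back to Run) is allowed is an assumption.
set_run_state() {
  run="$1"
  state="$2"
  case "$state" in
    Initialize|Run|Stop) ;;
    *) echo "unsupported state: $state" >&2; return 1 ;;
  esac
  kubectl patch clusterstagedupdaterun "$run" --type merge \
    -p "{\"spec\":{\"state\":\"${state}\"}}"
}
```

For example, set_run_state example-run Run would ask the stopped run to continue.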

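Deploy a second staged update run to roll back to a previous version

Suppose the workload admin wants to roll back the ConfigMap change, reverting the value from value2 to value1. Instead of manually updating the ConfigMap on the hub, they can create a new ClusterStagedUpdateRun with the previous resource snapshot index ("0" in our context) and reuse the same strategy: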

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  name: example-run-2
spec:
  placementName: example-placement
  resourceSnapshotIndex: "0"
  stagedRolloutStrategyName: example-strategy
  state: Run

Let's check the new ClusterStagedUpdateRun:

kubectl get clusterstagedupdaterun

Your output should look similar to the following example:

NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True          True        13m
example-run-2   example-placement   0                         0                       True                      9s

After the one-minute TimedWait elapses, we should see the ClusterApprovalRequest object created for the new ClusterStagedUpdateRun:

kubectl get clusterapprovalrequest

Your output should look similar to the following example:

NAME                          UPDATE-RUN      STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-2-before-canary   example-run-2   canary                                 75s
example-run-before-canary     example-run     canary   True       True               14m

To approve the new ClusterApprovalRequest object, let's reuse the same approval.json file to patch it:

kubectl patch clusterapprovalrequests example-run-2-before-canary --type='merge' --subresource=status --patch-file approval.json

Verify that the new object is approved:

kubectl get clusterapprovalrequest

Your output should look similar to the following example:

NAME                          UPDATE-RUN      STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-2-before-canary   example-run-2   canary   True       True               2m7s
example-run-before-canary     example-run     canary   True       True               15m

The ConfigMap test-cm should now be deployed on all three member clusters, with the data reverted to value1:

apiVersion: v1
data:
  key: value1
kind: ConfigMap
metadata:
  ...
  name: test-cm
  namespace: test-namespace
  ...

Clean up resources

After you finish this tutorial, you can clean up the resources you created:

kubectl delete clusterstagedupdaterun example-run example-run-2
kubectl delete clusterstagedupdatestrategy example-strategy
kubectl delete clusterresourceplacement example-placement
kubectl delete namespace test-namespace

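Prepare Kubernetes workloads for placement

Publish Kubernetes workloads to the hub cluster so that they can be placed on member clusters.

Create a namespace and a ConfigMap for the workload on the hub cluster: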

kubectl create namespace test-namespace
kubectl create configmap test-cm --from-literal=key=value1 -n test-namespace

Because ResourcePlacement is namespace-scoped, first use a ClusterResourcePlacement to deploy the namespace to all member clusters, specifying NamespaceOnly as the selection scope:

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
  name: test-namespace-placement
spec:
  resourceSelectors:
    - group: ""
      kind: Namespace
      name: test-namespace
      version: v1
      selectionScope: NamespaceOnly
  policy:
    placementType: PickAll

Verify that the namespace is deployed to all member clusters:

kubectl get clusterresourceplacement test-namespace-placement

Your output should look similar to the following example:

NAME                       GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
test-namespace-placement   1     True        1               True        1               30s

To deploy the ConfigMap, create a namespace-scoped ResourcePlacement:

Note

Setting spec.strategy.type to External allows the rollout to be triggered by a StagedUpdateRun.

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ResourcePlacement
metadata:
  name: example-placement
  namespace: test-namespace
spec:
  resourceSelectors:
    - group: ""
      kind: ConfigMap
      name: test-cm
      version: v1
  policy:
    placementType: PickAll
  strategy:
    type: External

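All three clusters should be scheduled because we use the PickAll policy, but the ConfigMap shouldn't be deployed on the member clusters yet because we haven't created a StagedUpdateRun.

Verify that the placement is scheduled:

kubectl get resourceplacement example-placement -n test-namespace

Your output should look similar to the following example: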

NAME                GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
example-placement   1     True        1                                           51s

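Work with resource snapshots

Fleet Manager creates a resource snapshot when resources change. Each snapshot has a unique index that you can use to reference a specific version of the resources.

Tip

For more information about resource snapshots and how they work, see Understand resource snapshots.

Check the current resource snapshots

To check the current resource snapshots, run:

kubectl get resourcesnapshots -n test-namespace --show-labels

Your output should look similar to the following example: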

NAME                           GEN   AGE   LABELS
example-placement-0-snapshot   1     60s   kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0

We only have one version of the snapshot. It's the latest one (kubernetes-fleet.io/is-latest-snapshot=true) and has resource index 0 (kubernetes-fleet.io/resource-index=0).

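Create a new resource snapshot

Now modify the ConfigMap with a new value:

kubectl edit configmap test-cm -n test-namespace

Update the value from value1 to value2, then confirm the change:

kubectl get configmap test-cm -n test-namespace -o yaml

Your output should look similar to the following example: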

apiVersion: v1
data:
  key: value2 # value updated here, old value: value1
kind: ConfigMap
metadata:
  creationTimestamp: ...
  name: test-cm
  namespace: test-namespace
  resourceVersion: ...
  uid: ...

You should now see two versions of resource snapshots, with indexes 0 and 1:

kubectl get resourcesnapshots -n test-namespace --show-labels

Your output should look similar to the following example:

NAME                           GEN   AGE    LABELS
example-placement-0-snapshot   1     2m6s   kubernetes-fleet.io/is-latest-snapshot=false,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0
example-placement-1-snapshot   1     10s    kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=1

The latest label is set on example-placement-1-snapshot, which contains the latest ConfigMap data:

kubectl get resourcesnapshots example-placement-1-snapshot -n test-namespace -o yaml

Your output should look similar to the following example:

apiVersion: placement.kubernetes-fleet.io/v1
kind: ResourceSnapshot
metadata:
  annotations:
    kubernetes-fleet.io/number-of-enveloped-object: "0"
    kubernetes-fleet.io/number-of-resource-snapshots: "1"
    kubernetes-fleet.io/resource-hash: 10dd7a3d1e5f9849afe956cfbac080a60671ad771e9bda7dd34415f867c75648
  creationTimestamp: "2025-07-22T21:26:54Z"
  generation: 1
  labels:
    kubernetes-fleet.io/is-latest-snapshot: "true"
    kubernetes-fleet.io/parent-CRP: example-placement
    kubernetes-fleet.io/resource-index: "1"
  name: example-placement-1-snapshot
  namespace: test-namespace
  ownerReferences:
  - apiVersion: placement.kubernetes-fleet.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ResourcePlacement
    name: example-placement
    uid: e7d59513-b3b6-4904-864a-c70678fd6f65
  resourceVersion: "19994"
  uid: 79ca0bdc-0b0a-4c40-b136-7f701e85cdb6
spec:
  selectedResources:
  - apiVersion: v1
    data:
      key: value2 # latest value: value2, old value: value1
    kind: ConfigMap
    metadata:
      name: test-cm
      namespace: test-namespace

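Create a staged update strategy

A StagedUpdateStrategy defines the orchestration pattern that groups clusters into stages and specifies the rollout sequence. It selects member clusters by labels. For our demo, we create a strategy with two stages, staging and canary: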

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateStrategy
metadata:
  name: example-strategy
  namespace: test-namespace
spec:
  stages:
    - name: staging
      labelSelector:
        matchLabels:
          environment: staging
      afterStageTasks:
        - type: TimedWait
          waitTime: 1m
      maxConcurrency: 1
    - name: canary
      labelSelector:
        matchLabels:
          environment: canary
      sortingLabelKey: order
      beforeStageTasks:
        - type: Approval
      maxConcurrency: 50%

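Prepare a staged update run to roll out changes

A StagedUpdateRun executes the rollout of a ResourcePlacement following a StagedUpdateStrategy. To trigger a staged update run for our ResourcePlacement (RP), we create a StagedUpdateRun that specifies the RP name, the updateRun strategy name, the latest resource snapshot index ("1"), and the state "Initialize":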

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateRun
metadata:
  name: example-run
  namespace: test-namespace
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
  state: Initialize

The staged update run is initialized but not running:

kubectl get stagedupdaterun example-run -n test-namespace

Your output should look similar to the following example:

NAME          PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run   example-placement   1                         0                       True                      7s

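Start the staged update run

To start executing the staged update run, patch the state field in the spec to Run:

kubectl patch stagedupdaterun example-run -n test-namespace --type merge -p '{"spec":{"state":"Run"}}'

Note

You can also create an update run with the state field initially set to Run, which initializes and starts the update run in a single step.

The staged update run is initialized and running:

kubectl get stagedupdaterun example-run -n test-namespace

Your output should look similar to the following example: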

NAME          PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run   example-placement   1                         0                       True                      7s

After the one-minute TimedWait elapses, check the approval request:

kubectl get approvalrequests -n test-namespace

Your output should look similar to the following example:

NAME                        STAGED-UPDATE-RUN   STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-before-canary   example-run         canary                                 2m39s
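
Because the approval request only appears once the preceding stage and its TimedWait have finished, a script may need to poll for it. The following is a minimal sketch, assuming the request name follows the pattern shown in the output above. The helper name wait_for_approval_request, the POLL_SECONDS variable, and the default of 30 attempts are all hypothetical choices.

```shell
# Hypothetical helper: polls until the named ApprovalRequest exists in the
# given namespace. Returns 0 once it is found, or 1 after the configured
# number of attempts (default: 30, spaced POLL_SECONDS apart, default: 5).
wait_for_approval_request() {
  name="$1"
  namespace="$2"
  attempts="${3:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if kubectl get approvalrequests "$name" -n "$namespace" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "${POLL_SECONDS:-5}"
  done
  echo "approval request $name not found in namespace $namespace" >&2
  return 1
}
```

For example, wait_for_approval_request example-run-before-canary test-namespace waits for the request used in this walkthrough.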

Approve the staged update run

You can approve the ApprovalRequest by creating a JSON patch file and applying it:

cat << EOF > approval.json
{
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
                "message": "lgtm",
                "observedGeneration": 1,
                "reason": "testPassed",
                "status": "True",
                "type": "Approved"
            }
        ]
    }
}
EOF

Submit a patch request to approve, using the JSON file you created:

kubectl patch approvalrequests example-run-before-canary -n test-namespace --type='merge' --subresource=status --patch-file approval.json

Then verify that the request has been approved:

kubectl get approvalrequests -n test-namespace

Your output should look similar to the following example:

NAME                        STAGED-UPDATE-RUN   STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-before-canary   example-run         canary   True       True               3m35s

The StagedUpdateRun can now proceed and complete:

kubectl get stagedupdaterun example-run -n test-namespace

Your output should look similar to the following example:

NAME          PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run   example-placement   1                         0                       True          True        5m28s

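Verify the rollout is complete

The ResourcePlacement also shows that the rollout is complete and the resources are available on all member clusters:

kubectl get resourceplacement example-placement -n test-namespace

Your output should look similar to the following example: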

NAME                GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
example-placement   1     True        1               True        1               8m55s

The ConfigMap test-cm should be deployed on all three member clusters, with the latest data:

apiVersion: v1
data:
  key: value2
kind: ConfigMap
metadata:
  ...
  name: test-cm
  namespace: test-namespace
  ...

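Stop a staged update run

To stop a running staged update run, patch the state field in the spec to Stop. This gracefully pauses the update run, letting clusters that are mid-update finish before the rollout halts.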

kubectl patch stagedupdaterun example-run -n test-namespace --type merge -p '{"spec":{"state":"Stop"}}'

The staged update run is initialized and no longer running:

kubectl get stagedupdaterun example-run -n test-namespace

Your output should look similar to the following example:

NAME          PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run   example-placement   1                         0                       True                      7s

After the update run stops, you can look at the status in more detail:

kubectl get stagedupdaterun example-run -n test-namespace -o yaml

Your output should look similar to the following example:

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateRun
metadata:
  ...
  name: example-run
  namespace: test-namespace
  ...
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
  state: Stop
status:
  conditions:
  - lastTransitionTime: "2025-07-22T21:28:08Z"
    message: StagedUpdateRun initialized successfully
    observedGeneration: 3
    reason: UpdateRunInitializedSuccessfully
    status: "True" # the updateRun is initialized successfully
    type: Initialized
  - lastTransitionTime: "2025-07-22T21:28:23Z"
    message: The update run has been stopped
    observedGeneration: 3
    reason: UpdateRunStopped
    status: "False" # the updateRun has stopped progressing
    type: Progressing
  deletionStageStatus:
    clusters: [] # no clusters need to be cleaned up
    stageName: kubernetes-fleet.io/deleteStage
  policyObservedClusterCount: 3 # number of clusters to be updated
  policySnapshotIndexUsed: "0"
  resourceSnapshotIndexUsed: "1"
  stagedUpdateStrategySnapshot: # snapshot of the strategy used for this update run
    stages:
    - afterStageTasks:
      - type: TimedWait
        waitTime: 1m0s
      labelSelector:
        matchLabels:
          environment: staging
      maxConcurrency: 1
      name: staging
    - beforeStageTasks:
      - type: Approval
      labelSelector:
        matchLabels:
          environment: canary
      maxConcurrency: 50%
      name: canary
      sortingLabelKey: order
  stagesStatus: # detailed status for each stage
  - clusters:
    - clusterName: member2 # stage staging contains member2 cluster only
      conditions:
      - lastTransitionTime: "2025-07-22T21:28:08Z"
        message: Cluster update started
        observedGeneration: 3
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-07-22T21:28:23Z"
        message: Cluster update completed successfully
        observedGeneration: 3
        reason: ClusterUpdatingSucceeded
        status: "True" # member2 is updated successfully
        type: Succeeded
    conditions:
    - lastTransitionTime: "2025-07-22T21:28:23Z"
      message: All the updating clusters have finished updating, the stage is now stopped, waiting to be resumed
      observedGeneration: 3
      reason: StageUpdatingStopped
      status: "False"
      type: Progressing
    stageName: staging
    startTime: "2025-07-22T21:28:08Z"

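Deploy a second staged update run to roll back to a previous version

Suppose the workload admin wants to roll back the ConfigMap change, reverting the value from value2 to value1. Instead of manually updating the ConfigMap on the hub, they can create a new StagedUpdateRun with the previous resource snapshot index ("0" in our context) and reuse the same strategy: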

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateRun
metadata:
  name: example-run-2
  namespace: test-namespace
spec:
  placementName: example-placement
  resourceSnapshotIndex: "0"
  stagedRolloutStrategyName: example-strategy
  state: Run

Let's check the new StagedUpdateRun:

kubectl get stagedupdaterun -n test-namespace

Your output should look similar to the following example:

NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True          True        13m
example-run-2   example-placement   0                         0                       True                      9s

After the one-minute TimedWait elapses, we should see the ApprovalRequest object created for the new StagedUpdateRun:

kubectl get approvalrequests -n test-namespace

Your output should look similar to the following example:

NAME                          STAGED-UPDATE-RUN   STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-2-before-canary   example-run-2       canary                                 75s
example-run-before-canary     example-run         canary   True       True               14m

To approve the new ApprovalRequest object, let's reuse the same approval.json file to patch it:

kubectl patch approvalrequests example-run-2-before-canary -n test-namespace --type='merge' --subresource=status --patch-file approval.json

Verify that the new object is approved:

kubectl get approvalrequests -n test-namespace

Your output should look similar to the following example:

NAME                          STAGED-UPDATE-RUN   STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-2-before-canary   example-run-2       canary   True       True               2m7s
example-run-before-canary     example-run         canary   True       True               15m

The ConfigMap test-cm should now be deployed on all three member clusters, with the data reverted to value1:

apiVersion: v1
data:
  key: value1
kind: ConfigMap
metadata:
  ...
  name: test-cm
  namespace: test-namespace
  ...

Clean up resources

After you finish this tutorial, you can clean up the resources you created:

kubectl delete stagedupdaterun example-run example-run-2 -n test-namespace
kubectl delete stagedupdatestrategy example-strategy -n test-namespace
kubectl delete resourceplacement example-placement -n test-namespace
kubectl delete clusterresourceplacement test-namespace-placement
kubectl delete namespace test-namespace

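Key differences between the approaches

Aspect                Cluster-scoped                                   Namespace-scoped
Strategy resource     ClusterStagedUpdateStrategy (short name: csus)   StagedUpdateStrategy (short name: sus)
Update run resource   ClusterStagedUpdateRun (short name: csur)        StagedUpdateRun (short name: sur)
Target placement      ClusterResourcePlacement (short name: crp)       ResourcePlacement (short name: rp)
Approval resource     ClusterApprovalRequest (short name: careq)       ApprovalRequest (short name: areq)
Snapshot resource     ClusterResourceSnapshot                          ResourceSnapshot
Scope                 Cluster-wide                                     Namespace-bound
Use case              Infrastructure rollouts                          Application rollouts
Permissions           Cluster-admin level                              Namespace level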
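
The short names in the table can save typing in day-to-day use. The following is a minimal sketch, assuming the short names csur and sur are registered on your hub cluster as listed. The helper name list_staged_update_runs is hypothetical.

```shell
# Hypothetical helper: lists staged update runs using the short names from the
# table. With a namespace argument it lists namespace-scoped runs (sur);
# without one it lists cluster-scoped runs (csur).
list_staged_update_runs() {
  if [ -n "${1:-}" ]; then
    kubectl get sur -n "$1"
  else
    kubectl get csur
  fi
}
```

For example, list_staged_update_runs test-namespace lists the namespace-scoped runs from this article, and list_staged_update_runs with no argument lists the cluster-scoped ones.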

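Next steps

In this article, you learned how to use staged update runs to orchestrate rollouts across member clusters. You created staged update strategies for both cluster-scoped and namespace-scoped deployments, executed progressive rollouts, and rolled back to previous versions.

To learn more about staged update runs and related concepts, see the following resources: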