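Applies to: ✔️ Fleet Manager with hub cluster
Azure Kubernetes Fleet Manager placement staged update runs provide a controlled way to deploy Kubernetes workloads across multiple member clusters through a stage-by-stage process. To minimize risk, this approach deploys to the targeted clusters sequentially, with optional wait times and approval gates between stages.
This article shows you how to create and execute staged update runs to deploy workloads progressively and, when needed, roll back to a previous version.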
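Azure Kubernetes Fleet Manager supports two scopes for staged updates:
- Cluster-scoped: Cluster administrators use ClusterStagedUpdateRun with ClusterResourcePlacement to manage infrastructure-level changes.
- Namespace-scoped (preview): Application teams use StagedUpdateRun with ResourcePlacement to manage rollouts within their specific namespaces.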
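The examples in this article demonstrate both approaches. The cluster-scoped walkthrough comes first, followed by the namespace-scoped walkthrough. Follow the one that matches your deployment scope.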
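Important
ResourcePlacement uses the placement.kubernetes-fleet.io/v1beta1 API version and is currently in preview. Some features demonstrated in this article, such as StagedUpdateStrategy, are also part of the v1beta1 API and aren't available in the v1 API.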
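Before you begin
Prerequisites
You need an Azure account with an active subscription. Create an account.
To understand the concepts and terminology used in this article, read the conceptual overview of staged rollout strategies.
You need Azure CLI version 2.58.0 or later installed to complete this article. To install or upgrade, see Install Azure CLI.
If you don't have the Kubernetes CLI (kubectl) already, you can install it by using this command:
az aks install-cli
You need the fleet Azure CLI extension. You can install it by running the following command:
az extension add --name fleet
Run the az extension update command to update to the latest version of the extension:
az extension update --name fleet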
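Configure the demo environment
This demo runs on a Fleet Manager with a hub cluster and three member clusters. If you don't have one, follow the quickstart to create a Fleet Manager with a hub cluster. Then join Azure Kubernetes Service (AKS) clusters as members.
This tutorial demonstrates staged update runs using a demo fleet environment with three member clusters that have the following labels:
| Cluster name | Labels |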
|---|---|
| member1 | environment=canary, order=2 |
| member2 | environment=staging |
| member3 | environment=canary, order=1 |
These labels let us create stages that group clusters by environment and control the deployment order within each stage.
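The article lists the labels but doesn't show how they get onto the member clusters. One possible approach, sketched here under the assumption that your identity is allowed to label MemberCluster objects on the hub cluster and that the MemberCluster names match the table above:

```shell
# Run against the hub cluster. member1..member3 are the MemberCluster names from the table.
kubectl label membercluster member1 environment=canary order=2 --overwrite
kubectl label membercluster member2 environment=staging --overwrite
kubectl label membercluster member3 environment=canary order=1 --overwrite

# Confirm the labels before creating any stages.
kubectl get memberclusters --show-labels
```

If your member clusters use different names, substitute them in each command.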
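Prepare Kubernetes workloads for placement
Publish Kubernetes workloads to the hub cluster so that they can be placed onto member clusters.
Create a namespace and a ConfigMap for the workload on the hub cluster: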
kubectl create namespace test-namespace
kubectl create configmap test-cm --from-literal=key=value1 -n test-namespace
To deploy the resources, create a ClusterResourcePlacement:
Note
Setting spec.strategy.type to External allows the rollout to be triggered by a ClusterStagedUpdateRun.
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
  name: example-placement
spec:
  resourceSelectors:
    - group: ""
      kind: Namespace
      name: test-namespace
      version: v1
  policy:
    placementType: PickAll
  strategy:
    type: External
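The manifests in this article are shown without the command that submits them. A minimal sketch, assuming you saved the manifest above as crp.yaml (a hypothetical file name) and that your current kubectl context points at the hub cluster:

```shell
# crp.yaml is an assumed file name holding the ClusterResourcePlacement manifest.
kubectl apply -f crp.yaml
```

The same kubectl apply -f pattern works for the strategy and update run manifests later in this article.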
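All three clusters should be scheduled because we use the PickAll policy, but no resources are deployed on the member clusters yet because we haven't created a ClusterStagedUpdateRun.
Verify that the placement is scheduled:
kubectl get clusterresourceplacement example-placement
The output should look similar to the following example:
NAME                GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
example-placement   1     True        1                                           51s
Work with resource snapshots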
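Fleet Manager creates resource snapshots when resources change. Each snapshot has a unique index that you can use to reference a specific version of your resources.
Tip
For more information about resource snapshots and how they work, see Understanding resource snapshots.
Check current resource snapshots
Check the current resource snapshots:
kubectl get clusterresourcesnapshots --show-labels
The output should look similar to the following example:
NAME                           GEN   AGE    LABELS
example-placement-0-snapshot   1     60s    kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0
We have only one version of the snapshot. It's the latest (kubernetes-fleet.io/is-latest-snapshot=true) and has resource index 0 (kubernetes-fleet.io/resource-index=0).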
Create a new resource snapshot
Now modify the ConfigMap with a new value:
kubectl edit configmap test-cm -n test-namespace
Update the value from value1 to value2, then confirm the change:
kubectl get configmap test-cm -n test-namespace -o yaml
The output should look similar to the following example:
apiVersion: v1
data:
  key: value2 # value updated here, old value: value1
kind: ConfigMap
metadata:
  creationTimestamp: ...
  name: test-cm
  namespace: test-namespace
  resourceVersion: ...
  uid: ...
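If you prefer a scripted change over the interactive kubectl edit step, a merge patch can make the same update. This is a sketch of an alternative, not a step from the original walkthrough:

```shell
# Non-interactive equivalent of the edit: set data.key to value2 with a merge patch.
kubectl patch configmap test-cm -n test-namespace --type merge -p '{"data":{"key":"value2"}}'
```

Either approach changes the ConfigMap on the hub cluster, which is what causes a new resource snapshot to be created.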
You should now see two versions of resource snapshots, with index 0 and index 1, and index 1 marked as the latest:
kubectl get clusterresourcesnapshots --show-labels
The output should look similar to the following example:
NAME                           GEN   AGE    LABELS
example-placement-0-snapshot   1     2m6s   kubernetes-fleet.io/is-latest-snapshot=false,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0
example-placement-1-snapshot   1     10s    kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=1
The latest label is set on example-placement-1-snapshot, which contains the latest ConfigMap data:
kubectl get clusterresourcesnapshots example-placement-1-snapshot -o yaml
The output should look similar to the following example:
apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceSnapshot
metadata:
  annotations:
    kubernetes-fleet.io/number-of-enveloped-object: "0"
    kubernetes-fleet.io/number-of-resource-snapshots: "1"
    kubernetes-fleet.io/resource-hash: 10dd7a3d1e5f9849afe956cfbac080a60671ad771e9bda7dd34415f867c75648
  creationTimestamp: "2025-07-22T21:26:54Z"
  generation: 1
  labels:
    kubernetes-fleet.io/is-latest-snapshot: "true"
    kubernetes-fleet.io/parent-CRP: example-placement
    kubernetes-fleet.io/resource-index: "1"
  name: example-placement-1-snapshot
  ownerReferences:
  - apiVersion: placement.kubernetes-fleet.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ClusterResourcePlacement
    name: example-placement
    uid: e7d59513-b3b6-4904-864a-c70678fd6f65
  resourceVersion: "19994"
  uid: 79ca0bdc-0b0a-4c40-b136-7f701e85cdb6
spec:
  selectedResources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      labels:
        kubernetes.io/metadata.name: test-namespace
      name: test-namespace
    spec:
      finalizers:
      - kubernetes
  - apiVersion: v1
    data:
      key: value2 # latest value: value2, old value: value1
    kind: ConfigMap
    metadata:
      name: test-cm
      namespace: test-namespace
Create a staged update strategy
A ClusterStagedUpdateStrategy defines the orchestration pattern that groups clusters into stages and specifies the rollout sequence. It selects member clusters by labels. For this demonstration, we create a strategy with two stages, staging and canary:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateStrategy
metadata:
  name: example-strategy
spec:
  stages:
    - name: staging
      labelSelector:
        matchLabels:
          environment: staging
      afterStageTasks:
        - type: TimedWait
          waitTime: 1m
      maxConcurrency: 1
    - name: canary
      labelSelector:
        matchLabels:
          environment: canary
      sortingLabelKey: order
      beforeStageTasks:
        - type: Approval
      maxConcurrency: 50%
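To make the stage layout concrete, the following sketch replays the demo labels from the table in this article and prints the order in which the two stages would visit the clusters. It's an illustration only, not Fleet Manager's implementation: the stage_members helper is hypothetical, and the sketch assumes that sortingLabelKey sorts clusters by ascending integer label value.

```shell
# Demo cluster labels copied from the table in this article: name, environment, order.
# A dash means the cluster has no "order" label.
clusters='member1 canary 2
member2 staging -
member3 canary 1'

# Hypothetical helper: list the clusters of one stage, sorted by the numeric "order" label.
stage_members() {
  printf '%s\n' "$clusters" |
    awk -v env="$1" '$2 == env { print $3, $1 }' |
    sort -n |
    awk '{ print $2 }' |
    xargs
}

echo "staging: $(stage_members staging)"
echo "canary: $(stage_members canary)"
```

Under those assumptions the staging stage contains only member2, and the canary stage updates member3 before member1 because member3 carries order=1.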
Prepare a staged update run to roll out changes
A ClusterStagedUpdateRun executes the rollout of a ClusterResourcePlacement by following a ClusterStagedUpdateStrategy. To trigger a staged update run for our ClusterResourcePlacement (CRP), we create a ClusterStagedUpdateRun that specifies the CRP name, the update run strategy name, the latest resource snapshot index ("1"), and the state Initialize:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  name: example-run
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
  state: Initialize
The staged update run is initialized but not running:
kubectl get clusterstagedupdaterun example-run
The output should look similar to the following example:
NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True                      7s
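Start a staged update run
To start the staged update run, patch the state field in the spec to Run:
kubectl patch clusterstagedupdaterun example-run --type merge -p '{"spec":{"state":"Run"}}'
Note
You can also create an update run with the state field initially set to Run, which initializes and starts the update run in a single step.
The staged update run is initialized and running:
kubectl get clusterstagedupdaterun example-run
The output should look similar to the following example:
NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True                      7s
After the one-minute TimedWait elapses, view the status in more detail:
kubectl get clusterstagedupdaterun example-run -o yaml
The output should look similar to the following example: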
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  ...
  name: example-run
  ...
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
  state: Run
status:
  conditions:
  - lastTransitionTime: "2025-07-22T21:28:08Z"
    message: ClusterStagedUpdateRun initialized successfully
    observedGeneration: 2
    reason: UpdateRunInitializedSuccessfully
    status: "True" # the updateRun is initialized successfully
    type: Initialized
  - lastTransitionTime: "2025-07-22T21:29:53Z"
    message: The updateRun is waiting for after-stage tasks in stage canary to complete
    observedGeneration: 2
    reason: UpdateRunWaiting
    status: "False" # the updateRun is still progressing and waiting for approval
    type: Progressing
  deletionStageStatus:
    clusters: [] # no clusters need to be cleaned up
    stageName: kubernetes-fleet.io/deleteStage
  policyObservedClusterCount: 3 # number of clusters to be updated
  policySnapshotIndexUsed: "0"
  resourceSnapshotIndexUsed: "1"
  stagedUpdateStrategySnapshot: # snapshot of the strategy used for this update run
    stages:
    - afterStageTasks:
      - type: TimedWait
        waitTime: 1m0s
      labelSelector:
        matchLabels:
          environment: staging
      maxConcurrency: 1
      name: staging
    - beforeStageTasks:
      - type: Approval
      labelSelector:
        matchLabels:
          environment: canary
      maxConcurrency: 50%
      name: canary
      sortingLabelKey: order
  stagesStatus: # detailed status for each stage
  - afterStageTaskStatus:
    - conditions:
      - lastTransitionTime: "2025-07-22T21:29:23Z"
        message: Wait time elapsed
        observedGeneration: 2
        reason: StageTaskWaitTimeElapsed
        status: "True" # the wait after-stage task has completed
        type: WaitTimeElapsed
      type: TimedWait
    clusters:
    - clusterName: member2 # stage staging contains member2 cluster only
      conditions:
      - lastTransitionTime: "2025-07-22T21:28:08Z"
        message: Cluster update started
        observedGeneration: 2
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-07-22T21:28:23Z"
        message: Cluster update completed successfully
        observedGeneration: 2
        reason: ClusterUpdatingSucceeded
        status: "True" # member2 is updated successfully
        type: Succeeded
    conditions:
    - lastTransitionTime: "2025-07-22T21:28:23Z"
      message: All clusters in the stage are updated and after-stage tasks are completed
      observedGeneration: 2
      reason: StageUpdatingSucceeded
      status: "False"
      type: Progressing
    - lastTransitionTime: "2025-07-22T21:29:23Z"
      message: Stage update completed successfully
      observedGeneration: 2
      reason: StageUpdatingSucceeded
      status: "True" # stage staging has completed successfully
      type: Succeeded
    endTime: "2025-07-22T21:29:23Z"
    stageName: staging
    startTime: "2025-07-22T21:28:08Z"
  - beforeStageTaskStatus:
    - approvalRequestName: example-run-before-canary # ClusterApprovalRequest name for this stage for before stage task
      conditions:
      - lastTransitionTime: "2025-07-22T21:29:53Z"
        message: ClusterApprovalRequest is created
        observedGeneration: 2
        reason: StageTaskApprovalRequestCreated
        status: "True"
        type: ApprovalRequestCreated
      type: Approval
    conditions:
    - lastTransitionTime: "2025-07-22T21:29:53Z"
      message: Not all before-stage tasks are completed, waiting for approval
      observedGeneration: 2
      reason: StageUpdatingWaiting
      status: "False" # stage canary is waiting for approval task completion
      type: Progressing
    stageName: canary
    startTime: "2025-07-22T21:29:23Z"
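We can see that the TimedWait period for the staging stage has elapsed and that the object for the approval task in the canary stage was created. We can check the generated ClusterApprovalRequest and confirm that no one has approved it yet:
kubectl get clusterapprovalrequest
The output should look similar to the following example:
NAME                          UPDATE-RUN      STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-before-canary     example-run     canary                                 2m39s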
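Reading the full YAML status is useful once, but for repeated checks it can be easier to pull out just the fields of interest. The following sketch uses kubectl's JSONPath output with the field names that appear in the status shown earlier; treat it as one possible way to query the run, not as a required step:

```shell
# Print the reason of the run-level Progressing condition (for example, UpdateRunWaiting).
kubectl get clusterstagedupdaterun example-run \
  -o jsonpath='{.status.conditions[?(@.type=="Progressing")].reason}{"\n"}'

# Print each stage name with the reason of its Progressing condition.
kubectl get clusterstagedupdaterun example-run \
  -o jsonpath='{range .status.stagesStatus[*]}{.stageName}{": "}{.conditions[?(@.type=="Progressing")].reason}{"\n"}{end}'
```

A stage that is blocked on approval should report a waiting reason such as StageUpdatingWaiting, matching the YAML status.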
Approve the staged update run
You can approve the ClusterApprovalRequest by creating a JSON patch file and applying it:
cat << EOF > approval.json
{
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
                "message": "lgtm",
                "observedGeneration": 1,
                "reason": "testPassed",
                "status": "True",
                "type": "Approved"
            }
        ]
    }
}
EOF
Submit a patch request to approve by using the JSON file you created:
kubectl patch clusterapprovalrequests example-run-before-canary --type='merge' --subresource=status --patch-file approval.json
Then verify that the request is approved:
kubectl get clusterapprovalrequest
The output should look similar to the following example:
NAME                          UPDATE-RUN      STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-before-canary     example-run     canary   True       True               3m35s
The ClusterStagedUpdateRun can now proceed and complete:
kubectl get clusterstagedupdaterun example-run
The output should look similar to the following example:
NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True          True        5m28s
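Instead of polling kubectl get by hand, a script can block until the run finishes. This sketch assumes the update run exposes a run-level condition of type Succeeded, as the SUCCEEDED column above suggests; the 10-minute timeout is an arbitrary value chosen for illustration:

```shell
# Block until the update run reports Succeeded=True, or give up after 10 minutes.
kubectl wait clusterstagedupdaterun/example-run --for=condition=Succeeded --timeout=10m
```

If the run is still waiting on an approval, the command times out rather than succeeding, so approve the request first.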
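Verify the rollout completion
The ClusterResourcePlacement also shows that the rollout is complete and that the resources are available on all member clusters:
kubectl get clusterresourceplacement example-placement
The output should look similar to the following example:
NAME                GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
example-placement   1     True        1               True        1               8m55s
The ConfigMap test-cm should be deployed on all three member clusters, with the latest data:
apiVersion: v1
data:
  key: value2
kind: ConfigMap
metadata:
  ...
  name: test-cm
  namespace: test-namespace
  ...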
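The ConfigMap above lives on the member clusters, so checking it means querying each member rather than the hub. A minimal sketch, assuming your kubeconfig has one context per member cluster and that the contexts are named member1, member2, and member3 (hypothetical names; substitute your own):

```shell
# member1..member3 are assumed kubeconfig context names; replace them with your own.
for ctx in member1 member2 member3; do
  printf '%s: ' "$ctx"
  kubectl --context "$ctx" get configmap test-cm -n test-namespace \
    -o jsonpath='{.data.key}{"\n"}'
done
```

After a successful rollout of resource snapshot index 1, each line should show value2.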
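Stop a staged update run
To stop a running cluster staged update run, patch the state field in the spec to Stop. This action gracefully pauses the update run, letting clusters that are already being updated finish before the whole rollout stops.
kubectl patch clusterstagedupdaterun example-run --type merge -p '{"spec":{"state":"Stop"}}'
The staged update run is initialized but no longer running:
kubectl get clusterstagedupdaterun example-run
The output should look similar to the following example:
NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True                      7s
After the in-progress cluster updates finish and the run stops, you can view the status in more detail:
kubectl get clusterstagedupdaterun example-run -o yaml
The output should look similar to the following example: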
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  ...
  name: example-run
  ...
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
  state: Stop
status:
  conditions:
  - lastTransitionTime: "2025-07-22T21:28:08Z"
    message: ClusterStagedUpdateRun initialized successfully
    observedGeneration: 3
    reason: UpdateRunInitializedSuccessfully
    status: "True" # the updateRun is initialized successfully
    type: Initialized
  - lastTransitionTime: "2025-07-22T21:28:23Z"
    message: The update run has been stopped
    observedGeneration: 3
    reason: UpdateRunStopped
    status: "False" # the updateRun has stopped progressing
    type: Progressing
  deletionStageStatus:
    clusters: [] # no clusters need to be cleaned up
    stageName: kubernetes-fleet.io/deleteStage
  policyObservedClusterCount: 3 # number of clusters to be updated
  policySnapshotIndexUsed: "0"
  resourceSnapshotIndexUsed: "1"
  stagedUpdateStrategySnapshot: # snapshot of the strategy used for this update run
    stages:
    - afterStageTasks:
      - type: TimedWait
        waitTime: 1m0s
      labelSelector:
        matchLabels:
          environment: staging
      maxConcurrency: 1
      name: staging
    - beforeStageTasks:
      - type: Approval
      labelSelector:
        matchLabels:
          environment: canary
      maxConcurrency: 50%
      name: canary
      sortingLabelKey: order
  stagesStatus: # detailed status for each stage
  - clusters:
    - clusterName: member2 # stage staging contains member2 cluster only
      conditions:
      - lastTransitionTime: "2025-07-22T21:28:08Z"
        message: Cluster update started
        observedGeneration: 3
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-07-22T21:28:23Z"
        message: Cluster update completed successfully
        observedGeneration: 3
        reason: ClusterUpdatingSucceeded
        status: "True" # member2 is updated successfully
        type: Succeeded
    conditions:
    - lastTransitionTime: "2025-07-22T21:28:23Z"
      message: All the updating clusters have finished updating, the stage is now stopped, waiting to be resumed
      observedGeneration: 3
      reason: StageUpdatingStopped
      status: "False"
      type: Progressing
    stageName: staging
    startTime: "2025-07-22T21:28:08Z"
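The stage condition in the output above says the stage is waiting to be resumed, but this article doesn't show the resume step. A minimal sketch, assuming the API accepts a transition from Stop back to Run using the same patch that started the run:

```shell
# Assumption: a stopped run resumes when spec.state is set back to Run.
kubectl patch clusterstagedupdaterun example-run --type merge -p '{"spec":{"state":"Run"}}'
```

If the patch is rejected in your environment, check the update run API reference for the supported state transitions.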
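Deploy a second staged update run to roll back to a previous version
Suppose the workload admin wants to roll back the ConfigMap change, reverting the value value2 back to value1. Instead of manually updating the ConfigMap on the hub cluster, they can create a new ClusterStagedUpdateRun with a previous resource snapshot index, "0" in this example, and reuse the same strategy:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  name: example-run-2
spec:
  placementName: example-placement
  resourceSnapshotIndex: "0"
  stagedRolloutStrategyName: example-strategy
  state: Run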
Let's check the new ClusterStagedUpdateRun:
kubectl get clusterstagedupdaterun
The output should look similar to the following example:
NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True          True        13m
example-run-2   example-placement   0                         0                       True                      9s
After the one-minute TimedWait elapses, we should see a ClusterApprovalRequest object created for the new ClusterStagedUpdateRun:
kubectl get clusterapprovalrequest
The output should look similar to the following example:
NAME                          UPDATE-RUN      STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-2-before-canary   example-run-2   canary                                 75s
example-run-before-canary     example-run     canary   True       True               14m
To approve the new ClusterApprovalRequest object, reuse the same approval.json file to patch it:
kubectl patch clusterapprovalrequests example-run-2-before-canary --type='merge' --subresource=status --patch-file approval.json
Verify that the new object is approved:
kubectl get clusterapprovalrequest
The output should look similar to the following example:
NAME                          UPDATE-RUN      STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-2-before-canary   example-run-2   canary   True       True               2m7s
example-run-before-canary     example-run     canary   True       True               15m
The ConfigMap test-cm should now be deployed on all three member clusters, with its data reverted to value1:
apiVersion: v1
data:
  key: value1
kind: ConfigMap
metadata:
  ...
  name: test-cm
  namespace: test-namespace
  ...
Clean up resources
When you finish this tutorial, you can clean up the resources you created:
kubectl delete clusterstagedupdaterun example-run example-run-2
kubectl delete clusterstagedupdatestrategy example-strategy
kubectl delete clusterresourceplacement example-placement
kubectl delete namespace test-namespace
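Prepare Kubernetes workloads for placement (namespace-scoped)
Publish Kubernetes workloads to the hub cluster so that they can be placed onto member clusters.
Create a namespace and a ConfigMap for the workload on the hub cluster: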
kubectl create namespace test-namespace
kubectl create configmap test-cm --from-literal=key=value1 -n test-namespace
Because ResourcePlacement is namespace-scoped, first use a ClusterResourcePlacement to deploy the namespace to all member clusters, specifying NamespaceOnly as the selection scope:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
  name: test-namespace-placement
spec:
  resourceSelectors:
    - group: ""
      kind: Namespace
      name: test-namespace
      version: v1
      selectionScope: NamespaceOnly
  policy:
    placementType: PickAll
Verify that the namespace is deployed to all member clusters:
kubectl get clusterresourceplacement test-namespace-placement
The output should look similar to the following example:
NAME                       GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
test-namespace-placement   1     True        1               True        1               30s
To deploy the ConfigMap, create a namespace-scoped ResourcePlacement:
Note
Setting spec.strategy.type to External allows the rollout to be triggered by a StagedUpdateRun.
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ResourcePlacement
metadata:
  name: example-placement
  namespace: test-namespace
spec:
  resourceSelectors:
    - group: ""
      kind: ConfigMap
      name: test-cm
      version: v1
  policy:
    placementType: PickAll
  strategy:
    type: External
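All three clusters should be scheduled because we use the PickAll policy, but the ConfigMap shouldn't be deployed on the member clusters yet because we haven't created a StagedUpdateRun.
Verify that the placement is scheduled:
kubectl get resourceplacement example-placement -n test-namespace
The output should look similar to the following example:
NAME                GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
example-placement   1     True        1                                           51s
Work with resource snapshots
Fleet Manager creates resource snapshots when resources change. Each snapshot has a unique index that you can use to reference a specific version of your resources.
Tip
For more information about resource snapshots and how they work, see Understanding resource snapshots.
Check current resource snapshots
Check the current resource snapshots:
kubectl get resourcesnapshots -n test-namespace --show-labels
The output should look similar to the following example:
NAME                           GEN   AGE    LABELS
example-placement-0-snapshot   1     60s    kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0
We have only one version of the snapshot. It's the latest (kubernetes-fleet.io/is-latest-snapshot=true) and has resource index 0 (kubernetes-fleet.io/resource-index=0).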
Create a new resource snapshot
Now modify the ConfigMap with a new value:
kubectl edit configmap test-cm -n test-namespace
Update the value from value1 to value2, then confirm the change:
kubectl get configmap test-cm -n test-namespace -o yaml
The output should look similar to the following example:
apiVersion: v1
data:
  key: value2 # value updated here, old value: value1
kind: ConfigMap
metadata:
  creationTimestamp: ...
  name: test-cm
  namespace: test-namespace
  resourceVersion: ...
  uid: ...
You should now see two versions of resource snapshots, with index 0 and index 1:
kubectl get resourcesnapshots -n test-namespace --show-labels
The output should look similar to the following example:
NAME                           GEN   AGE    LABELS
example-placement-0-snapshot   1     2m6s   kubernetes-fleet.io/is-latest-snapshot=false,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0
example-placement-1-snapshot   1     10s    kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=1
The latest label is set on example-placement-1-snapshot, which contains the latest ConfigMap data:
kubectl get resourcesnapshots example-placement-1-snapshot -n test-namespace -o yaml
The output should look similar to the following example:
apiVersion: placement.kubernetes-fleet.io/v1
kind: ResourceSnapshot
metadata:
  annotations:
    kubernetes-fleet.io/number-of-enveloped-object: "0"
    kubernetes-fleet.io/number-of-resource-snapshots: "1"
    kubernetes-fleet.io/resource-hash: 10dd7a3d1e5f9849afe956cfbac080a60671ad771e9bda7dd34415f867c75648
  creationTimestamp: "2025-07-22T21:26:54Z"
  generation: 1
  labels:
    kubernetes-fleet.io/is-latest-snapshot: "true"
    kubernetes-fleet.io/parent-CRP: example-placement
    kubernetes-fleet.io/resource-index: "1"
  name: example-placement-1-snapshot
  namespace: test-namespace
  ownerReferences:
  - apiVersion: placement.kubernetes-fleet.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ResourcePlacement
    name: example-placement
    uid: e7d59513-b3b6-4904-864a-c70678fd6f65
  resourceVersion: "19994"
  uid: 79ca0bdc-0b0a-4c40-b136-7f701e85cdb6
spec:
  selectedResources:
  - apiVersion: v1
    data:
      key: value2 # latest value: value2, old value: value1
    kind: ConfigMap
    metadata:
      name: test-cm
      namespace: test-namespace
Create a staged update strategy
A StagedUpdateStrategy defines the orchestration pattern that groups clusters into stages and specifies the rollout sequence. It selects member clusters by labels. For this demonstration, we create a strategy with two stages, staging and canary:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateStrategy
metadata:
  name: example-strategy
  namespace: test-namespace
spec:
  stages:
    - name: staging
      labelSelector:
        matchLabels:
          environment: staging
      afterStageTasks:
        - type: TimedWait
          waitTime: 1m
      maxConcurrency: 1
    - name: canary
      labelSelector:
        matchLabels:
          environment: canary
      sortingLabelKey: order
      beforeStageTasks:
        - type: Approval
      maxConcurrency: 50%
Prepare a staged update run to roll out changes
A StagedUpdateRun executes the rollout of a ResourcePlacement by following a StagedUpdateStrategy. To trigger a staged update run for our ResourcePlacement (RP), we create a StagedUpdateRun that specifies the RP name, the update run strategy name, the latest resource snapshot index ("1"), and the state Initialize:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateRun
metadata:
  name: example-run
  namespace: test-namespace
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
  state: Initialize
The staged update run is initialized but not running:
kubectl get stagedupdaterun example-run -n test-namespace
The output should look similar to the following example:
NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True                      7s
Start a staged update run
To start the staged update run, patch the state field in the spec to Run:
kubectl patch stagedupdaterun example-run -n test-namespace --type merge -p '{"spec":{"state":"Run"}}'
Note
You can also create an update run with the state field initially set to Run, which initializes and starts the update run in a single step.
The staged update run is initialized and running:
kubectl get stagedupdaterun example-run -n test-namespace
The output should look similar to the following example:
NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True                      7s
After the one-minute TimedWait elapses, check for the approval request:
kubectl get approvalrequests -n test-namespace
The output should look similar to the following example:
NAME                          STAGED-UPDATE-RUN   STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-before-canary     example-run         canary                                 2m39s
Approve the staged update run
You can approve the ApprovalRequest by creating a JSON patch file and applying it:
cat << EOF > approval.json
{
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
                "message": "lgtm",
                "observedGeneration": 1,
                "reason": "testPassed",
                "status": "True",
                "type": "Approved"
            }
        ]
    }
}
EOF
Submit a patch request to approve by using the JSON file you created:
kubectl patch approvalrequests example-run-before-canary -n test-namespace --type='merge' --subresource=status --patch-file approval.json
Then verify that the request is approved:
kubectl get approvalrequests -n test-namespace
The output should look similar to the following example:
NAME                          STAGED-UPDATE-RUN   STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-before-canary     example-run         canary   True       True               3m35s
The StagedUpdateRun can now proceed and complete:
kubectl get stagedupdaterun example-run -n test-namespace
The output should look similar to the following example:
NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True          True        5m28s
Verify the rollout completion
The ResourcePlacement also shows that the rollout is complete and that the resources are available on all member clusters:
kubectl get resourceplacement example-placement -n test-namespace
The output should look similar to the following example:
NAME                GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
example-placement   1     True        1               True        1               8m55s
The ConfigMap test-cm should be deployed on all three member clusters, with the latest data:
apiVersion: v1
data:
  key: value2
kind: ConfigMap
metadata:
  ...
  name: test-cm
  namespace: test-namespace
  ...
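Stop a staged update run
To stop a running staged update run, patch the state field in the spec to Stop. This action gracefully pauses the update run, letting clusters that are already being updated finish before the whole rollout stops.
kubectl patch stagedupdaterun example-run -n test-namespace --type merge -p '{"spec":{"state":"Stop"}}'
The staged update run is initialized but no longer running:
kubectl get stagedupdaterun example-run -n test-namespace
The output should look similar to the following example:
NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True                      7s
After the in-progress cluster updates finish and the run stops, you can view the status in more detail:
kubectl get stagedupdaterun example-run -n test-namespace -o yaml
The output should look similar to the following example: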
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateRun
metadata:
  ...
  name: example-run
  namespace: test-namespace
  ...
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
  state: Stop
status:
  conditions:
  - lastTransitionTime: "2025-07-22T21:28:08Z"
    message: StagedUpdateRun initialized successfully
    observedGeneration: 3
    reason: UpdateRunInitializedSuccessfully
    status: "True" # the updateRun is initialized successfully
    type: Initialized
  - lastTransitionTime: "2025-07-22T21:28:23Z"
    message: The update run has been stopped
    observedGeneration: 3
    reason: UpdateRunStopped
    status: "False" # the updateRun has stopped progressing
    type: Progressing
  deletionStageStatus:
    clusters: [] # no clusters need to be cleaned up
    stageName: kubernetes-fleet.io/deleteStage
  policyObservedClusterCount: 3 # number of clusters to be updated
  policySnapshotIndexUsed: "0"
  resourceSnapshotIndexUsed: "1"
  stagedUpdateStrategySnapshot: # snapshot of the strategy used for this update run
    stages:
    - afterStageTasks:
      - type: TimedWait
        waitTime: 1m0s
      labelSelector:
        matchLabels:
          environment: staging
      maxConcurrency: 1
      name: staging
    - beforeStageTasks:
      - type: Approval
      labelSelector:
        matchLabels:
          environment: canary
      maxConcurrency: 50%
      name: canary
      sortingLabelKey: order
  stagesStatus: # detailed status for each stage
  - clusters:
    - clusterName: member2 # stage staging contains member2 cluster only
      conditions:
      - lastTransitionTime: "2025-07-22T21:28:08Z"
        message: Cluster update started
        observedGeneration: 3
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-07-22T21:28:23Z"
        message: Cluster update completed successfully
        observedGeneration: 3
        reason: ClusterUpdatingSucceeded
        status: "True" # member2 is updated successfully
        type: Succeeded
    conditions:
    - lastTransitionTime: "2025-07-22T21:28:23Z"
      message: All the updating clusters have finished updating, the stage is now stopped, waiting to be resumed
      observedGeneration: 3
      reason: StageUpdatingStopped
      status: "False"
      type: Progressing
    stageName: staging
    startTime: "2025-07-22T21:28:08Z"
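Deploy a second staged update run to roll back to a previous version
Suppose the workload admin wants to roll back the ConfigMap change, reverting the value value2 back to value1. Instead of manually updating the ConfigMap on the hub cluster, they can create a new StagedUpdateRun with a previous resource snapshot index, "0" in this example, and reuse the same strategy:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateRun
metadata:
  name: example-run-2
  namespace: test-namespace
spec:
  placementName: example-placement
  resourceSnapshotIndex: "0"
  stagedRolloutStrategyName: example-strategy
  state: Run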
Let's check the new StagedUpdateRun:
kubectl get stagedupdaterun -n test-namespace
The output should look similar to the following example:
NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True          True        13m
example-run-2   example-placement   0                         0                       True                      9s
After the one-minute TimedWait elapses, we should see an ApprovalRequest object created for the new StagedUpdateRun:
kubectl get approvalrequests -n test-namespace
The output should look similar to the following example:
NAME                          STAGED-UPDATE-RUN   STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-2-before-canary   example-run-2       canary                                 75s
example-run-before-canary     example-run         canary   True       True               14m
To approve the new ApprovalRequest object, reuse the same approval.json file to patch it:
kubectl patch approvalrequests example-run-2-before-canary -n test-namespace --type='merge' --subresource=status --patch-file approval.json
Verify that the new object is approved:
kubectl get approvalrequests -n test-namespace
The output should look similar to the following example:
NAME                          STAGED-UPDATE-RUN   STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-2-before-canary   example-run-2       canary   True       True               2m7s
example-run-before-canary     example-run         canary   True       True               15m
The ConfigMap test-cm should now be deployed on all three member clusters, with its data reverted to value1:
apiVersion: v1
data:
  key: value1
kind: ConfigMap
metadata:
  ...
  name: test-cm
  namespace: test-namespace
  ...
Clean up resources
When you finish this tutorial, you can clean up the resources you created:
kubectl delete stagedupdaterun example-run example-run-2 -n test-namespace
kubectl delete stagedupdatestrategy example-strategy -n test-namespace
kubectl delete resourceplacement example-placement -n test-namespace
kubectl delete clusterresourceplacement test-namespace-placement
kubectl delete namespace test-namespace
Key differences between approaches
| Aspect | Cluster-scoped | Namespace-scoped |
|---|---|---|
| Strategy resource | ClusterStagedUpdateStrategy (short name: csus) | StagedUpdateStrategy (short name: sus) |
| Update run resource | ClusterStagedUpdateRun (short name: csur) | StagedUpdateRun (short name: sur) |
| Target placement | ClusterResourcePlacement (short name: crp) | ResourcePlacement (short name: rp) |
| Approval resource | ClusterApprovalRequest (short name: careq) | ApprovalRequest (short name: areq) |
| Snapshot resource | ClusterResourceSnapshot | ResourceSnapshot |
| Scope | Cluster-wide | Namespace-bound |
| Use case | Infrastructure rollouts | Application rollouts |
| Permissions | Cluster admin level | Namespace level |
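The short names in the table can stand in for the full resource names in kubectl commands. The following sketch assumes the short names are registered by the CRDs installed on your hub cluster, exactly as listed in the table:

```shell
# Cluster-scoped resources, using the short names from the table.
kubectl get crp example-placement
kubectl get csur example-run
kubectl get careq

# Namespace-scoped resources need the namespace flag.
kubectl get sur,areq -n test-namespace
```

If a short name isn't recognized, kubectl api-resources lists the names that your cluster actually accepts.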
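Next steps
In this article, you learned how to use staged update runs to orchestrate rollouts across member clusters. You created staged update strategies for both cluster-scoped and namespace-scoped deployments, executed progressive rollouts, and rolled back to a previous version.
To learn more about staged update runs and related concepts, see the following resources: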