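When you use Azure Kubernetes Fleet Manager, understanding the status of a ClusterResourcePlacement (CRP) resource is essential for monitoring deployment progress and troubleshooting issues. This article is a comprehensive guide to interpreting the status fields and conditions that Fleet Manager reports for cluster-scoped and namespace-scoped placements.
Prerequisites
- You have a Fleet Manager with a hub cluster and one or more member clusters. If you don't have one, see Create an Azure Kubernetes Fleet Manager resource and join member clusters.
- You have access to the Fleet Manager hub cluster. For more information, see Access the Kubernetes API of the Azure Kubernetes Fleet Manager hub cluster.
- You have deployed at least one ClusterResourcePlacement API object that places resources onto the fleet. If you haven't, see Use cluster resource placement to deploy workloads across multiple clusters.
Overview of the placement status structure
A ClusterResourcePlacement object contains not only the descriptive specification of the placement but also the status of the placement operation.
The status section provides details about:
- The overall placement state, expressed through conditions
- The resources selected by the placement
- The per-cluster placement state, expressed through conditions
- Failed, drifted, and diffed placements in each cluster
To view the placement status, use the following command: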
kubectl describe clusterresourceplacement <placement-name>
Alternatively, get the raw YAML output:
kubectl get clusterresourceplacement <placement-name> -o yaml
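Top-level status fields
The status section contains the following top-level fields:
- selectedResources: the list of resources selected by the placement
- conditions: an array of overall placement conditions
- observedResourceIndex: the index of the current resource snapshot
- placementStatuses: per-cluster placement status information
The following sections describe each field in detail.
Understand selected resources
The selectedResources field lists all resources selected by the placement. Use this field to check that the placement includes the resources you expect. Here's an example: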
selectedResources:
- kind: Namespace
  name: test
  version: v1
- kind: ConfigMap
  name: test-config
  namespace: test
  version: v1
  envelope:
    name: example-envelope
    namespace: test
    type: ResourceEnvelope
- kind: Deployment
  name: web-app
  namespace: test
  group: apps
  version: v1
- kind: Service
  name: web-service
  namespace: test
  version: v1
- kind: Secret
  name: app-secrets
  namespace: test
  version: v1
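Each resource entry includes:
- group: the API group (empty for core resources)
- version: the API version (for example, v1 or v1beta1)
- kind: the resource type
- name: the resource name
- namespace: the namespace (for namespaced resources)
- envelope: if the resource is wrapped in an envelope, the envelope metadata is also provided, which includes:
  - name: the name of the envelope
  - namespace: the namespace of the envelope
  - type: the type of the envelope (for example, ResourceEnvelope)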
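As a rough illustration of checking that a placement picked up the resources you expect, the following Python sketch compares a list of expected resources against selectedResources. The helper names (resource_key, missing_resources) and the tuple layout are hypothetical and not part of Fleet Manager. The input is assumed to be the status already parsed into Python dictionaries, for example from the output of kubectl get clusterresourceplacement <placement-name> -o json.

```python
# Hypothetical helpers for illustration only; they are not part of Fleet Manager.

def resource_key(entry):
    """Return (group, version, kind, namespace, name) for a selectedResources entry."""
    return (
        entry.get("group", ""),      # core resources have an empty group
        entry.get("version", ""),
        entry.get("kind", ""),
        entry.get("namespace", ""),  # cluster-scoped resources have no namespace
        entry.get("name", ""),
    )


def missing_resources(selected, expected):
    """Return the expected keys that are absent from selectedResources."""
    present = {resource_key(entry) for entry in selected}
    return [key for key in expected if key not in present]


# A shortened copy of the selectedResources example above.
selected_resources = [
    {"kind": "Namespace", "name": "test", "version": "v1"},
    {"kind": "Deployment", "name": "web-app", "namespace": "test",
     "group": "apps", "version": "v1"},
    {"kind": "Service", "name": "web-service", "namespace": "test", "version": "v1"},
]

# Resources the operator expects the placement to include.
expected = [
    ("apps", "v1", "Deployment", "test", "web-app"),
    ("", "v1", "Secret", "test", "app-secrets"),
]

print(missing_resources(selected_resources, expected))
```

In this shortened data set the Secret is reported as missing because it was left out of the list on purpose.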
 
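Understand placement conditions
The conditions array provides high-level status information about the placement as a whole. Each condition follows the common Kubernetes condition definition, which has the following standard fields:
- type: the condition type (described in the following sections)
- status: True, False, or Unknown
- reason: a short reason code for the condition
- message: a human-readable description
- lastTransitionTime: when the condition last changed
- observedGeneration: the generation of the placement when the condition was set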
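The standard fields can be read programmatically. The sketch below looks up a condition by type and treats it as trustworthy only when its observedGeneration matches the placement's metadata.generation. That comparison follows the usual Kubernetes convention and is an assumption here, not a rule stated by this article. The function names are hypothetical, and status values are strings ("True", "False", "Unknown"), not booleans.

```python
# Hypothetical helpers for illustration only.

def find_condition(conditions, condition_type):
    """Return the first condition with the given type, or None if it is absent."""
    for condition in conditions:
        if condition.get("type") == condition_type:
            return condition
    return None


def is_current_and_true(condition, generation):
    """True only if the condition is "True" and was set for this generation."""
    if condition is None:
        return False
    return (
        condition.get("status") == "True"
        and condition.get("observedGeneration") == generation
    )


conditions = [
    {
        "type": "ClusterResourcePlacementScheduled",
        "status": "True",
        "reason": "SchedulingPolicyFulfilled",
        "observedGeneration": 5,
    }
]

scheduled = find_condition(conditions, "ClusterResourcePlacementScheduled")
print(is_current_and_true(scheduled, generation=5))
print(is_current_and_true(scheduled, generation=6))
```

The second call models a placement whose spec changed after the condition was written, so the condition is treated as stale.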
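ClusterResourcePlacement condition types
The following condition types are available for ClusterResourcePlacement:
ClusterResourcePlacementScheduled
Indicates whether the placement was successfully scheduled to target clusters.
- True: all required clusters were selected according to the placement policy
- False: scheduling failed (for example, not enough clusters are available)
- Unknown: the scheduling decision is pending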
conditions:
- type: ClusterResourcePlacementScheduled
  status: "True"
  reason: SchedulingPolicyFulfilled
  message: "found all the clusters needed as specified by the scheduling policy"
  lastTransitionTime: "2023-11-10T08:14:52Z"
  observedGeneration: 5
ClusterResourcePlacementRolloutStarted
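Indicates whether rollout has started on the selected clusters.
- True: resources have started rolling out to the scheduled clusters
- False: rollout hasn't started yet
- Unknown: the rollout decision is pending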
conditions:
- type: ClusterResourcePlacementRolloutStarted
  status: "True"
  reason: RolloutStarted
  message: "All 3 cluster(s) start rolling out the latest resource"
  lastTransitionTime: "2023-11-10T08:15:30Z"
  observedGeneration: 5
ClusterResourcePlacementOverridden
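Indicates whether resource overrides were applied successfully.
- True: all applicable overrides were processed
- False: some overrides couldn't be applied
- Unknown: override processing is pending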
conditions:
- type: ClusterResourcePlacementOverridden
  status: "True"
  reason: NoOverrideSpecified
  message: "No override rules are configured for the selected resources"
  lastTransitionTime: "2023-11-10T08:15:45Z"
  observedGeneration: 5
ClusterResourcePlacementWorkSynchronized
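Indicates whether Work objects were created in each per-cluster namespace on the hub cluster.
- True: all Work objects are synchronized
- False: Work synchronization failed or is incomplete
- Unknown: Work synchronization is pending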
conditions:
- type: ClusterResourcePlacementWorkSynchronized  
  status: "True"
  reason: SynchronizeSucceeded
  message: "All 2 cluster(s) are synchronized to the latest resources on the hub cluster"
  lastTransitionTime: "2023-11-10T08:23:43Z"
  observedGeneration: 5
ClusterResourcePlacementApplied
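Indicates whether all resources were successfully applied to the member clusters.
- True: all resources were successfully applied to all target clusters
- False: some resources couldn't be applied (check failedPlacements)
- Unknown: the apply operation is pending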
conditions:
- type: ClusterResourcePlacementApplied
  status: "True"
  reason: ApplySucceeded
  message: "The selected resources are successfully applied to 3 clusters"
  lastTransitionTime: "2023-11-10T08:16:15Z"
  observedGeneration: 5
ClusterResourcePlacementAvailable
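Indicates whether all placed resources are available and ready on the member clusters.
- True: all resources are available on all target clusters
- False: some resources aren't available yet
- Unknown: the availability check is pending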
conditions:
- type: ClusterResourcePlacementAvailable
  status: "True"
  reason: ResourceAvailable
  message: "The selected resources in 3 clusters are available now"
  lastTransitionTime: "2023-11-10T08:16:30Z"
  observedGeneration: 5
ClusterResourcePlacementDiffReported
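Indicates whether configuration differences were reported (when the ReportDiff strategy is used).
- True: a complete diff report is available
- False: diff reporting failed or is incomplete
- Unknown: diff reporting is pending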
conditions:
- type: ClusterResourcePlacementDiffReported
  status: "True"
  reason: DiffReportComplete
  message: "Configuration differences are reported for all target clusters"
  lastTransitionTime: "2023-11-10T08:16:45Z"
  observedGeneration: 5
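One way to read these conditions together is to walk them in sequence and stop at the first one that is not True. The sketch below assumes the conditions progress in the order this article lists them (Scheduled through Available). That ordering is an assumption made for illustration, and ClusterResourcePlacementDiffReported is deliberately left out. A condition that is missing from the array is treated as Unknown.

```python
# Assumed progression order, taken from the order of the sections above.
CRP_CONDITION_ORDER = [
    "ClusterResourcePlacementScheduled",
    "ClusterResourcePlacementRolloutStarted",
    "ClusterResourcePlacementOverridden",
    "ClusterResourcePlacementWorkSynchronized",
    "ClusterResourcePlacementApplied",
    "ClusterResourcePlacementAvailable",
]


def first_unmet_condition(conditions):
    """Return (type, status) of the first condition that is not "True", or None."""
    by_type = {condition["type"]: condition for condition in conditions}
    for condition_type in CRP_CONDITION_ORDER:
        condition = by_type.get(condition_type)
        status = condition["status"] if condition else "Unknown"
        if status != "True":
            return condition_type, status
    return None


# Illustrative data: everything succeeded up to the apply stage.
conditions = [
    {"type": "ClusterResourcePlacementScheduled", "status": "True"},
    {"type": "ClusterResourcePlacementRolloutStarted", "status": "True"},
    {"type": "ClusterResourcePlacementOverridden", "status": "True"},
    {"type": "ClusterResourcePlacementWorkSynchronized", "status": "True"},
    {"type": "ClusterResourcePlacementApplied", "status": "False"},
]

print(first_unmet_condition(conditions))
```

Here the function points at ClusterResourcePlacementApplied, which is the stage to investigate first.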
Understand resource snapshots
The observedResourceIndex field indicates the resource snapshot that is currently being deployed:
observedResourceIndex: "1"
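Fleet Manager creates a resource snapshot when:
- The resource selectors change
- A selected resource is modified
Each snapshot has a unique index. You can view a snapshot by using the following command: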
kubectl get clusterresourcesnapshot --selector=kubernetes-fleet.io/resource-index=1
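Understand per-cluster placement status
The placementStatuses array contains the detailed status of each cluster where resources are placed or where placement was attempted: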
placementStatuses:
- clusterName: aks-member-1
  observedResourceIndex: "1"
  conditions:
    - type: ResourceScheduled
      status: "True"
      reason: ScheduleSucceeded
      message: "Successfully scheduled resources for placement in aks-member-1"
      lastTransitionTime: "2023-11-10T08:14:52Z"
      observedGeneration: 5
    - type: RolloutStarted
      status: "True"
      reason: RolloutStarted
      message: "Detected the new changes on the resources and started the rollout process"
      lastTransitionTime: "2023-11-10T08:15:30Z"
      observedGeneration: 5
    - type: Overridden
      status: "True"
      reason: NoOverrideSpecified
      message: "No override rules are configured for the selected resources"
      lastTransitionTime: "2023-11-10T08:15:45Z"
      observedGeneration: 5
    - type: WorkSynchronized
      status: "True"
      reason: AllWorkSynced
      message: "All of the works are synchronized to the latest"
      lastTransitionTime: "2023-11-10T08:16:00Z"
      observedGeneration: 5
    - type: Applied
      status: "True"
      reason: AllWorkHaveBeenApplied
      message: "All corresponding work objects are applied"
      lastTransitionTime: "2023-11-10T08:16:15Z"
      observedGeneration: 5
    - type: Available
      status: "True"
      reason: ResourceAvailable
      message: "All resources are available on the target cluster"
      lastTransitionTime: "2023-11-10T08:16:30Z"
      observedGeneration: 5
  failedPlacements: []
  driftedPlacements: []
  diffedPlacements: []
- clusterName: aks-member-2
  observedResourceIndex: "1"
  conditions:
    - type: ResourceScheduled
      status: "True"
      reason: ScheduleSucceeded
      message: "Successfully scheduled resources for placement in aks-member-2"
      lastTransitionTime: "2023-11-10T08:14:52Z"
      observedGeneration: 5
    - type: Applied
      status: "False"
      reason: AppliedManifestFailedReason
      message: "Failed to apply some manifests"
      lastTransitionTime: "2023-11-10T08:16:15Z"
      observedGeneration: 5
  failedPlacements:
    - kind: Deployment
      name: web-app
      namespace: test
      group: apps
      version: v1
      condition:
        type: Applied
        status: "False"
        reason: AppliedManifestFailedReason
        message: "Failed to apply manifest: insufficient resources"
        lastTransitionTime: "2023-11-10T08:16:15Z"
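Per-cluster condition types
Each cluster's status includes conditions that track the deployment lifecycle:
ResourceScheduled
Indicates whether the cluster was successfully selected for placement.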
- type: ResourceScheduled
  status: "True"
  reason: ScheduleSucceeded
  message: "Successfully scheduled resources for placement in aks-member-1 (affinity score: 0, topology spread score: 0): picked by scheduling policy"
  lastTransitionTime: "2023-11-10T08:14:52Z"
  observedGeneration: 5
RolloutStarted
Indicates whether rollout has started on this specific cluster.
- type: RolloutStarted
  status: "True"
  reason: RolloutStarted
  message: "Detected new changes on the resources and started the rollout process"
  lastTransitionTime: "2023-11-10T08:15:30Z"
  observedGeneration: 5
Overridden
Indicates whether resource overrides were applied for this cluster.
- type: Overridden
  status: "True"
  reason: NoOverrideSpecified
  message: "No override rules are configured for the selected resources"
  lastTransitionTime: "2023-11-10T08:15:45Z"
  observedGeneration: 5
WorkSynchronized
Indicates whether Work objects were created for this cluster.
- type: WorkSynchronized
  status: "True"
  reason: AllWorkSynced
  message: "All of the works are synchronized to the latest"
  lastTransitionTime: "2023-11-10T08:16:00Z"
  observedGeneration: 5
Applied
Indicates whether all resources were successfully applied to this cluster.
- type: Applied
  status: "True"
  reason: AllWorkHaveBeenApplied
  message: "All corresponding work objects are applied"
  lastTransitionTime: "2023-11-10T08:16:15Z"
  observedGeneration: 5
Available
Indicates whether all resources are available and ready on this cluster.
- type: Available
  status: "True"
  reason: ResourceAvailable
  message: "All resources are available on the target cluster"
  lastTransitionTime: "2023-11-10T08:16:30Z"
  observedGeneration: 5
DiffReported
Indicates whether configuration differences were reported for this cluster.
- type: DiffReported
  status: "True"
  reason: DiffReportComplete
  message: "Configuration differences are reported for this cluster"
  lastTransitionTime: "2023-11-10T08:16:45Z"
  observedGeneration: 5
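To see at a glance which clusters need attention, the per-cluster conditions can be reduced to the ones that are not True. The following sketch is one possible approach. The function name cluster_summary is hypothetical, and the input is a trimmed copy of the aks-member-1 and aks-member-2 example shown earlier.

```python
# Hypothetical helper for illustration only.

def cluster_summary(placement_statuses):
    """Map each clusterName to its conditions whose status is not "True"."""
    summary = {}
    for entry in placement_statuses:
        not_true = [
            (condition["type"], condition["status"], condition.get("reason", ""))
            for condition in entry.get("conditions", [])
            if condition.get("status") != "True"
        ]
        summary[entry["clusterName"]] = not_true
    return summary


placement_statuses = [
    {
        "clusterName": "aks-member-1",
        "conditions": [
            {"type": "ResourceScheduled", "status": "True", "reason": "ScheduleSucceeded"},
            {"type": "Applied", "status": "True", "reason": "AllWorkHaveBeenApplied"},
        ],
    },
    {
        "clusterName": "aks-member-2",
        "conditions": [
            {"type": "ResourceScheduled", "status": "True", "reason": "ScheduleSucceeded"},
            {"type": "Applied", "status": "False", "reason": "AppliedManifestFailedReason"},
        ],
    },
]

for cluster, problems in cluster_summary(placement_statuses).items():
    print(cluster, problems)
```

An empty list means every reported condition for that cluster is True. It does not prove that every expected condition type is present.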
Understand failed placements
When a resource can't be applied to a cluster, the details are recorded in the failedPlacements array:
failedPlacements:
- kind: Deployment
  name: my-app
  namespace: default
  group: apps
  version: v1
  condition:
    type: Applied
    status: "False"
    reason: AppliedManifestFailedReason
    message: "Failed to apply manifest: namespaces 'app' not found"
    lastTransitionTime: "2023-12-06T00:09:53Z"
  envelope:
    name: example
    namespace: app
    type: ResourceEnvelope
- kind: Service
  name: my-service
  namespace: default
  version: v1
  condition:
    type: Applied
    status: "False"
    reason: AppliedManifestFailedReason
    message: "Failed to apply manifest: Service 'my-service' is forbidden: User 'system:serviceaccount:fleet-system:fleet-agent' cannot create resource 'services' in API group '' in the namespace 'default'"
    lastTransitionTime: "2023-12-06T00:10:15Z"
- kind: ConfigMap
  name: app-config
  namespace: production
  version: v1
  condition:
    type: Applied
    status: "False"
    reason: AppliedManifestFailedReason
    message: "Failed to apply manifest: configmaps 'app-config' already exists"
    lastTransitionTime: "2023-12-06T00:10:30Z"
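Each failed placement includes:
- Resource identity: group, version, kind, name, and namespace
- condition: the specific failure condition
- envelope: envelope information (if applicable)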
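When several clusters report failures, it can help to flatten the failedPlacements entries into one line per failed resource. The sketch below is a minimal example of that idea. The function name describe_failures and the output format are hypothetical, and the cluster name in the sample data is illustrative.

```python
# Hypothetical helper for illustration only.

def describe_failures(placement_statuses):
    """Return one readable line per failedPlacements entry, across all clusters."""
    lines = []
    for entry in placement_statuses:
        for failed in entry.get("failedPlacements") or []:
            namespace = failed.get("namespace")
            name = f"{namespace}/{failed['name']}" if namespace else failed["name"]
            envelope = failed.get("envelope")
            suffix = f" (envelope: {envelope['name']})" if envelope else ""
            condition = failed.get("condition", {})
            lines.append(
                f"{entry['clusterName']}: {failed['kind']} {name}{suffix}: "
                f"{condition.get('message', '')}"
            )
    return lines


placement_statuses = [
    {"clusterName": "aks-member-1", "failedPlacements": []},
    {
        "clusterName": "aks-member-2",
        "failedPlacements": [
            {
                "kind": "Deployment",
                "name": "my-app",
                "namespace": "default",
                "condition": {
                    "type": "Applied",
                    "status": "False",
                    "message": "Failed to apply manifest: namespaces 'app' not found",
                },
                "envelope": {"name": "example", "namespace": "app",
                             "type": "ResourceEnvelope"},
            },
            {
                "kind": "ConfigMap",
                "name": "app-config",
                "namespace": "production",
                "condition": {
                    "type": "Applied",
                    "status": "False",
                    "message": "Failed to apply manifest: configmaps 'app-config' already exists",
                },
            },
        ],
    },
]

for line in describe_failures(placement_statuses):
    print(line)
```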
Understand drifted placements
Fleet Manager always reports resources that have drifted from their desired state:
driftedPlacements:
- kind: Namespace
  name: web
  version: v1
  observationTime: "2025-03-19T06:50:25Z"
  firstDriftedObservedTime: "2025-03-19T06:49:54Z"
  targetClusterObservedGeneration: 12
  observedDrifts:
  - path: "/metadata/labels/owner"
    valueInHub: "simon"
    valueInMember: "chen"
  - path: "/metadata/annotations/purpose"
    valueInHub: "production"
    valueInMember: "testing"
- kind: Deployment
  name: web-app
  namespace: web
  group: apps
  version: v1
  observationTime: "2025-03-19T06:50:25Z"
  firstDriftedObservedTime: "2025-03-19T06:49:54Z"
  targetClusterObservedGeneration: 8
  observedDrifts:
  - path: "/spec/replicas"
    valueInHub: "3"
    valueInMember: "5"
  - path: "/spec/template/spec/containers/0/image"
    valueInHub: "nginx:1.20"
    valueInMember: "nginx:1.21"
- kind: ConfigMap
  name: app-config
  namespace: web
  version: v1
  observationTime: "2025-03-19T06:50:25Z"
  firstDriftedObservedTime: "2025-03-19T06:49:54Z"
  targetClusterObservedGeneration: 5
  observedDrifts:
  - path: "/data/environment"
    valueInHub: "production"
    valueInMember: "staging"
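Each drifted placement includes:
- Resource identity: group, version, kind, name, and namespace
- observationTime: when the drift was last observed
- firstDriftedObservedTime: when the drift was first detected
- targetClusterObservedGeneration: the generation of the resource on the member cluster
- observedDrifts: a detailed list of configuration differences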
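The two timestamps and the observedDrifts list can be combined into a short report. The sketch below computes the interval between firstDriftedObservedTime and observationTime and lists each changed path. The function names and the report keys (such as driftedForSeconds) are hypothetical, and the timestamps are assumed to use the UTC format shown in the example.

```python
from datetime import datetime


def parse_time(value):
    """Parse a UTC timestamp such as 2025-03-19T06:50:25Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def drift_report(drifted):
    """Summarize one driftedPlacements entry (hypothetical report layout)."""
    first = parse_time(drifted["firstDriftedObservedTime"])
    last = parse_time(drifted["observationTime"])
    namespace = drifted.get("namespace")
    name = f"{namespace}/{drifted['name']}" if namespace else drifted["name"]
    return {
        "resource": f"{drifted['kind']} {name}",
        # Time between the first detection and the most recent observation.
        "driftedForSeconds": int((last - first).total_seconds()),
        "changes": [
            (d["path"], d.get("valueInHub", ""), d.get("valueInMember", ""))
            for d in drifted.get("observedDrifts", [])
        ],
    }


# The Deployment entry from the example above.
drifted = {
    "kind": "Deployment",
    "name": "web-app",
    "namespace": "web",
    "group": "apps",
    "version": "v1",
    "observationTime": "2025-03-19T06:50:25Z",
    "firstDriftedObservedTime": "2025-03-19T06:49:54Z",
    "targetClusterObservedGeneration": 8,
    "observedDrifts": [
        {"path": "/spec/replicas", "valueInHub": "3", "valueInMember": "5"},
        {"path": "/spec/template/spec/containers/0/image",
         "valueInHub": "nginx:1.20", "valueInMember": "nginx:1.21"},
    ],
}

report = drift_report(drifted)
print(report["resource"], report["driftedForSeconds"])
for path, hub_value, member_value in report["changes"]:
    print(f"{path}: hub={hub_value} member={member_value}")
```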
Understand diffed placements
When you use the ReportDiff apply strategy, Fleet Manager reports configuration differences:
diffedPlacements:
- kind: Service
  name: my-service
  namespace: default
  version: v1
  observationTime: "2025-03-19T06:50:25Z"
  firstDiffedObservedTime: "2025-03-19T06:49:54Z"
  targetClusterObservedGeneration: 8
  observedDiffs:
  - path: "/spec/ports/0/nodePort"
    valueInHub: ""
    valueInMember: "30080"
  - path: "/spec/clusterIP"
    valueInHub: ""
    valueInMember: "10.96.100.200"
- kind: Deployment
  name: web-app
  namespace: default
  group: apps
  version: v1
  observationTime: "2025-03-19T06:50:25Z"
  firstDiffedObservedTime: "2025-03-19T06:49:54Z"
  targetClusterObservedGeneration: 12
  observedDiffs:
  - path: "/status/replicas"
    valueInHub: ""
    valueInMember: "3"
  - path: "/status/readyReplicas"
    valueInHub: ""
    valueInMember: "3"
  - path: "/metadata/generation"
    valueInHub: "1"
    valueInMember: "2"
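Diffed placements have a structure similar to drifted placements but are used in different scenarios:
- Drifted placements: used when a resource was applied and then changed afterward
- Diffed placements: used with the ReportDiff strategy or when takeover conditions aren't met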
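Because failed, drifted, and diffed placements are reported in separate arrays, a per-cluster count of each is a quick way to tell which scenario applies. The following sketch assumes the status has been parsed into dictionaries. The function name placement_counts is hypothetical, and a missing or empty array is counted as zero.

```python
# Hypothetical helper for illustration only.

def placement_counts(placement_statuses):
    """Count failed, drifted, and diffed placements for each cluster."""
    counts = {}
    for entry in placement_statuses:
        counts[entry["clusterName"]] = {
            "failed": len(entry.get("failedPlacements") or []),
            "drifted": len(entry.get("driftedPlacements") or []),
            "diffed": len(entry.get("diffedPlacements") or []),
        }
    return counts


# Illustrative data: one clean cluster and one cluster with a drifted Namespace.
placement_statuses = [
    {
        "clusterName": "aks-member-1",
        "failedPlacements": [],
        "driftedPlacements": [],
        "diffedPlacements": [],
    },
    {
        "clusterName": "aks-member-2",
        "failedPlacements": [],
        "driftedPlacements": [
            {"kind": "Namespace", "name": "web", "version": "v1"},
        ],
        # diffedPlacements is intentionally omitted here.
    },
]

print(placement_counts(placement_statuses))
```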
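Monitor placement progress
To monitor placement progress effectively, check the following key indicators:
- Resources placed: verify that ClusterResourcePlacementWorkSynchronized is True
- Overall health: review the ClusterResourcePlacementApplied condition
- Per-cluster status: review the conditions for each target cluster
- Failed placements: check for any entries in the failedPlacements array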
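The four checks above can be gathered into a single pass over the status. The sketch below is one way to do that under the assumption that the status section has been loaded into a dictionary, for example from kubectl get clusterresourceplacement <placement-name> -o json. The function name monitoring_checklist and the keys of the returned dictionary are hypothetical.

```python
# Hypothetical helper for illustration only.

def monitoring_checklist(status):
    """Evaluate the key indicators for a ClusterResourcePlacement status."""
    top = {c["type"]: c["status"] for c in status.get("conditions", [])}
    clusters_not_ready = []
    clusters_with_failures = []
    for entry in status.get("placementStatuses", []):
        # Per-cluster status: any condition that is not "True" needs a closer look.
        if any(c.get("status") != "True" for c in entry.get("conditions", [])):
            clusters_not_ready.append(entry["clusterName"])
        # Failed placements: a non-empty array means at least one apply failure.
        if entry.get("failedPlacements"):
            clusters_with_failures.append(entry["clusterName"])
    return {
        "workSynchronized": top.get("ClusterResourcePlacementWorkSynchronized") == "True",
        "applied": top.get("ClusterResourcePlacementApplied") == "True",
        "clustersNotReady": clusters_not_ready,
        "clustersWithFailures": clusters_with_failures,
    }


# Illustrative status: Work objects are synchronized, but one cluster failed to apply.
status = {
    "conditions": [
        {"type": "ClusterResourcePlacementWorkSynchronized", "status": "True"},
        {"type": "ClusterResourcePlacementApplied", "status": "False"},
    ],
    "placementStatuses": [
        {
            "clusterName": "aks-member-1",
            "conditions": [{"type": "Applied", "status": "True"}],
            "failedPlacements": [],
        },
        {
            "clusterName": "aks-member-2",
            "conditions": [{"type": "Applied", "status": "False"}],
            "failedPlacements": [{"kind": "Deployment", "name": "web-app"}],
        },
    ],
}

print(monitoring_checklist(status))
```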
Complete status example
Here's a comprehensive example that shows the complete status of a ClusterResourcePlacement:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
  name: web-app-placement
  generation: 5
spec:
  resourceSelectors:
  - group: ""
    kind: Namespace
    name: web-app
    version: v1
  - group: apps
    kind: Deployment
    name: web-server
    namespace: web-app
    version: v1
  - group: ""
    kind: Service
    name: web-service
    namespace: web-app
    version: v1
  policy:
    placementType: PickN
    numberOfClusters: 2
    affinity:
      clusterAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          clusterSelectorTerms:
          - matchLabels:
              region: chinanorth3
status:
  conditions:
  - type: ClusterResourcePlacementScheduled
    status: "True"
    reason: SchedulingPolicyFulfilled
    message: "found all the clusters needed as specified by the scheduling policy"
    lastTransitionTime: "2023-11-10T08:14:52Z"
    observedGeneration: 5
  - type: ClusterResourcePlacementRolloutStarted
    status: "True"
    reason: RolloutStarted
    message: "All 2 cluster(s) start rolling out the latest resource"
    lastTransitionTime: "2023-11-10T08:15:30Z"
    observedGeneration: 5
  - type: ClusterResourcePlacementOverridden
    status: "True"
    reason: NoOverrideSpecified
    message: "No override rules are configured for the selected resources"
    lastTransitionTime: "2023-11-10T08:15:45Z"
    observedGeneration: 5
  - type: ClusterResourcePlacementWorkSynchronized
    status: "True"
    reason: SynchronizeSucceeded
    message: "All 2 cluster(s) are synchronized to the latest resources on the hub cluster"
    lastTransitionTime: "2023-11-10T08:16:00Z"
    observedGeneration: 5
  - type: ClusterResourcePlacementApplied
    status: "True"
    reason: ApplySucceeded
    message: "The selected resources are successfully applied to 2 clusters"
    lastTransitionTime: "2023-11-10T08:16:15Z"
    observedGeneration: 5
  - type: ClusterResourcePlacementAvailable
    status: "True"
    reason: ResourceAvailable
    message: "The selected resources in 2 cluster are available now"
    lastTransitionTime: "2023-11-10T08:16:30Z"
    observedGeneration: 5
  observedResourceIndex: "1"
  selectedResources:
  - group: ""
    kind: Namespace
    name: web-app
    version: v1
  - group: apps
    kind: Deployment
    name: web-server
    namespace: web-app
    version: v1
  - group: ""
    kind: Service
    name: web-service
    namespace: web-app
    version: v1
  placementStatuses:
  - clusterName: aks-chinanorth3-1
    observedResourceIndex: "1"
    conditions:
    - type: ResourceScheduled
      status: "True"
      reason: ScheduleSucceeded
      message: "Successfully scheduled resources for placement in aks-chinanorth3-1 (affinity score: 0, topology spread score: 0): picked by scheduling policy"
      lastTransitionTime: "2023-11-10T08:14:52Z"
      observedGeneration: 5
    - type: RolloutStarted
      status: "True"
      reason: RolloutStarted
      message: "Detected the new changes on the resources and started the rollout process"
      lastTransitionTime: "2023-11-10T08:15:30Z"
      observedGeneration: 5
    - type: Overridden
      status: "True"
      reason: NoOverrideSpecified
      message: "No override rules are configured for the selected resources"
      lastTransitionTime: "2023-11-10T08:15:45Z"
      observedGeneration: 5
    - type: WorkSynchronized
      status: "True"
      reason: AllWorkSynced
      message: "All of the works are synchronized to the latest"
      lastTransitionTime: "2023-11-10T08:16:00Z"
      observedGeneration: 5
    - type: Applied
      status: "True"
      reason: AllWorkHaveBeenApplied
      message: "All corresponding work objects are applied"
      lastTransitionTime: "2023-11-10T08:16:15Z"
      observedGeneration: 5
    - type: Available
      status: "True"
      reason: ResourceAvailable
      message: "All resources are available on the target cluster"
      lastTransitionTime: "2023-11-10T08:16:30Z"
      observedGeneration: 5
    failedPlacements: []
    driftedPlacements: []
    diffedPlacements: []
  - clusterName: aks-chinanorth3-2
    observedResourceIndex: "1"
    conditions:
    - type: ResourceScheduled
      status: "True"
      reason: ScheduleSucceeded
      message: "Successfully scheduled resources for placement in aks-chinanorth3-2 (affinity score: 0, topology spread score: 0): picked by scheduling policy"
      lastTransitionTime: "2023-11-10T08:14:52Z"
      observedGeneration: 5
    - type: RolloutStarted
      status: "True"
      reason: RolloutStarted
      message: "Detected new changes on the resources and started the rollout process"
      lastTransitionTime: "2023-11-10T08:15:30Z"
      observedGeneration: 5
    - type: Overridden
      status: "True"
      reason: NoOverrideSpecified
      message: "No override rules are configured for the selected resources"
      lastTransitionTime: "2023-11-10T08:15:45Z"
      observedGeneration: 5
    - type: WorkSynchronized
      status: "True"
      reason: AllWorkSynced
      message: "All of the works are synchronized to the latest"
      lastTransitionTime: "2023-11-10T08:16:00Z"
      observedGeneration: 5
    - type: Applied
      status: "True"
      reason: AllWorkHaveBeenApplied
      message: "All corresponding work objects are applied"
      lastTransitionTime: "2023-11-10T08:16:15Z"
      observedGeneration: 5
    - type: Available
      status: "True"
      reason: ResourceAvailable
      message: "All resources are available on the target cluster"
      lastTransitionTime: "2023-11-10T08:16:30Z"
      observedGeneration: 5
    failedPlacements: []
    driftedPlacements: []
    diffedPlacements: []
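This example shows:
- Successful scheduling: the placement found two clusters that match the placement policy
- Successful rollout: all resources were deployed to both target clusters
- No overrides: no resource overrides were configured or needed
- Synchronized Work: Work objects were created and synchronized
- Applied resources: all resources were applied successfully
- Available resources: all resources are running and available
- Clean state: no failed, drifted, or diffed placements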
Related content
To learn more about resource propagation, see the following resources: