Cluster resource manager integration with Service Fabric cluster management

The Service Fabric Cluster Resource Manager doesn't drive upgrades in Service Fabric, but it is involved. The first way that the Cluster Resource Manager helps with management is by tracking the desired state of the cluster and the services inside it. The Cluster Resource Manager sends out health reports when it cannot put the cluster into the desired configuration. For example, if there is insufficient capacity, the Cluster Resource Manager sends out health warnings and errors indicating the problem. Another piece of integration has to do with how upgrades work. The Cluster Resource Manager alters its behavior slightly during upgrades.

Health integration

The Cluster Resource Manager constantly tracks the rules you have defined for placing your services. It also tracks the remaining capacity for each metric on the nodes and in the cluster as a whole. If it can't satisfy those rules or if there is insufficient capacity, health warnings and errors are emitted. For example, if a node is over capacity, the Cluster Resource Manager tries to fix the situation by moving services. If it can't correct the situation, it emits a health warning indicating which node is over capacity, and for which metrics.
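
The over-capacity check described above boils down to comparing aggregated metric load against each node's declared capacity. The sketch below is a simplified illustration of that idea, not the actual Service Fabric implementation; the node structure, metric names, and function name are all hypothetical.

```python
# Simplified illustration of per-node capacity checking.
# NOT the Service Fabric implementation; all names here are hypothetical.

def find_over_capacity_nodes(nodes):
    """Return (node_name, metric) pairs where load exceeds declared capacity."""
    violations = []
    for node in nodes:
        for metric, capacity in node["capacities"].items():
            load = node["loads"].get(metric, 0)
            if load > capacity:
                violations.append((node["name"], metric))
    return violations

nodes = [
    {"name": "Node.1", "capacities": {"MemoryMB": 4096}, "loads": {"MemoryMB": 5000}},
    {"name": "Node.2", "capacities": {"MemoryMB": 4096}, "loads": {"MemoryMB": 1024}},
]
# Node.1 exceeds its MemoryMB capacity, so it would trigger a health warning
# naming the node and the metric, as the text describes.
print(find_over_capacity_nodes(nodes))
```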

Another example of the Resource Manager's health warnings is violations of placement constraints. For example, if you have defined a placement constraint (such as "NodeColor == Blue") and the Resource Manager detects a violation of that constraint, it emits a health warning. This is true for custom constraints and the default constraints (like the Fault Domain and Upgrade Domain constraints).
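
Conceptually, a placement constraint like "NodeColor == Blue" is evaluated against the property bag each node advertises. The sketch below illustrates that evaluation only; it is not Service Fabric's constraint parser, and the property names and helper are hypothetical.

```python
# Conceptual sketch of evaluating a placement constraint such as
# "NodeColor == Blue" against node properties. Hypothetical, not SF code.

def satisfies_constraint(node_properties, prop, expected):
    """True if the node's advertised property matches the constraint."""
    return node_properties.get(prop) == expected

blue_node = {"NodeColor": "Blue", "NodeType": "Backend"}
red_node = {"NodeColor": "Red", "NodeType": "Backend"}

# Only nodes whose properties match are placement candidates; a replica
# observed anywhere else is reported as a constraint violation.
print(satisfies_constraint(blue_node, "NodeColor", "Blue"))  # True
print(satisfies_constraint(red_node, "NodeColor", "Blue"))   # False
```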

Here's an example of one such health report. In this case, the health report is for one of the system service's partitions. The health message indicates that the replicas of that partition are temporarily packed into too few Upgrade Domains.

PS C:\Users\User > Get-ServiceFabricPartitionHealth -PartitionId '00000000-0000-0000-0000-000000000001'

PartitionId           : 00000000-0000-0000-0000-000000000001
AggregatedHealthState : Warning
UnhealthyEvaluations  :
                        Unhealthy event: SourceId='System.PLB', Property='ReplicaConstraintViolation_UpgradeDomain', HealthState='Warning', ConsiderWarningAsError=false.

ReplicaHealthStates   :
                        ReplicaId             : 130766528804733380
                        AggregatedHealthState : Ok

                        ReplicaId             : 130766528804577821
                        AggregatedHealthState : Ok

                        ReplicaId             : 130766528854889931
                        AggregatedHealthState : Ok

                        ReplicaId             : 130766528804577822
                        AggregatedHealthState : Ok

                        ReplicaId             : 130837073190680024
                        AggregatedHealthState : Ok

HealthEvents          :
                        SourceId              : System.PLB
                        Property              : ReplicaConstraintViolation_UpgradeDomain
                        HealthState           : Warning
                        SequenceNumber        : 130837100116930204
                        SentAt                : 8/10/2015 7:53:31 PM
                        ReceivedAt            : 8/10/2015 7:53:33 PM
                        TTL                   : 00:01:05
                        Description           : The Load Balancer has detected a Constraint Violation for this Replica: fabric:/System/FailoverManagerService Secondary Partition 00000000-0000-0000-0000-000000000001 is
                        violating the Constraint: UpgradeDomain Details: UpgradeDomain ID -- 4, Replica on NodeName -- Node.8 Currently Upgrading -- false Distribution Policy -- Packing
                        RemoveWhenExpired     : True
                        IsExpired             : False
                        Transitions           : Ok->Warning = 8/10/2015 7:13:02 PM, LastError = 1/1/0001 12:00:00 AM

Here's what this health message is telling us:

  1. All the replicas themselves are healthy: each has AggregatedHealthState : Ok
  2. The Upgrade Domain distribution constraint is currently being violated. This means a particular Upgrade Domain has more replicas from this partition than it should.
  3. Which node contains the replica causing the violation. In this case, it's the node with the name "Node.8".
  4. Whether an upgrade is currently happening for this partition ("Currently Upgrading -- false")
  5. The distribution policy for this service: "Distribution Policy -- Packing". This is governed by the RequireDomainDistribution placement policy. "Packing" indicates that in this case DomainDistribution was not required, so we know that placement policy was not specified for this service.
  6. When the report happened: 8/10/2015 7:13:02 PM

Information like this powers alerts that fire in production to let you know something has gone wrong, and is also used to detect and halt bad upgrades. In this case, we'd want to figure out why the Resource Manager had to pack the replicas into the Upgrade Domain. Packing is usually transient, caused, for example, by nodes in the other Upgrade Domains being down.

Let's say the Cluster Resource Manager is trying to place some services, but there aren't any solutions that work. When services can't be placed, it is usually for one of the following reasons:

  1. Some transient condition has made it impossible to place this service instance or replica correctly
  2. The service's placement requirements are unsatisfiable.

In these cases, health reports from the Cluster Resource Manager help you determine why the service can't be placed. We call this process the constraint elimination sequence. During it, the system walks through the configured constraints affecting the service and records what they eliminate. This way, when services can't be placed, you can see which nodes were eliminated and why.
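
The elimination process can be pictured as each constraint filtering the candidate node set in turn while recording what it removed. This is a conceptual model only, with hypothetical constraint predicates and node names; the real engine's logic is richer.

```python
# Conceptual sketch of a constraint elimination sequence: each constraint
# removes the nodes it rules out, and the record explains a failed placement.
# Hypothetical names and predicates; not the actual Service Fabric engine.

def eliminate(nodes, constraints):
    """constraints: list of (name, predicate). Returns surviving nodes plus
    a record of which constraint eliminated which nodes."""
    remaining = set(nodes)
    record = []
    for name, predicate in constraints:
        eliminated = {n for n in remaining if not predicate(n)}
        if eliminated:
            record.append((name, sorted(eliminated)))
        remaining -= eliminated
    return remaining, record

nodes = ["Node.1", "Node.2", "Node.3"]
constraints = [
    ("PlacementConstraint",    lambda n: n != "Node.1"),  # lacks required property
    ("NodeCapacity",           lambda n: n != "Node.2"),  # node is at capacity
    ("ReplicaExclusionStatic", lambda n: n != "Node.3"),  # partition already there
]
remaining, record = eliminate(nodes, constraints)
# remaining is empty: no valid placement exists, and `record` shows which
# constraint eliminated each node, mirroring the health-report content.
print(remaining, record)
```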

Constraint types

Let's talk about each of the different constraints in these health reports. You will see health messages related to these constraints when replicas can't be placed.

  • ReplicaExclusionStatic and ReplicaExclusionDynamic: These constraints indicate that a solution was rejected because two service objects from the same partition would have to be placed on the same node. This isn't allowed because the failure of that node would then overly impact that partition. ReplicaExclusionStatic and ReplicaExclusionDynamic are almost the same rule, and the differences don't really matter. If you are seeing a constraint elimination sequence containing either the ReplicaExclusionStatic or ReplicaExclusionDynamic constraint, the Cluster Resource Manager thinks that there aren't enough nodes, since the remaining solutions would require these invalid placements, which are disallowed. The other constraints in the sequence usually tell us why nodes are being eliminated in the first place.
  • PlacementConstraint: If you see this message, it means that some nodes were eliminated because they didn't match the service's placement constraints. We trace out the currently configured placement constraints as a part of this message. This is normal if you have a placement constraint defined. However, if the placement constraint is incorrectly causing too many nodes to be eliminated, this is how you would notice.
  • NodeCapacity: This constraint means that the Cluster Resource Manager couldn't place the replicas on the indicated nodes because that would put them over capacity.
  • Affinity: This constraint indicates that we couldn't place the replica on the affected nodes since it would cause a violation of the affinity constraint. More information on affinity is in this article.
  • FaultDomain and UpgradeDomain: This constraint eliminates nodes if placing the replica on the indicated nodes would cause packing in a particular fault or upgrade domain. Several examples discussing this constraint are presented in the topic on fault and upgrade domain constraints and resulting behavior.
  • PreferredLocation: You shouldn't normally see this constraint removing nodes from the solution, since it runs as an optimization by default. The preferred location constraint is also present during upgrades, where it is used to move services back to where they were when the upgrade started.

Blocklisting Nodes

Another health message the Cluster Resource Manager reports is when nodes are blocklisted. You can think of blocklisting as a temporary constraint that is automatically applied for you. Nodes get blocklisted when they experience repeated failures launching instances of a service type. Nodes are blocklisted on a per-service-type basis. A node may be blocklisted for one service type but not another.

You'll often see blocklisting kick in during development: some bug causes your service host to crash on startup. Service Fabric tries to create the service host a few times, and the failure keeps occurring. After a few attempts, the node gets blocklisted, and the Cluster Resource Manager tries to create the service elsewhere. If that failure keeps happening on multiple nodes, it's possible that all of the valid nodes in the cluster end up blocked. Blocklisting can also remove so many nodes that not enough can successfully launch the service to meet the desired scale. You'll typically see additional errors or warnings from the Cluster Resource Manager indicating that the service is below the desired replica or instance count, as well as health messages indicating what the failure is that's leading to the blocklisting in the first place.

Blocklisting is not a permanent condition. After a few minutes, the node is removed from the blocklist and Service Fabric may activate the services on that node again. If services continue to fail, the node is blocklisted for that service type again.
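
The blocklisting behavior described above (repeated launch failures, a per-service-type scope, and automatic expiry) can be sketched as a small state machine. The failure threshold and block duration below are invented for illustration; Service Fabric's actual thresholds and timing are internal and not specified here.

```python
# Conceptual sketch of per-service-type blocklisting: repeated launch
# failures temporarily block a (node, service type) pair. The threshold
# and duration are hypothetical, not Service Fabric's actual policy.

class Blocklist:
    def __init__(self, failure_threshold=3, block_seconds=300):
        self.failure_threshold = failure_threshold
        self.block_seconds = block_seconds
        self.failures = {}       # (node, service_type) -> consecutive failures
        self.blocked_until = {}  # (node, service_type) -> expiry timestamp

    def record_failure(self, node, service_type, now):
        key = (node, service_type)
        self.failures[key] = self.failures.get(key, 0) + 1
        if self.failures[key] >= self.failure_threshold:
            self.blocked_until[key] = now + self.block_seconds
            self.failures[key] = 0

    def is_blocked(self, node, service_type, now):
        return self.blocked_until.get((node, service_type), 0) > now

bl = Blocklist()
for _ in range(3):                        # three crashes at startup
    bl.record_failure("Node.8", "MyServiceType", now=0)

print(bl.is_blocked("Node.8", "MyServiceType", now=60))     # True: blocked
print(bl.is_blocked("Node.8", "MyServiceType", now=600))    # False: expired
print(bl.is_blocked("Node.8", "OtherServiceType", now=60))  # False: per-type
```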

Constraint priorities

Warning

Changing constraint priorities is not recommended and may have significant adverse effects on your cluster. The following information is provided as a reference for the default constraint priorities and their behavior.

With all of these constraints, you may have been thinking "Hey, I think that fault domain constraints are the most important thing in my system. In order to ensure the fault domain constraint isn't violated, I'm willing to violate other constraints."

Constraints can be configured with different priority levels. These are:

  • "hard" (0)
  • "soft" (1)
  • "optimization" (2)
  • "off" (-1)

Most of the constraints are configured as hard constraints by default.

Changing the priority of constraints is uncommon. There have been times where constraint priorities needed to change, usually to work around some other bug or behavior that was impacting the environment. Generally, the flexibility of the constraint priority infrastructure has worked very well, but it isn't needed often. Most of the time, everything sits at its default priority.

The priority levels don't mean that a given constraint will be violated, nor that it will always be met. Constraint priorities define the order in which constraints are enforced, and therefore the tradeoffs made when it is impossible to satisfy all of them. Usually all the constraints can be satisfied unless there's something else going on in the environment. Some examples of scenarios that lead to constraint violations are conflicting constraints, or large numbers of concurrent failures.

In advanced situations, you can change the constraint priorities. For example, say you wanted to ensure that affinity would always be violated when necessary to solve node capacity issues. To achieve this, you could set the priority of the affinity constraint to "soft" (1) and leave the capacity constraint set to "hard" (0).
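
The effect of that priority split can be modeled as a two-pass search: first look for a node satisfying everything, then fall back to requiring only hard constraints while reporting which soft ones had to give way. This is a toy model of the tradeoff, not the real placement engine; all names are hypothetical.

```python
# Toy model of priority-ordered enforcement: hard (0) constraints must hold,
# soft (1) constraints may be violated when nothing satisfies everything.
# Hypothetical sketch; not Service Fabric's actual solver.

def choose_node(nodes, constraints):
    """constraints: list of (name, priority, predicate).
    Returns (chosen_node, violated_soft_constraints)."""
    def violated(node, max_priority):
        return [name for name, prio, pred in constraints
                if prio <= max_priority and not pred(node)]

    for node in nodes:                     # pass 1: satisfy hard AND soft
        if not violated(node, max_priority=1):
            return node, []
    for node in nodes:                     # pass 2: require only hard
        if not violated(node, max_priority=0):
            soft = [name for name, prio, pred in constraints
                    if prio == 1 and not pred(node)]
            return node, soft
    return None, []

constraints = [
    ("NodeCapacity", 0, lambda n: n != "FullNode"),  # hard: FullNode is full
    ("Affinity",     1, lambda n: n == "FullNode"),  # soft: wants FullNode
]
node, violated_soft = choose_node(["FullNode", "EmptyNode"], constraints)
# Capacity (hard) wins over affinity (soft), matching the scenario above.
print(node, violated_soft)  # EmptyNode ['Affinity']
```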

The default priority values for the different constraints are specified in the following config:

ClusterManifest.xml

<Section Name="PlacementAndLoadBalancing">
    <Parameter Name="PlacementConstraintPriority" Value="0" />
    <Parameter Name="CapacityConstraintPriority" Value="0" />
    <Parameter Name="AffinityConstraintPriority" Value="0" />
    <Parameter Name="FaultDomainConstraintPriority" Value="0" />
    <Parameter Name="UpgradeDomainConstraintPriority" Value="1" />
    <Parameter Name="PreferredLocationConstraintPriority" Value="2" />
</Section>

via ClusterConfig.json for Standalone deployments or Template.json for Azure hosted clusters:

"fabricSettings": [
  {
    "name": "PlacementAndLoadBalancing",
    "parameters": [
      {
          "name": "PlacementConstraintPriority",
          "value": "0"
      },
      {
          "name": "CapacityConstraintPriority",
          "value": "0"
      },
      {
          "name": "AffinityConstraintPriority",
          "value": "0"
      },
      {
          "name": "FaultDomainConstraintPriority",
          "value": "0"
      },
      {
          "name": "UpgradeDomainConstraintPriority",
          "value": "1"
      },
      {
          "name": "PreferredLocationConstraintPriority",
          "value": "2"
      }
    ]
  }
]

Fault domain and upgrade domain constraints

The Cluster Resource Manager wants to keep services spread out among fault and upgrade domains. It models this as a constraint inside the Cluster Resource Manager's engine. For more information on how they are used and their specific behavior, check out the article on cluster configuration.

The Cluster Resource Manager may need to pack a couple of replicas into an upgrade domain in order to deal with upgrades, failures, or other constraint violations. Packing into fault or upgrade domains normally happens only when there are several failures or other churn in the system preventing correct placement. If you wish to prevent packing even during these situations, you can utilize the RequireDomainDistribution placement policy. Note that this may affect service availability and reliability as a side effect, so consider it carefully.

If the environment is configured correctly, all constraints are fully respected, even during upgrades. The key thing is that the Cluster Resource Manager is watching out for your constraints. When it detects a violation, it immediately reports it and tries to correct the issue.

The preferred location constraint

The PreferredLocation constraint is a little different, as it has two different uses. One use of this constraint is during application upgrades. The Cluster Resource Manager automatically manages this constraint during upgrades, using it to ensure that when upgrades are complete, replicas return to their initial locations. The other use of the PreferredLocation constraint is for the PreferredPrimaryDomain placement policy. Both of these are optimizations, and hence the PreferredLocation constraint is the only constraint set to "Optimization" by default.

Upgrades

The Cluster Resource Manager also helps during application and cluster upgrades, during which it has two jobs:

  • ensure that the rules of the cluster are not compromised
  • try to help the upgrade go smoothly

Keep enforcing the rules

The main thing to be aware of is that the rules, meaning the strict constraints like placement constraints and capacities, are still enforced during upgrades. Placement constraints ensure that your workloads only run where they are allowed to, even during upgrades. When services are highly constrained, upgrades can take longer: when the service or the node it is running on is brought down for an update, there may be few options for where it can go.

Smart replacements

When an upgrade starts, the Resource Manager takes a snapshot of the current arrangement of the cluster. As each Upgrade Domain completes, it attempts to return the services that were in that Upgrade Domain to their original arrangement. This way, there are at most two transitions for a service during the upgrade: one move out of the affected node and one move back in. Returning the cluster or service to how it was before the upgrade also ensures the upgrade doesn't impact the layout of the cluster.
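
The snapshot-and-restore behavior above can be sketched in a few lines: capture placements at upgrade start, evacuate a node for its upgrade domain, then move the evacuated services back once that domain completes. This is purely illustrative; the service and node names are hypothetical.

```python
# Conceptual sketch of "smart replacements": snapshot placements at upgrade
# start, then restore services after their upgrade domain completes, so each
# service makes at most two moves. Hypothetical names; not SF internals.

snapshot = {"ServiceA": "Node.1", "ServiceB": "Node.2"}  # taken at upgrade start

placements = dict(snapshot)
placements["ServiceA"] = "Node.3"  # move 1: Node.1 is emptied for its UD upgrade

def restore_after_upgrade_domain(placements, snapshot, services):
    for svc in services:
        placements[svc] = snapshot[svc]  # move 2: return to the original node

restore_after_upgrade_domain(placements, snapshot, ["ServiceA"])
# The cluster layout matches the pre-upgrade snapshot again.
print(placements == snapshot)  # True
```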

Reduced churn

Another thing that happens during upgrades is that the Cluster Resource Manager turns off balancing. Preventing balancing prevents unnecessary reactions to the upgrade itself, like moving services onto nodes that were emptied for the upgrade. If the upgrade in question is a cluster upgrade, then the entire cluster is not balanced during the upgrade. Constraint checks stay active; only movement based on the proactive balancing of metrics is disabled.

Buffered Capacity & Upgrade

Generally, you want the upgrade to complete even if the cluster is constrained or close to full. Managing the capacity of the cluster is even more important during upgrades than usual. Depending on the number of upgrade domains, between 5 and 20 percent of capacity must be migrated as the upgrade rolls through the cluster. That work has to go somewhere. This is where the notion of buffered capacities is useful. Buffered capacity is respected during normal operation. The Cluster Resource Manager may fill nodes up to their total capacity (consuming the buffer) during upgrades if necessary.
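
The arithmetic behind buffered capacity is simple: a buffer fraction reserves headroom during normal operation that upgrades are allowed to consume. The numbers below are invented for illustration; consult the Service Fabric capacity documentation for the actual buffer configuration settings.

```python
# Simple arithmetic sketch of buffered capacity. A buffer fraction caps
# normal placement below total capacity; during an upgrade the reserved
# headroom may be consumed. Hypothetical numbers for illustration only.

total_capacity = 100      # declared node capacity for some metric
buffer_fraction = 0.10    # assumed 10% buffer

normal_limit = total_capacity * (1 - buffer_fraction)
print(normal_limit)       # 90.0: placement stops here during normal operation

upgrade_limit = total_capacity
print(upgrade_limit)      # 100: upgrades may fill the node, using the buffer
```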

Next steps