Configuring and using service affinity in Service Fabric

Affinity is a control provided mainly to ease the transition of larger monolithic applications into the cloud and microservices world. It can also be used as an optimization to improve the performance of services, although doing so can have side effects.

Let's say you're bringing a larger app, or one that just wasn't designed with microservices in mind, to Service Fabric (or any distributed environment). This type of transition is common. You start by lifting the entire app into the environment, packaging it, and making sure it runs smoothly. Then you start breaking it down into smaller services that all talk to each other.

Eventually you may find that the application is experiencing some issues. The issues usually fall into one of these categories:

  1. Some component X in the monolithic app had an undocumented dependency on component Y, and you just turned those components into separate services. Since these services are now running on different nodes in the cluster, they're broken.
  2. These components communicate via (local named pipes | shared memory | files on disk), and for performance reasons they really need to be able to write to a shared local resource right now. That hard dependency gets removed later, maybe.
  3. Everything is fine, but it turns out that these two components are actually chatty or performance sensitive. When you moved them into separate services, overall application performance tanked or latency increased. As a result, the overall application is not meeting expectations.

In these cases, we don't want to lose our refactoring work, and we don't want to go back to the monolith. In the last case, keeping the components together may even be desirable purely as an optimization. However, until we can redesign the components to work naturally as services (or until we can meet the performance expectations some other way), we're going to need some sense of locality.

What to do? Well, you could try turning on affinity.

How to configure affinity

To set up affinity, you define an affinity relationship between two different services. You can think of affinity as "pointing" one service at another and saying "This service can only run where that service is running." Sometimes we refer to affinity as a parent/child relationship (where you point the child at the parent). Affinity ensures that the replicas or instances of one service are placed on the same nodes as those of another service.

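// serviceDescription describes the child service; fabricClient is an existing FabricClient.
// Point the child at its parent through an affinity correlation.
ServiceCorrelationDescription affinityDescription = new ServiceCorrelationDescription();
affinityDescription.Scheme = ServiceCorrelationScheme.Affinity;
affinityDescription.ServiceName = new Uri("fabric:/otherApplication/parentService");
serviceDescription.Correlations.Add(affinityDescription);
// Create the child service with the affinity relationship attached.
await fabricClient.ServiceManager.CreateServiceAsync(serviceDescription);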

Note

A child service can participate in only a single affinity relationship. If you want the child to be affinitized to two parent services at once, you have a couple of options:

  • Reverse the relationships (have parentService1 and parentService2 point at the current child service), or
  • Designate one of the parents as a hub by convention and have all services point at that service.

The resulting placement behavior in the cluster should be the same.
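The hub option can be sketched as follows. This is a minimal illustration, not the article's own code: the helper name PointAtHub and the service URIs in the usage note are hypothetical, and it only builds the correlation objects without contacting a cluster.

```csharp
using System;
using System.Collections.Generic;
using System.Fabric.Description;

static class AffinityHubSketch
{
    // Builds one affinity correlation per service, all pointing at the same hub.
    // The hub itself is skipped because it acts as the parent of the whole group.
    public static Dictionary<Uri, ServiceCorrelationDescription> PointAtHub(
        Uri hub, IEnumerable<Uri> services)
    {
        var correlations = new Dictionary<Uri, ServiceCorrelationDescription>();
        foreach (Uri service in services)
        {
            if (service == hub)
            {
                continue;
            }

            correlations[service] = new ServiceCorrelationDescription
            {
                Scheme = ServiceCorrelationScheme.Affinity,
                ServiceName = hub
            };
        }

        return correlations;
    }
}
```

Each resulting correlation would then be added to the Correlations collection of the matching service description before that service is created, in the same way as in the snippet above. For example, with fabric:/app/parentService1 chosen as the hub, both fabric:/app/parentService2 and fabric:/app/childService would point at it.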

Different affinity options

Affinity is represented via one of several correlation schemes, and has two different modes. The most common mode of affinity is what we call NonAlignedAffinity. In NonAlignedAffinity, the replicas or instances of the different services are placed on the same nodes. The other mode is AlignedAffinity. AlignedAffinity is useful only with stateful services. Configuring two stateful services to have aligned affinity ensures that the primaries of those services are placed on the same nodes as each other. It also causes each pair of secondaries for those services to be placed on the same nodes. It is also possible (though less common) to configure NonAlignedAffinity for stateful services. With NonAlignedAffinity, the different replicas of the two stateful services would run on the same nodes, but their primaries could end up on different nodes.
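As a rough sketch of selecting a mode explicitly, a stateful child could be described with the AlignedAffinity scheme as shown here. The helper name, the replica counts, and the singleton partition scheme are illustrative assumptions and do not come from the article.

```csharp
using System;
using System.Fabric.Description;

static class AlignedAffinitySketch
{
    // Describes a stateful child whose primary should be placed with the parent's primary.
    public static StatefulServiceDescription DescribeAlignedChild(
        Uri applicationName, Uri childName, string serviceTypeName, Uri parentName)
    {
        var description = new StatefulServiceDescription
        {
            ApplicationName = applicationName,
            ServiceName = childName,
            ServiceTypeName = serviceTypeName,
            HasPersistedState = true,
            TargetReplicaSetSize = 3, // assumed value for illustration
            MinReplicaSetSize = 2,    // assumed value for illustration
            PartitionSchemeDescription = new SingletonPartitionSchemeDescription()
        };

        // AlignedAffinity asks for primaries to be placed together;
        // NonAlignedAffinity would only place the replicas on the same nodes.
        description.Correlations.Add(new ServiceCorrelationDescription
        {
            Scheme = ServiceCorrelationScheme.AlignedAffinity,
            ServiceName = parentName
        });

        return description;
    }
}
```

The returned description would be passed to fabricClient.ServiceManager.CreateServiceAsync. Swapping the scheme for ServiceCorrelationScheme.NonAlignedAffinity gives the less strict mode described above.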

Figure: Affinity modes and their effects

Best effort desired state

An affinity relationship is best effort. It does not provide the same guarantees of collocation or reliability that running in the same executable process does. The services in an affinity relationship are fundamentally different entities that can fail and be moved independently. An affinity relationship can also break, though these breaks are temporary. For example, capacity limitations may mean that only some of the service objects in the affinity relationship can fit on a given node. In these cases, even though there's an affinity relationship in place, it can't be enforced because of the other constraints. If it is possible to do so, the violation is automatically corrected later.

Chains vs. stars

Today the Cluster Resource Manager isn't able to model chains of affinity relationships. This means that a service that is a child in one affinity relationship can't be a parent in another affinity relationship. If you want to model this type of relationship, you effectively have to model it as a star rather than a chain. To move from a chain to a star, the bottommost child is parented to the first child's parent instead. Depending on the arrangement of your services, you may have to do this multiple times. If there's no natural parent service, you may have to create one that serves as a placeholder. Depending on your requirements, you may also want to look into Application Groups.
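The chain-to-star rewrite can be sketched as a small planning helper that runs before any services are created. This is an illustrative assumption about how you might compute the new parents, not part of the Service Fabric API; the service names are plain strings chosen for the example.

```csharp
using System;
using System.Collections.Generic;

static class AffinityStarSketch
{
    // Input:  child -> parent, where a parent may itself be a child (a chain).
    // Output: child -> topmost parent, so that no child is also a parent (a star).
    public static Dictionary<string, string> FlattenToStar(
        IReadOnlyDictionary<string, string> parentOf)
    {
        var star = new Dictionary<string, string>();
        foreach (string child in parentOf.Keys)
        {
            string root = parentOf[child];
            var seen = new HashSet<string> { child };

            // Walk up the chain for as long as the current parent is itself a child.
            while (parentOf.TryGetValue(root, out string next))
            {
                if (!seen.Add(root))
                {
                    throw new InvalidOperationException("Affinity cycle detected at " + root);
                }

                root = next;
            }

            star[child] = root;
        }

        return star;
    }
}
```

For a chain in which C points at B and B points at A, the helper maps both B and C to A, which is the star arrangement described above.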

Figure: Chains vs. stars in the context of affinity relationships

Another thing to note about affinity relationships today is that they are directional by default. This means that the affinity rule only enforces that the child is placed with the parent. It does not ensure that the parent is located with the child. Therefore, if there is an affinity violation and, for some reason, it is not feasible to correct it by moving the child to the parent's node, the parent is not moved to the child's node, even if doing so would have corrected the violation. Setting the config MoveParentToFixAffinityViolation to true removes the directionality.

It is also important to note that the affinity relationship can't be perfect or instantly enforced, since different services have different lifecycles and can fail and move independently. For example, let's say the parent suddenly fails over to another node because it crashed. The Cluster Resource Manager and Failover Manager handle the failover first, since keeping the services up, consistent, and available is the priority. Once the failover completes, the affinity relationship is broken, but the Cluster Resource Manager thinks everything is fine until it notices that the child is not located with the parent. These sorts of checks are performed periodically. How the Cluster Resource Manager evaluates constraints, and how to configure the cadence on which these constraints are evaluated, are covered in separate articles.

Partitioning support

The final thing to note about affinity is that affinity relationships aren't supported where the parent is partitioned. Partitioned parent services may be supported eventually, but today this is not allowed.
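One way to guard against this before creating a child is to inspect the parent's partition scheme first. The sketch below assumes that "partitioned" means any scheme other than singleton; that reading, and the helper names, are assumptions rather than something the article states.

```csharp
using System;
using System.Fabric;
using System.Fabric.Description;
using System.Threading.Tasks;

static class AffinityParentCheckSketch
{
    // Assumption: only a singleton-partitioned service qualifies as an affinity parent.
    public static bool IsSupportedParent(ServiceDescription parent) =>
        parent.PartitionSchemeDescription.Scheme == PartitionScheme.Singleton;

    // Fetches the parent's description from the cluster and applies the check above.
    public static async Task<bool> IsSupportedParentAsync(FabricClient client, Uri parentName)
    {
        ServiceDescription parent =
            await client.ServiceManager.GetServiceDescriptionAsync(parentName);
        return IsSupportedParent(parent);
    }
}
```

A caller could skip adding the affinity correlation, or fail early with a clear message, when the check returns false.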

Next steps