Introduction to Auto Scaling

Auto scaling is an additional capability of Service Fabric that dynamically scales your services based on the load that services are reporting, or based on their usage of resources. Auto scaling gives great elasticity and enables provisioning of additional instances or partitions of your service on demand. The entire auto scaling process is automated and transparent: once you set up scaling policies on a service, there is no need for manual scaling operations at the service level. Auto scaling can be turned on either at service creation time, or at any time by updating the service.

A common scenario where auto scaling is useful is when the load on a particular service varies over time. For example, a service such as a gateway can scale based on the amount of resources necessary to handle incoming requests. Let's take a look at an example of what those scaling rules could look like:

  • If all instances of my gateway are using more than two cores on average, then scale the gateway service out by adding one more instance. Do this every hour, but never have more than seven instances in total.
  • If all instances of my gateway are using less than 0.5 cores on average, then scale the service in by removing one instance. Do this every hour, but never have fewer than three instances in total.
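The rules above can be sketched as a simple decision function. This is an illustrative simulation only, not a Service Fabric API; the thresholds and limits mirror the example values (scale out above 2 cores, scale in below 0.5 cores, between 3 and 7 instances):

```python
def decide_scaling(avg_cores_per_instance, current_instances,
                   lower=0.5, upper=2.0, min_instances=3, max_instances=7):
    """Return the new instance count after one trigger evaluation."""
    if avg_cores_per_instance > upper and current_instances < max_instances:
        return current_instances + 1   # scale out by one instance
    if avg_cores_per_instance < lower and current_instances > min_instances:
        return current_instances - 1   # scale in by one instance
    return current_instances           # within thresholds, or at a limit: no action
```

The same evaluation would run once per scaling interval (every hour in this example).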

Auto scaling is supported for both containers and regular Service Fabric services. In order to use auto scaling, you need to be running version 6.2 or above of the Service Fabric runtime.

The rest of this article describes the scaling policies, ways to enable or disable auto scaling, and gives examples of how to use this feature.

Describing auto scaling

Auto scaling policies can be defined for each service in a Service Fabric cluster. Each scaling policy consists of two parts:

  • Scaling trigger describes when scaling of the service will be performed. Conditions that are defined in the trigger are checked periodically to determine if a service should be scaled or not.

  • Scaling mechanism describes how scaling will be performed when it is triggered. The mechanism is applied only when the conditions from the trigger are met.

All triggers that are currently supported work either with logical load metrics, or with physical metrics like CPU or memory usage. Either way, Service Fabric will monitor the reported load for the metric, and will evaluate the trigger periodically to determine if scaling is needed.

There are two mechanisms that are currently supported for auto scaling. The first one is meant for stateless services or containers, where auto scaling is performed by adding or removing instances. For both stateful and stateless services, auto scaling can also be performed by adding or removing named partitions of the service.

Note

Currently there is support for only one scaling policy per service, and only one scaling trigger per scaling policy.

Average partition load trigger with instance-based scaling

The first type of trigger is based on the load of instances in a stateless service partition. Metric loads are first smoothed to obtain the load of every instance of a partition, and then these values are averaged across all instances of the partition. Three factors determine when the service will be scaled:

  • Lower load threshold is a value that determines when the service will be scaled in. If the average load of all instances of the partition is lower than this value, then the service will be scaled in.
  • Upper load threshold is a value that determines when the service will be scaled out. If the average load of all instances of the partition is higher than this value, then the service will be scaled out.
  • Scaling interval determines how often the trigger will be checked. Once the trigger is checked, if scaling is needed the mechanism will be applied. If scaling is not needed, then no action will be taken. In both cases, the trigger will not be checked again before the scaling interval expires again.

This trigger can be used only with stateless services (either stateless containers or Service Fabric services). When a service has multiple partitions, the trigger is evaluated for each partition separately, and the specified mechanism is applied to each partition independently. Hence, it is possible that, based on their load, some of the partitions of the service will be scaled out, some will be scaled in, and some won't be scaled at all at the same time.

The only mechanism that can be used with this trigger is PartitionInstanceCountScaleMechanism. Three factors determine how this mechanism is applied:

  • Scale Increment determines how many instances will be added or removed when the mechanism is triggered.
  • Maximum Instance Count defines the upper limit for scaling. If the number of instances of the partition reaches this limit, the service will not be scaled out, regardless of the load. It is possible to omit this limit by specifying a value of -1; in that case, the service will be scaled out as much as possible (the limit is the number of nodes available in the cluster).
  • Minimum Instance Count defines the lower limit for scaling. If the number of instances of the partition reaches this limit, the service will not be scaled in, regardless of the load.
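Once the trigger has fired, applying the mechanism amounts to adding or removing the scale increment and clamping the result to the configured limits. The following is an illustrative sketch, not the actual Service Fabric implementation; `direction` is +1 for scale out and -1 for scale in:

```python
def apply_mechanism(current, direction, scale_increment, min_count, max_count):
    """Compute the new instance count after one application of the mechanism."""
    target = current + direction * scale_increment
    if direction > 0:
        # A max_count of -1 means "no upper limit": the service scales out
        # as far as the cluster allows (bounded by available nodes).
        if max_count != -1:
            target = min(target, max_count)
    else:
        target = max(target, min_count)
    return target
```

Note that the same clamping logic applies to partition-based scaling, with partition counts in place of instance counts.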

Setting auto scaling policy for instance-based scaling

Using application manifest

<LoadMetrics>
    <LoadMetric Name="MetricB" Weight="High"/>
</LoadMetrics>
<ServiceScalingPolicies>
    <ScalingPolicy>
        <AveragePartitionLoadScalingTrigger MetricName="MetricB" LowerLoadThreshold="1" UpperLoadThreshold="2" ScaleIntervalInSeconds="100"/>
        <InstanceCountScalingMechanism MinInstanceCount="3" MaxInstanceCount="4" ScaleIncrement="1"/>
    </ScalingPolicy>
</ServiceScalingPolicies>

Using C# APIs

FabricClient fabricClient = new FabricClient();
StatelessServiceDescription serviceDescription = new StatelessServiceDescription();
//set up the rest of the ServiceDescription
AveragePartitionLoadScalingTrigger trigger = new AveragePartitionLoadScalingTrigger();
PartitionInstanceCountScaleMechanism mechanism = new PartitionInstanceCountScaleMechanism();
mechanism.MaxInstanceCount = 3;
mechanism.MinInstanceCount = 1;
mechanism.ScaleIncrement = 1;
trigger.MetricName = "servicefabric:/_CpuCores";
trigger.ScaleInterval = TimeSpan.FromMinutes(20);
trigger.LowerLoadThreshold = 1.0;
trigger.UpperLoadThreshold = 2.0;
ScalingPolicyDescription policy = new ScalingPolicyDescription(mechanism, trigger);
serviceDescription.ScalingPolicies.Add(policy);
//as we are using scaling on a resource this must be an exclusive service
//also the resource monitor service needs to be enabled
serviceDescription.ServicePackageActivationMode = ServicePackageActivationMode.ExclusiveProcess;
await fabricClient.ServiceManager.CreateServiceAsync(serviceDescription);

Using PowerShell

$mechanism = New-Object -TypeName System.Fabric.Description.PartitionInstanceCountScaleMechanism
$mechanism.MinInstanceCount = 1
$mechanism.MaxInstanceCount = 6
$mechanism.ScaleIncrement = 2
$trigger = New-Object -TypeName System.Fabric.Description.AveragePartitionLoadScalingTrigger
$trigger.MetricName = "servicefabric:/_CpuCores"
$trigger.LowerLoadThreshold = 0.3
$trigger.UpperLoadThreshold = 0.8
$trigger.ScaleInterval = New-TimeSpan -Minutes 10
$scalingpolicy = New-Object -TypeName System.Fabric.Description.ScalingPolicyDescription
$scalingpolicy.ScalingMechanism = $mechanism
$scalingpolicy.ScalingTrigger = $trigger
$scalingpolicies = New-Object 'System.Collections.Generic.List[System.Fabric.Description.ScalingPolicyDescription]'
$scalingpolicies.Add($scalingpolicy)
#as we are using scaling on a resource this must be an exclusive service
#also the resource monitor service needs to be enabled
Update-ServiceFabricService -Stateless -ServiceName "fabric:/AppName/ServiceName" -ScalingPolicies $scalingpolicies

Average service load trigger with partition-based scaling

The second type of trigger is based on the load of all partitions of one service. Metric loads are first smoothed to obtain the load of every replica or instance of a partition. For stateful services, the load of the partition is considered to be the load of the primary replica, while for stateless services the load of the partition is the average load of all of its instances. These values are averaged across all partitions of the service, and that average is used to trigger auto scaling. As with the previous trigger, three factors determine when the service will be scaled:

  • Lower load threshold is a value that determines when the service will be scaled in. If the average load of all partitions of the service is lower than this value, then the service will be scaled in.
  • Upper load threshold is a value that determines when the service will be scaled out. If the average load of all partitions of the service is higher than this value, then the service will be scaled out.
  • Scaling interval determines how often the trigger will be checked. Once the trigger is checked, if scaling is needed the mechanism will be applied. If scaling is not needed, then no action will be taken. In both cases, the trigger will not be checked again before the scaling interval expires again.

This trigger can be used with both stateful and stateless services. The only mechanism that can be used with this trigger is AddRemoveIncrementalNamedPartitionScalingMechanism. When the service is scaled out, a new partition is added; when the service is scaled in, one of the existing partitions is removed. There are restrictions that are checked when the service is created or updated, and service creation or update will fail if these conditions are not met:

  • The named partition scheme must be used for the service.
  • Partition names must be consecutive integer numbers, like "0", "1", ...
  • The first partition name must be "0".

For example, if a service is initially created with three partitions, the only valid partition names are "0", "1", and "2".

The actual auto scaling operation will respect this naming scheme as well:

  • If the current partitions of the service are named "0", "1", and "2", then the partition added for scaling out will be named "3".
  • If the current partitions of the service are named "0", "1", and "2", then the partition removed for scaling in will be the partition named "2".
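The naming behavior above can be expressed compactly. This is an illustrative sketch of the scheme, not a Service Fabric API: with consecutive integer names starting at "0", scaling out appends the next integer and scaling in removes the highest-numbered partition.

```python
def next_partition_name(names):
    """Name of the partition that a scale-out would add."""
    return str(len(names))            # ["0", "1", "2"] -> "3"

def partition_to_remove(names):
    """Name of the partition that a scale-in would remove."""
    return str(len(names) - 1)        # ["0", "1", "2"] -> "2"
```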

As with the mechanism that scales by adding or removing instances, three parameters determine how this mechanism is applied:

  • Scale Increment determines how many partitions will be added or removed when the mechanism is triggered.
  • Maximum Partition Count defines the upper limit for scaling. If the number of partitions of the service reaches this limit, the service will not be scaled out, regardless of the load. It is possible to omit this limit by specifying a value of -1; in that case, the service will be scaled out as much as possible (the limit is the actual capacity of the cluster).
  • Minimum Partition Count defines the lower limit for scaling. If the number of partitions of the service reaches this limit, the service will not be scaled in, regardless of the load.

Warning

When AddRemoveIncrementalNamedPartitionScalingMechanism is used with stateful services, Service Fabric will add or remove partitions without notification or warning. Repartitioning of data will not be performed when the scaling mechanism is triggered. In the case of a scale-out operation, new partitions will be empty; in the case of a scale-in operation, the partition will be deleted together with all the data that it contains.

Setting auto scaling policy for partition-based scaling

Using application manifest

<NamedPartition>
    <Partition Name="0" />
</NamedPartition>
<ServiceScalingPolicies>
    <ScalingPolicy>
        <AverageServiceLoadScalingTrigger MetricName="servicefabric:/_MemoryInMB" LowerLoadThreshold="300" UpperLoadThreshold="500" ScaleIntervalInSeconds="600"/>
        <AddRemoveIncrementalNamedPartitionScalingMechanism MinPartitionCount="1" MaxPartitionCount="3" ScaleIncrement="1"/>
    </ScalingPolicy>
</ServiceScalingPolicies>

Using C# APIs

FabricClient fabricClient = new FabricClient();
StatefulServiceUpdateDescription serviceUpdate = new StatefulServiceUpdateDescription();
AverageServiceLoadScalingTrigger trigger = new AverageServiceLoadScalingTrigger();
AddRemoveIncrementalNamedPartitionScalingMechanism mechanism = new AddRemoveIncrementalNamedPartitionScalingMechanism();
mechanism.MaxPartitionCount = 4;
mechanism.MinPartitionCount = 1;
mechanism.ScaleIncrement = 1;
//expecting that the service already has metric NumberOfConnections
trigger.MetricName = "NumberOfConnections";
trigger.ScaleInterval = TimeSpan.FromMinutes(15);
trigger.LowerLoadThreshold = 10000;
trigger.UpperLoadThreshold = 20000;
ScalingPolicyDescription policy = new ScalingPolicyDescription(mechanism, trigger);
serviceUpdate.ScalingPolicies = new List<ScalingPolicyDescription>();
serviceUpdate.ScalingPolicies.Add(policy);
await fabricClient.ServiceManager.UpdateServiceAsync(new Uri("fabric:/AppName/ServiceName"), serviceUpdate);

Using PowerShell

$mechanism = New-Object -TypeName System.Fabric.Description.AddRemoveIncrementalNamedPartitionScalingMechanism
$mechanism.MinPartitionCount = 1
$mechanism.MaxPartitionCount = 3
$mechanism.ScaleIncrement = 2
$trigger = New-Object -TypeName System.Fabric.Description.AverageServiceLoadScalingTrigger
$trigger.MetricName = "servicefabric:/_MemoryInMB"
$trigger.LowerLoadThreshold = 5000
$trigger.UpperLoadThreshold = 10000
$trigger.ScaleInterval = New-TimeSpan -Minutes 25
$scalingpolicy = New-Object -TypeName System.Fabric.Description.ScalingPolicyDescription
$scalingpolicy.ScalingMechanism = $mechanism
$scalingpolicy.ScalingTrigger = $trigger
$scalingpolicies = New-Object 'System.Collections.Generic.List[System.Fabric.Description.ScalingPolicyDescription]'
$scalingpolicies.Add($scalingpolicy)
#as we are using scaling on a resource this must be an exclusive service
#also the resource monitor service needs to be enabled
New-ServiceFabricService -ApplicationName $applicationName -ServiceName $serviceName -ServiceTypeName $serviceTypeName -Stateful -TargetReplicaSetSize 3 -MinReplicaSetSize 2 -HasPersistedState true -PartitionNames @("0","1") -ServicePackageActivationMode ExclusiveProcess -ScalingPolicies $scalingpolicies

Auto scaling based on resources

In order to use auto scaling based on actual resources, the resource monitor service needs to be enabled by adding ResourceMonitorService to the addonFeatures section of the cluster configuration:

"fabricSettings": [
...      
],
"addonFeatures": [
    "ResourceMonitorService"
],

There are two metrics that represent actual physical resources. One of them is servicefabric:/_CpuCores, which represents actual CPU usage (so 0.5 represents half a core); the other is servicefabric:/_MemoryInMB, which represents memory usage in megabytes. ResourceMonitorService is responsible for tracking the CPU and memory usage of user services. This service applies a weighted moving average in order to account for potential short-lived spikes. Resource monitoring is supported for both containerized and non-containerized applications on Windows, and for containerized applications on Linux. Auto scaling on resources is only enabled for services activated in the exclusive process model.
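To illustrate why smoothing matters, the following is a minimal sketch of an exponentially weighted moving average of the kind used to dampen short-lived spikes. The actual weighting inside ResourceMonitorService is an implementation detail; `alpha` here is an assumed example value, not the service's real parameter.

```python
def smooth(samples, alpha=0.5):
    """Exponentially weighted moving average over a sequence of usage samples."""
    avg = samples[0]
    for s in samples[1:]:
        # Each new sample contributes alpha; history keeps the rest.
        avg = alpha * s + (1 - alpha) * avg
    return avg
```

A brief spike moves the smoothed value only partway toward the spike, so a single hot sample is much less likely to trip the scaling trigger than sustained load.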

Next steps

Learn more about application scalability.