Scale a Service Fabric cluster programmatically

Service Fabric clusters running in Azure are built on top of virtual machine scale sets. Cluster scaling describes how Service Fabric clusters can be scaled either manually or with auto-scale rules. This article describes how to manage credentials and scale a cluster in or out using the fluent Azure compute SDK, which is a more advanced scenario. For an overview, read programmatic methods of coordinating Azure scaling operations.


This article has been updated to use the new Azure PowerShell Az module. You can still use the AzureRM module, which will continue to receive bug fixes until at least December 2020. To learn more about the new Az module and AzureRM compatibility, see Introducing the new Azure PowerShell Az module. For Az module installation instructions, see Install Azure PowerShell.

Manage credentials

One challenge of writing a service to handle scaling is that the service must be able to access virtual machine scale set resources without an interactive login. Accessing the Service Fabric cluster is easy if the scaling service is modifying its own Service Fabric application, but credentials are needed to access the scale set. To sign in, you can use a service principal created with the Azure CLI.

A service principal can be created with the following steps:

  1. Sign in to the Azure CLI (az login) as a user with access to the virtual machine scale set.
  2. Create the service principal with az ad sp create-for-rbac.
    1. Make note of the appId (called 'client ID' elsewhere), name, password, and tenant for later use.
    2. You will also need your subscription ID, which can be viewed with az account list.

The fluent compute library can sign in using these credentials as follows (note that core fluent Azure types like IAzure are in the Microsoft.Azure.Management.Fluent package):

// Build service principal credentials from the values noted earlier
var credentials = new AzureCredentials(
    new ServicePrincipalLoginInformation { ClientId = AzureClientId, ClientSecret = AzureClientKey },
    AzureTenantId, AzureEnvironment.AzureGlobalCloud);
IAzure AzureClient = Azure.Authenticate(credentials).WithSubscription(AzureSubscriptionId);

// Confirm the client is bound to the expected subscription
if (AzureClient?.SubscriptionId == AzureSubscriptionId)
{
    ServiceEventSource.Current.ServiceMessage(Context, "Successfully logged into Azure");
}
else
{
    ServiceEventSource.Current.ServiceMessage(Context, "ERROR: Failed to login to Azure");
}

Once logged in, the scale set instance count can be queried via AzureClient.VirtualMachineScaleSets.GetById(ScaleSetId).Capacity.

Scaling out

Using the fluent Azure compute SDK, instances can be added to the virtual machine scale set with just a few calls:

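var scaleSet = AzureClient.VirtualMachineScaleSets.GetById(ScaleSetId);
var newCapacity = (int)Math.Min(MaximumNodeCount, scaleSet.Capacity + 1); // Check max count
scaleSet.Update().WithCapacity(newCapacity).Apply(); // Apply the new capacity to the scale set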
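
The capacity arithmetic used for scaling can be kept in one place so that scale-out and scale-in share the same bounds check. The following is a minimal sketch, not part of the SDK: the ScaleSetCapacity class and its Next method are hypothetical names introduced here for illustration.

```csharp
using System;

// Hypothetical helper (not part of the Azure SDK): computes the next scale set capacity.
public static class ScaleSetCapacity
{
    // Returns current + delta, clamped to the inclusive range [minimum, maximum].
    public static int Next(int current, int delta, int minimum, int maximum)
    {
        if (minimum > maximum)
        {
            throw new ArgumentException("minimum must not exceed maximum");
        }

        return Math.Max(minimum, Math.Min(maximum, current + delta));
    }
}
```

With a helper like this, a delta of +1 covers scaling out and a delta of -1 covers scaling in, and the result can be passed as the new scale set capacity.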

Alternatively, virtual machine scale set size can be managed with PowerShell cmdlets. Get-AzVmss retrieves the virtual machine scale set object. The current capacity is available through the .sku.capacity property. After changing the capacity to the desired value, the virtual machine scale set in Azure can be updated with the Update-AzVmss command.

As when adding a node manually, adding a scale set instance should be all that's needed to start a new Service Fabric node, since the scale set template includes extensions that automatically join new instances to the Service Fabric cluster.

Scaling in

Scaling in is similar to scaling out. The actual virtual machine scale set changes are practically the same. However, Service Fabric only automatically cleans up removed nodes with a durability of Gold or Silver. So, in the Bronze-durability scale-in case, it's necessary to interact with the Service Fabric cluster to shut down the node to be removed and then to remove its state.

Preparing the node for shutdown involves finding the node to be removed (the most recently added virtual machine scale set instance) and deactivating it. Virtual machine scale set instances are numbered in the order they are added, so newer nodes can be found by comparing the number suffix in the nodes' names (which match the underlying virtual machine scale set instance names).

using (var client = new FabricClient())
{
    // Pick the live node of the target node type with the highest numeric name suffix
    var mostRecentLiveNode = (await client.QueryManager.GetNodeListAsync())
        .Where(n => n.NodeType.Equals(NodeTypeToScale, StringComparison.OrdinalIgnoreCase))
        .Where(n => n.NodeStatus == System.Fabric.Query.NodeStatus.Up)
        .OrderByDescending(n =>
        {
            var instanceIdIndex = n.NodeName.LastIndexOf("_");
            var instanceIdString = n.NodeName.Substring(instanceIdIndex + 1);
            return int.Parse(instanceIdString);
        })
        .FirstOrDefault();
    // The using block stays open: the following snippets reuse client
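
The suffix comparison above has to be numeric rather than alphabetical, otherwise a name ending in _9 would sort after one ending in _10. The following is a minimal, SDK-free sketch of that selection which also skips names without a numeric suffix instead of throwing. The NodeNames class and the sample node names are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper: selects the newest node by the number after the last underscore.
public static class NodeNames
{
    // Returns the numeric suffix of a node name, or null when there is none.
    public static int? InstanceIndex(string nodeName)
    {
        if (string.IsNullOrEmpty(nodeName))
        {
            return null;
        }

        // LastIndexOf returns -1 when no underscore exists, so the whole name is tried.
        var suffix = nodeName.Substring(nodeName.LastIndexOf('_') + 1);
        return int.TryParse(suffix, NumberStyles.None, CultureInfo.InvariantCulture, out var index)
            ? index
            : (int?)null;
    }

    // Returns the name with the highest numeric suffix, or null if no name qualifies.
    public static string Newest(IEnumerable<string> nodeNames)
    {
        return nodeNames
            .Select(name => new { Name = name, Index = InstanceIndex(name) })
            .Where(x => x.Index.HasValue)
            .OrderByDescending(x => x.Index.Value)
            .Select(x => x.Name)
            .FirstOrDefault();
    }
}
```

For example, given the assumed names _nt1vm_2, _nt1vm_10, and _nt1vm_9, Newest selects _nt1vm_10.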

Once the node to be removed is found, it can be deactivated and removed using the same FabricClient instance and the IAzure instance from earlier.

var scaleSet = AzureClient.VirtualMachineScaleSets.GetById(ScaleSetId);

// Remove the node from the Service Fabric cluster
ServiceEventSource.Current.ServiceMessage(Context, $"Disabling node {mostRecentLiveNode.NodeName}");
await client.ClusterManager.DeactivateNodeAsync(mostRecentLiveNode.NodeName, NodeDeactivationIntent.RemoveNode);

// Wait (up to a timeout) for the node to gracefully shutdown
var timeout = TimeSpan.FromMinutes(5);
var waitStart = DateTime.Now;
while ((mostRecentLiveNode.NodeStatus == System.Fabric.Query.NodeStatus.Up || mostRecentLiveNode.NodeStatus == System.Fabric.Query.NodeStatus.Disabling) &&
        DateTime.Now - waitStart < timeout)
{
    // Refresh the node's status, then wait 10 seconds before checking again
    mostRecentLiveNode = (await client.QueryManager.GetNodeListAsync()).FirstOrDefault(n => n.NodeName == mostRecentLiveNode.NodeName);
    await Task.Delay(10 * 1000);
}

// Decrement VMSS capacity
var newCapacity = (int)Math.Max(MinimumNodeCount, scaleSet.Capacity - 1); // Check min count
scaleSet.Update().WithCapacity(newCapacity).Apply(); // Apply the new capacity to the scale set
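
The wait-with-timeout pattern used above can be factored into a reusable helper. The following is a sketch under stated assumptions rather than the article's implementation: the Polling class and UntilAsync method are hypothetical names, and it measures elapsed time with Stopwatch, which is unaffected by system clock adjustments, where the snippet above uses DateTime.Now.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper: polls an asynchronous condition until it holds or a timeout elapses.
public static class Polling
{
    // Evaluates condition at least once, then again after each interval.
    // Returns true as soon as the condition holds, or false once the timeout has elapsed.
    // The total wait can exceed the timeout by up to one interval plus one condition call.
    public static async Task<bool> UntilAsync(
        Func<Task<bool>> condition, TimeSpan timeout, TimeSpan interval)
    {
        var elapsed = Stopwatch.StartNew();
        while (true)
        {
            if (await condition())
            {
                return true;
            }

            if (elapsed.Elapsed >= timeout)
            {
                return false;
            }

            await Task.Delay(interval);
        }
    }
}
```

In the scale-in case, the condition would re-query the node list and report true once the node is no longer Up or Disabling. A false result signals that the node did not shut down gracefully within the timeout, which the caller can log before deciding whether to continue.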


As with scaling out, PowerShell cmdlets for modifying virtual machine scale set capacity can also be used here if a scripting approach is preferable. Once the virtual machine instance is removed, the Service Fabric node state can be removed.

await client.ClusterManager.RemoveNodeStateAsync(mostRecentLiveNode.NodeName);

Next steps

To get started implementing your own auto-scaling logic, familiarize yourself with the following concepts and useful APIs: