Deploy an Azure Service Fabric cluster with stateless-only node types (Preview)

Service Fabric node types come with the inherent assumption that, at some point, stateful services might be placed on the nodes. Stateless node types relax this assumption for a node type, which allows the node type to use other features such as faster scale-out operations, support for Automatic OS Upgrades on Bronze durability, and scaling out to more than 100 nodes in a single virtual machine scale set.

  • Primary node types cannot be configured to be stateless.
  • Stateless node types are only supported with the Bronze durability level.
  • Stateless node types are only supported on Service Fabric runtime version 7.1.409 or above.
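
The restrictions above can be checked before deployment. The following is a minimal sketch, not an official tool: it assumes the node types have already been loaded from the template as Python dictionaries with resolved values, and the helper names `parse_version` and `check_node_types` are hypothetical.

```python
# Minimum Service Fabric runtime that supports stateless node types (from the list above).
MIN_RUNTIME = (7, 1, 409)


def parse_version(version):
    """Turn a dotted version string such as '7.1.409.9590' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def check_node_types(node_types, runtime_version):
    """Return a list of problems found; an empty list means no restriction is violated."""
    problems = []
    stateless = [nt for nt in node_types if nt.get("isStateless", False)]

    # The runtime restriction only matters when a stateless node type is present.
    if stateless and parse_version(runtime_version)[:3] < MIN_RUNTIME:
        problems.append("runtime must be 7.1.409 or above for stateless node types")

    for nt in stateless:
        name = nt.get("name", "<unnamed>")
        if nt.get("isPrimary", False):
            problems.append(f"{name}: primary node types cannot be stateless")
        if nt.get("durabilityLevel") != "Bronze":
            problems.append(f"{name}: stateless node types require Bronze durability")
    return problems
```

An empty result only means these three restrictions hold; it does not validate the rest of the template.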

Sample templates are available: Service Fabric Stateless Node types template

Enabling stateless node types in a Service Fabric cluster

To set one or more node types as stateless in a cluster resource, set the isStateless property to true. When deploying a Service Fabric cluster with stateless node types, remember to include at least one primary node type in the cluster resource.

  • The Service Fabric cluster resource apiVersion should be "2020-12-01-preview" or higher.
{
    "nodeTypes": [
    {
        "name": "[parameters('vmNodeType0Name')]",
        "applicationPorts": {
            "endPort": "[parameters('nt0applicationEndPort')]",
            "startPort": "[parameters('nt0applicationStartPort')]"
        },
        "clientConnectionEndpointPort": "[parameters('nt0fabricTcpGatewayPort')]",
        "durabilityLevel": "Silver",
        "ephemeralPorts": {
            "endPort": "[parameters('nt0ephemeralEndPort')]",
            "startPort": "[parameters('nt0ephemeralStartPort')]"
        },
        "httpGatewayEndpointPort": "[parameters('nt0fabricHttpGatewayPort')]",
        "isPrimary": true,
        "isStateless": false,
        "vmInstanceCount": "[parameters('nt0InstanceCount')]"
    },
    {
        "name": "[parameters('vmNodeType1Name')]",
        "applicationPorts": {
            "endPort": "[parameters('nt1applicationEndPort')]",
            "startPort": "[parameters('nt1applicationStartPort')]"
        },
        "clientConnectionEndpointPort": "[parameters('nt1fabricTcpGatewayPort')]",
        "durabilityLevel": "Bronze",
        "ephemeralPorts": {
            "endPort": "[parameters('nt1ephemeralEndPort')]",
            "startPort": "[parameters('nt1ephemeralStartPort')]"
        },
        "httpGatewayEndpointPort": "[parameters('nt1fabricHttpGatewayPort')]",
        "isPrimary": false,
        "isStateless": true,
        "vmInstanceCount": "[parameters('nt1InstanceCount')]"
    }
    ]
}

Configuring the virtual machine scale set for stateless node types

To enable stateless node types, configure the underlying virtual machine scale set resource in the following way:

  • Set the singlePlacementGroup property to false if you need to scale to more than 100 VMs.
  • Set the mode of the scale set's upgradePolicy to Rolling.
  • Rolling upgrade mode requires the Application Health extension or health probes to be configured. Configure the health probe with the default configuration for stateless node types, as suggested below. Once applications are deployed to the node type, the health probe or health extension ports can be changed to monitor application health.
{
    "apiVersion": "2018-10-01",
    "type": "Microsoft.Compute/virtualMachineScaleSets",
    "name": "[parameters('vmNodeType1Name')]",
    "location": "[parameters('computeLocation')]",
    "properties": {
        "overprovision": "[variables('overProvision')]",
        "upgradePolicy": {
            "mode": "Rolling",
            "automaticOSUpgradePolicy": {
                "enableAutomaticOSUpgrade": true
            }
        },
        "virtualMachineProfile": {
            "extensionProfile": {
                "extensions": [
                    {
                        "name": "[concat(parameters('vmNodeType1Name'),'_ServiceFabricNode')]",
                        "properties": {
                            "type": "ServiceFabricNode",
                            "autoUpgradeMinorVersion": false,
                            "publisher": "Microsoft.Azure.ServiceFabric",
                            "settings": {
                                "clusterEndpoint": "[reference(parameters('clusterName')).clusterEndpoint]",
                                "nodeTypeRef": "[parameters('vmNodeType1Name')]",
                                "dataPath": "D:\\\\SvcFab",
                                "durabilityLevel": "Bronze",
                                "certificate": {
                                    "thumbprint": "[parameters('certificateThumbprint')]",
                                    "x509StoreName": "[parameters('certificateStoreValue')]"
                                },
                                "systemLogUploadSettings": {
                                    "Enabled": true
                                }
                            },
                            "typeHandlerVersion": "1.0"
                        }
                    },
                    {
                        "type": "extensions",
                        "name": "HealthExtension",
                        "properties": {
                            "publisher": "Microsoft.ManagedServices",
                            "type": "ApplicationHealthWindows",
                            "autoUpgradeMinorVersion": true,
                            "typeHandlerVersion": "1.0",
                            "settings": {
                                "protocol": "tcp",
                                "port": "19000"
                            }
                        }
                    }
                ]
            }
        }
    }
}
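
The three scale set requirements listed above can be verified with a small script. This is a sketch under stated assumptions: the scale set resource is a Python dictionary with `virtualMachineProfile` nested under `properties`, a missing `singlePlacementGroup` is treated as true, and a health probe, if used, is referenced under `networkProfile.healthProbe`. The function name `check_scale_set` is hypothetical.

```python
def check_scale_set(vmss, planned_instance_count):
    """Return a list of problems with a scale set backing a stateless node type."""
    props = vmss.get("properties", {})
    problems = []

    # Assumption: an absent singlePlacementGroup behaves as true.
    if planned_instance_count > 100 and props.get("singlePlacementGroup", True) is not False:
        problems.append("singlePlacementGroup must be false to scale beyond 100 VMs")

    if props.get("upgradePolicy", {}).get("mode") != "Rolling":
        problems.append("upgradePolicy.mode must be Rolling")

    # Rolling mode needs either an application health extension or a health probe.
    profile = props.get("virtualMachineProfile", {})
    extensions = profile.get("extensionProfile", {}).get("extensions", [])
    has_health_extension = any(
        ext.get("properties", {}).get("type", "").startswith("ApplicationHealth")
        for ext in extensions
    )
    has_health_probe = "healthProbe" in profile.get("networkProfile", {})
    if not (has_health_extension or has_health_probe):
        problems.append("Rolling mode needs a health extension or a health probe")
    return problems
```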

Networking requirements

Public IP and Load Balancer resource

To enable scaling to more than 100 VMs on a virtual machine scale set resource, the load balancer and IP resource referenced by that virtual machine scale set must both use a Standard SKU. Creating a load balancer or IP resource without the SKU property creates a Basic SKU resource, which does not support scaling to more than 100 VMs. A Standard SKU load balancer blocks all traffic from the outside by default; to allow outside traffic, an NSG must be deployed to the subnet.

{
    "apiVersion": "2018-11-01",
    "type": "Microsoft.Network/publicIPAddresses",
    "name": "[concat('LB','-', parameters('clusterName'))]",
    "location": "[parameters('computeLocation')]",
    "sku": {
        "name": "Standard"
    }
}
{
    "apiVersion": "2018-11-01",
    "type": "Microsoft.Network/loadBalancers",
    "name": "[concat('LB','-', parameters('clusterName'))]",
    "location": "[parameters('computeLocation')]",
    "dependsOn": [
        "[concat('Microsoft.Network/networkSecurityGroups/', concat('nsg', parameters('subnet0Name')))]"
    ],
    "properties": {
        "addressSpace": {
            "addressPrefixes": [
                "[parameters('addressPrefix')]"
            ]
        },
        "subnets": [
        {
            "name": "[parameters('subnet0Name')]",
            "properties": {
                "addressPrefix": "[parameters('subnet0Prefix')]",
                "networkSecurityGroup": {
                "id": "[resourceId('Microsoft.Network/networkSecurityGroups', concat('nsg', parameters('subnet0Name')))]"
              }
            }
          }
        ]
    },
    "sku": {
        "name": "Standard"
    }
}
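
Because a missing SKU property silently yields a Basic SKU, it can be useful to scan a template for public IP and load balancer resources that are not explicitly Standard. The sketch below assumes the template's resources are available as a list of Python dictionaries; the names `effective_sku` and `non_standard_resources` are hypothetical.

```python
# Resource types that must use the Standard SKU to scale past 100 VMs.
WATCHED_TYPES = {
    "Microsoft.Network/publicIPAddresses",
    "Microsoft.Network/loadBalancers",
}


def effective_sku(resource):
    """Return the SKU name, falling back to Basic when no sku property is set."""
    return resource.get("sku", {}).get("name", "Basic")


def non_standard_resources(resources):
    """List the names of public IP and load balancer resources that are not Standard."""
    return [
        res.get("name")
        for res in resources
        if res.get("type") in WATCHED_TYPES and effective_sku(res) != "Standard"
    ]
```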

Note

It is not possible to do an in-place change of SKU on the public IP and load balancer resources. If you are migrating from existing resources that have a Basic SKU, see the migration section of this article.

Virtual machine scale set NAT rules

The load balancer inbound NAT rules should match the NAT pools from the virtual machine scale set. Each virtual machine scale set must have a unique inbound NAT pool.

{
"inboundNatPools": [
    {
        "name": "LoadBalancerBEAddressNatPool0",
        "properties": {
            "backendPort": "3389",
            "frontendIPConfiguration": {
                "id": "[variables('lbIPConfig0')]"
            },
            "frontendPortRangeEnd": "50999",
            "frontendPortRangeStart": "50000",
            "protocol": "tcp"
        }
    },
    {
        "name": "LoadBalancerBEAddressNatPool1",
        "properties": {
            "backendPort": "3389",
            "frontendIPConfiguration": {
                "id": "[variables('lbIPConfig0')]"
            },
            "frontendPortRangeEnd": "51999",
            "frontendPortRangeStart": "51000",
            "protocol": "tcp"
        }
    },
    {
        "name": "LoadBalancerBEAddressNatPool2",
        "properties": {
            "backendPort": "3389",
            "frontendIPConfiguration": {
                "id": "[variables('lbIPConfig0')]"
            },
            "frontendPortRangeEnd": "52999",
            "frontendPortRangeStart": "52000",
            "protocol": "tcp"
        }
    }
    ]
}
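
One way to sanity-check a set of NAT pools like the one above is to confirm that pool names are unique and that frontend port ranges do not collide. This is a sketch with two assumptions that go beyond the text: all pools share one frontend IP configuration, as in the example, and overlapping frontend port ranges are treated as an error. The function name `check_nat_pools` is hypothetical.

```python
def check_nat_pools(pools):
    """Return a list of problems found in a list of inbound NAT pool definitions."""
    problems = []

    names = [pool["name"] for pool in pools]
    if len(names) != len(set(names)):
        problems.append("NAT pool names must be unique")

    # Sort by range start so that any overlap shows up between neighbours.
    ranges = sorted(
        (
            int(pool["properties"]["frontendPortRangeStart"]),
            int(pool["properties"]["frontendPortRangeEnd"]),
            pool["name"],
        )
        for pool in pools
    )
    for (_, end1, name1), (start2, _, name2) in zip(ranges, ranges[1:]):
        if start2 <= end1:
            problems.append(f"{name1} and {name2} have overlapping frontend ports")
    return problems
```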

Standard SKU Load Balancer outbound rules

Standard Load Balancer and Standard Public IP introduce new abilities and different behaviors for outbound connectivity compared to Basic SKUs. If you want outbound connectivity when working with Standard SKUs, you must explicitly define it with either Standard Public IP addresses or a Standard public Load Balancer. For more information, see Outbound connections and Azure Standard Load Balancer.

Note

The standard template references an NSG that allows all outbound traffic by default. Inbound traffic is limited to the ports that are required for Service Fabric management operations. The NSG rules can be modified to meet your requirements.

Note

Any Service Fabric cluster that uses a Standard SKU load balancer needs to ensure that each node type has a rule allowing outbound traffic on port 443. This is necessary to complete cluster setup, and any deployment without such a rule will fail.
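
If you customize the NSG rules, a quick check that an outbound allow rule still covers port 443 can catch this failure before deployment. The sketch below is deliberately simplified: it assumes rules are Python dictionaries in the ARM securityRules shape, it only reads the singular `destinationPortRange` property, and it ignores rule priority and any deny rules that might take precedence. The function names are hypothetical.

```python
def port_matches(port_spec, port):
    """Check a single port against '*', a single port, or a 'low-high' range."""
    spec = str(port_spec)
    if spec == "*":
        return True
    if "-" in spec:
        low, high = spec.split("-", 1)
        return int(low) <= port <= int(high)
    return int(spec) == port


def allows_outbound_443(security_rules):
    """Return True if any rule allows outbound TCP traffic to port 443."""
    for rule in security_rules:
        props = rule.get("properties", {})
        port_spec = props.get("destinationPortRange")
        if port_spec is None:
            continue  # Rules using other port properties are not inspected here.
        if (
            props.get("direction") == "Outbound"
            and props.get("access") == "Allow"
            and str(props.get("protocol", "")).lower() in ("*", "tcp")
            and port_matches(port_spec, 443)
        ):
            return True
    return False
```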

Migrate to using stateless node types from a cluster using a Basic SKU Load Balancer and a Basic SKU IP

For all migration scenarios, a new stateless-only node type needs to be added. An existing node type cannot be migrated to be stateless-only.

To migrate a cluster that was using a Load Balancer and IP with a Basic SKU, you must first create an entirely new Load Balancer and IP resource using the Standard SKU. It is not possible to update these resources in place.

The new load balancer and IP should be referenced in the new stateless node types that you want to use. In the example above, a new virtual machine scale set resource is added for the stateless node types. These virtual machine scale sets reference the newly created load balancer and IP and are marked as stateless node types in the Service Fabric cluster resource.

To begin, add the new resources to your existing Resource Manager template. These resources include:

  • A Public IP resource using the Standard SKU.
  • A Load Balancer resource using the Standard SKU.
  • An NSG referenced by the subnet in which you deploy your virtual machine scale sets.

Once the resources have finished deploying, you can begin to disable the nodes in the node type that you want to remove from the original cluster.

Note

When using autoscaling with stateless node types at Bronze durability, node state is not automatically cleaned up after a scale-down operation. To clean up the NodeState of down nodes during autoscale, using the Service Fabric AutoScale Helper is advised.

Next steps