Azure Stack Hub compute capacity

The virtual machine (VM) sizes supported on Azure Stack Hub are a subset of those supported on Azure. Azure imposes resource limits along many vectors to avoid overconsumption of resources, both server-local and service-level. Without some limits on tenant consumption, tenant experiences suffer when other tenants overconsume resources. For networking egress from a VM, Azure Stack Hub enforces bandwidth caps that match Azure's limits. For storage resources on Azure Stack Hub, storage IOPS limits prevent basic overconsumption of resources by tenants for storage access.

Important

The Azure Stack Hub Capacity Planner doesn't consider or guarantee IOPS performance.

VM placement

The Azure Stack Hub placement engine places tenant VMs across the available hosts.

Azure Stack Hub uses two considerations when placing VMs: first, is there enough memory on the host for that VM type? And second, are the VMs part of an availability set or a virtual machine scale set?

To achieve high availability of a multi-VM production system in Azure Stack Hub, virtual machines (VMs) are placed in an availability set that spreads them across multiple fault domains. A fault domain in an availability set is defined as a single node in the scale unit. To be consistent with Azure, Azure Stack Hub supports availability sets with a maximum of three fault domains. VMs placed in an availability set are physically isolated from each other by spreading them as evenly as possible over multiple fault domains (Azure Stack Hub hosts). If there's a hardware failure, VMs from the failed fault domain are restarted in other fault domains and, if possible, kept in fault domains separate from the other VMs in the same availability set. When the host comes back online, VMs are rebalanced to maintain high availability.

Virtual machine scale sets use availability sets on the back end and make sure each scale set instance is placed in a different fault domain, which means each instance uses a separate Azure Stack Hub infrastructure node. For example, on a four-node Azure Stack Hub system, a scale set of three instances can fail at creation if there isn't enough free capacity to place the three instances on three separate nodes. In addition, Azure Stack Hub nodes can already be filled to varying levels before placement is attempted.
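The following minimal Python sketch is an illustration of this constraint only, not the actual placement engine: each scale set instance must land on a distinct node (fault domain) that still has capacity, so creation fails when too few nodes have room.

```python
# Simplified illustration of scale set placement: every instance needs
# its own node (fault domain) with free capacity. Not the real engine.

def place_scale_set(instances: int, nodes_with_capacity: int) -> list[int]:
    """Assign each scale set instance to its own node, or fail."""
    if instances > nodes_with_capacity:
        raise RuntimeError(
            "Creation fails: not enough nodes with free capacity "
            "to give every instance its own fault domain."
        )
    # One distinct node per instance: instance i -> node i.
    return list(range(instances))

# Four-node system, but only two nodes still have room for this VM size:
try:
    place_scale_set(instances=3, nodes_with_capacity=2)
except RuntimeError as err:
    print(err)
```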

Azure Stack Hub doesn't overcommit memory. However, overcommitting the number of physical cores is allowed.

Since the placement algorithm doesn't consider the existing virtual-to-physical core overprovisioning ratio as a factor, each host can have a different ratio. As with Azure, we don't provide guidance on the physical-to-virtual core ratio because workloads and service level requirements vary.

Consideration for total number of VMs

There's a new consideration for accurately planning Azure Stack Hub capacity. With the 1901 update (and every update going forward), there's a limit on the total number of VMs that can be created. This limit is intended to be temporary, to avoid solution instability. The source of the stability issue at higher numbers of VMs is being addressed, but a specific timeline for remediation hasn't been determined. There's now a per-server limit of 60 VMs, with a total solution limit of 700. For example, an eight-server Azure Stack Hub has a VM limit of 480 (8 * 60); a 12- to 16-server solution is limited to 700 VMs. This limit was set with all the compute capacity considerations in mind, such as the resiliency reserve and the CPU virtual-to-physical ratio that an operator would like to maintain on the stamp. For more information, see the new release of the capacity planner.
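As a quick check, the limit works out to the smaller of 60 VMs per server and the 700-VM solution cap. A minimal sketch of that arithmetic:

```python
def max_tenant_vms(server_count: int) -> int:
    """Total VM limit: 60 per server, capped at 700 per solution."""
    return min(60 * server_count, 700)

print(max_tenant_vms(8))   # 480
print(max_tenant_vms(12))  # 700
print(max_tenant_vms(16))  # 700
```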

If the VM scale limit is reached, the following error codes are returned: VMsPerScaleUnitLimitExceeded, VMsPerScaleUnitNodeLimitExceeded.

Consideration for batch deployment of VMs

In releases prior to and including 2002, batches of 2-5 VMs with 5-minute gaps between batches provided reliable VM deployments for reaching a scale of 700 VMs. With version 2005 of Azure Stack Hub, VMs can be reliably provisioned in batch sizes of 50, with 5-minute gaps between batch deployments.
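A minimal sketch of that pacing is shown below. The `deploy_batch` callable is a placeholder for your own deployment step (for example, submitting an ARM template deployment); it isn't a real SDK API.

```python
import time

BATCH_SIZE = 50          # reliable batch size on version 2005 and later
BATCH_GAP_SECONDS = 300  # 5-minute gap between batch deployments

def deploy_in_batches(vm_names: list[str], deploy_batch) -> None:
    """Submit VM deployments in paced batches.

    `deploy_batch` is a hypothetical callable standing in for whatever
    mechanism you use to create VMs (portal, PowerShell, ARM, and so on).
    """
    for start in range(0, len(vm_names), BATCH_SIZE):
        batch = vm_names[start:start + BATCH_SIZE]
        deploy_batch(batch)
        if start + BATCH_SIZE < len(vm_names):
            time.sleep(BATCH_GAP_SECONDS)  # pause before the next batch
```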

Considerations for deallocation

When a VM is in the deallocated state, its memory resources aren't being used. This allows other VMs to be placed in the system.

If the deallocated VM is then started again, its memory usage or allocation is treated like that of a new VM placed into the system, and available memory is consumed.

If there's no available memory, the VM won't start.

Azure Stack Hub memory

Azure Stack Hub is designed to keep running any VMs that have been successfully provisioned. For example, if a host goes offline because of a hardware failure, Azure Stack Hub attempts to restart its VMs on another host. A second example occurs during patching and updating of the Azure Stack Hub software: if a physical host needs to reboot, an attempt is made to move the VMs executing on that host to another available host in the solution.

This VM management or movement can only happen if there's reserved memory capacity to allow the restart or migration to occur. A portion of the total host memory is reserved and unavailable for tenant VM placement.

You can review a pie chart in the administrator portal that shows the free and used memory in Azure Stack Hub. The following diagram shows the physical memory capacity on an Azure Stack Hub scale unit:

(Diagram: physical memory capacity on an Azure Stack Hub scale unit)

Used memory is made up of several components. The following components consume the memory in the used section of the pie chart:

  • Host OS usage or reserve: the memory used by the operating system (OS) on the host, virtual memory page tables, processes running on the host OS, and the Spaces Direct memory cache. Since this value depends on the memory used by the different Hyper-V processes running on the host, it can fluctuate.
  • Infrastructure services: the infrastructure VMs that make up Azure Stack Hub. As of the 1904 release of Azure Stack Hub, this entails approximately 31 VMs that take up 242 GB + (4 GB x # of nodes) of memory. The memory utilization of the infrastructure services component may change as we work on making our infrastructure services more scalable and resilient.
  • Resiliency reserve: Azure Stack Hub reserves a portion of memory to allow for tenant availability during a single host failure, as well as during patch and update to allow for successful live migration of VMs.
  • Tenant VMs: the tenant VMs created by Azure Stack Hub users. In addition to running VMs, memory is consumed by any VMs that have landed on the fabric; that is, VMs in a "Creating" or "Failed" state, or VMs shut down from within the guest, consume memory. However, VMs that have been deallocated using the stop-deallocate option from the portal/PowerShell/CLI don't consume memory from Azure Stack Hub.
  • Value-add resource providers (RPs): VMs deployed for the value-add RPs like SQL, MySQL, App Service, and so on.

The best way to understand memory consumption on the portal is to use the Azure Stack Hub Capacity Planner to see the impact of various workloads. The following calculation is the same one used by the planner.

This calculation results in the total available memory that can be used for tenant VM placement. This memory capacity is for the entirety of the Azure Stack Hub scale unit.

Available memory for VM placement = total host memory - resiliency reserve - memory used by running tenant VMs - Azure Stack Hub infrastructure overhead¹

Resiliency reserve = H + R * ((N-1) * H) + V * (N-2)

Where:

  • H = size of single server memory
  • N = size of scale unit (number of servers)
  • R = the operating system reserve for OS overhead, which is .15 in this formula²
  • V = largest VM in the scale unit

¹ Azure Stack Hub infrastructure overhead = 242 GB + (4 GB x # of nodes). Approximately 31 VMs are used to host Azure Stack Hub's infrastructure and, in total, consume about 242 GB + (4 GB x # of nodes) of memory and 146 virtual cores. The rationale for this number of VMs is to satisfy the needed service separation to meet security, scalability, servicing, and patching requirements. This internal service structure allows for the future introduction of new infrastructure services as they're developed.

² Operating system reserve for overhead = 15% (.15) of node memory. The operating system reserve value is an estimate and will vary based on the physical memory capacity of the server and general operating system overhead.

The value V, the largest VM in the scale unit, is dynamically based on the largest tenant VM memory size. For example, the largest VM value could be 7 GB or 112 GB or any other supported VM memory size in the Azure Stack Hub solution. Changing the largest VM on the Azure Stack Hub fabric increases the resiliency reserve in addition to the memory of the VM itself.
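The formula above translates directly into code. The sketch below reproduces the calculation for a hypothetical stamp; the node memory size, largest VM size, and running tenant VM memory are example inputs, not measurements or planner output.

```python
def resiliency_reserve(h: float, n: int, v: float, r: float = 0.15) -> float:
    """Resiliency reserve = H + R * ((N-1) * H) + V * (N-2)."""
    return h + r * ((n - 1) * h) + v * (n - 2)

def available_memory(h: float, n: int, v: float, tenant_vm_gb: float) -> float:
    """Available memory for VM placement = total host memory
    - resiliency reserve - running tenant VM memory - infrastructure overhead."""
    total_host_memory = h * n
    infra_overhead = 242 + 4 * n  # GB: 242 GB + (4 GB x # of nodes)
    return (total_host_memory - resiliency_reserve(h, n, v)
            - tenant_vm_gb - infra_overhead)

# Example only: four nodes of 768 GB each, largest VM 112 GB,
# 1,024 GB already consumed by running tenant VMs.
print(f"{available_memory(h=768, n=4, v=112, tenant_vm_gb=1024):.0f} GB")  # 452 GB
```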

Frequently asked questions

Q: My tenant deployed a new VM. How long will it take for the capacity chart on the administrator portal to show remaining capacity?

A: The capacity blade refreshes every 15 minutes, so take that into consideration.

Q: The number of deployed VMs on my Azure Stack Hub hasn't changed, but my capacity is fluctuating. Why?

A: The available memory for VM placement has multiple dependencies, one of which is the host OS reserve. This value depends on the memory used by the different Hyper-V processes running on the host, which isn't a constant value.

Q: What state do tenant VMs have to be in to consume memory?

A: In addition to running VMs, memory is consumed by any VMs that have landed on the fabric. This means that VMs in a "Creating" or "Failed" state consume memory. VMs shut down from within the guest, as opposed to stop-deallocated from the portal/PowerShell/CLI, also consume memory.

Q: I have a four-host Azure Stack Hub. My tenant has three VMs that each consume 56 GB of RAM (D5_v2). One of the VMs is resized to 112 GB of RAM (D14_v2), and available memory reporting on the dashboard showed a spike of 168 GB of usage on the capacity blade. Subsequent resizing of the other two D5_v2 VMs to D14_v2 resulted in an increase of only 56 GB of RAM each. Why is this so?

A: The available memory is a function of the resiliency reserve maintained by Azure Stack Hub, and the resiliency reserve is a function of the largest VM size on the stamp. At first, the largest VM on the stamp had 56 GB of memory. When the VM was resized, the largest VM on the stamp became one with 112 GB of memory, which not only increased the memory used by that tenant VM but also increased the resiliency reserve. This change resulted in an increase of 56 GB (the 56 GB to 112 GB tenant VM memory increase) plus a 112 GB resiliency reserve memory increase. When the subsequent VMs were resized, the largest VM size remained the 112 GB VM, so there was no resulting resiliency reserve increase; the increase in memory consumption was only the tenant VM memory increase (56 GB).
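Using the resiliency reserve formula from earlier, the 168 GB spike works out as follows: with N = 4, only the V * (N-2) term changes when the largest VM grows.

```python
N = 4                               # hosts in the stamp
delta_vm = 112 - 56                 # tenant VM memory increase (GB)
delta_reserve = delta_vm * (N - 2)  # change in the V * (N-2) reserve term
print(delta_vm + delta_reserve)     # 56 + 112 = 168 GB
```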

Note

The capacity planning requirements for networking are minimal, as only the size of the public VIP is configurable. For information about how to add more public IP addresses to Azure Stack Hub, see Add public IP addresses.

Next steps

Learn about Azure Stack Hub storage