Replace a scale unit node on an Azure Stack Hub integrated system

This article describes the general process for replacing a physical computer (also referred to as a scale unit node) on an Azure Stack Hub integrated system. The actual replacement steps vary by original equipment manufacturer (OEM) hardware vendor. See your vendor's field replaceable unit (FRU) documentation for detailed steps that are specific to your system.

Caution

Firmware leveling is critical to the success of the operation described in this article. Skipping this step can lead to system instability, decreased performance, or security threats, or it can prevent Azure Stack Hub automation from deploying the operating system. Always consult your hardware partner's documentation when replacing hardware to ensure that the applied firmware matches the OEM Version displayed in the Azure Stack Hub administrator portal. For more information and links to partner documentation, see Replace a hardware component.

The following flow diagram shows the general FRU process for replacing an entire scale unit node.

Flow diagram of the node replacement process

*This action may not be required, depending on the physical condition of the hardware.

Note

If the shutdown operation fails, use the drain operation followed by the stop operation. For more information, see Scale unit node actions in Azure Stack Hub.

Review alert information

If a scale unit node is down, you receive the following critical alerts:

  • Node not connected to network controller
  • Node inaccessible for virtual machine placement
  • Scale unit node is offline

List of alerts for a scale unit node that is down

If you open the Scale unit node is offline alert, the alert description identifies the scale unit node that's inaccessible. You may also receive additional alerts in the OEM-specific monitoring solution that runs on the hardware lifecycle host.

Details of the node offline alert
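
As an illustration only, the check described above can be sketched in Python. The three alert titles are quoted from this article; the input (a plain list of active alert titles) and the function name are hypothetical and don't correspond to any Azure Stack Hub API.

```python
# Alert titles quoted from this article. The input format is an assumption
# made for illustration; it is not an Azure Stack Hub API.
NODE_DOWN_ALERTS = {
    "Node not connected to network controller",
    "Node inaccessible for virtual machine placement",
    "Scale unit node is offline",
}


def node_down_alerts(active_alert_titles):
    """Return the node-down alert titles found in the given titles, sorted."""
    return sorted(NODE_DOWN_ALERTS.intersection(active_alert_titles))


if __name__ == "__main__":
    # Hypothetical list of active alert titles copied from the portal.
    active = [
        "Scale unit node is offline",
        "Node not connected to network controller",
        "Some unrelated alert",
    ]
    for title in node_down_alerts(active):
        print(title)
```

Such a check only narrows down which alerts to open; the alert description in the portal is still what names the affected node.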

Scale unit node replacement process

The following steps are a high-level overview of the scale unit node replacement process. See your OEM hardware vendor's FRU documentation for detailed steps that are specific to your system. Don't follow these steps without referring to your OEM-provided documentation.

  1. Use the Shutdown action to gracefully shut down the scale unit node. This action may not be required, depending on the physical condition of the hardware.

  2. In the unlikely case that the Shutdown action fails, use the Drain action to put the scale unit node into maintenance mode. This action may not be required, depending on the physical condition of the hardware.

    Note

    In any case, only one node can be disabled and powered off at a time without breaking Storage Spaces Direct (S2D).

  3. After the scale unit node is in maintenance mode, use the Stop action. This action may not be required, depending on the physical condition of the hardware.

    Note

    In the unlikely case that the Power off action doesn't work, use the baseboard management controller (BMC) web interface instead.

  4. Replace the physical computer. Typically, your OEM hardware vendor performs this replacement.

  5. Use the Repair action to add the new physical computer to the scale unit.

  6. Use the privileged endpoint to check the status of virtual disk repair. With new data drives, a full storage repair job can take multiple hours, depending on system load and consumed space.

  7. After the Repair action has finished, validate that all active alerts have been automatically closed.
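
The ordering of the steps above can be summarized in a short Python sketch. This is a planning aid under stated assumptions, not an automation script: the function and its two parameters are hypothetical, it calls no Azure Stack Hub API, and `node_is_running` stands in for "the physical condition of the hardware" that makes steps 1 to 3 optional.

```python
def plan_node_replacement(node_is_running=True, shutdown_succeeded=True):
    """Return the ordered list of actions for replacing a scale unit node.

    Both parameters are illustrative assumptions:
    - node_is_running: False when the hardware is already down, in which
      case the Shutdown, Drain, and Stop actions may not be required.
    - shutdown_succeeded: False when the Shutdown action fails, in which
      case Drain (maintenance mode) followed by Stop is used instead.
    """
    actions = []
    if node_is_running:
        actions.append("Shutdown")
        if not shutdown_succeeded:
            # Fallback path from steps 2 and 3.
            actions.extend(["Drain", "Stop"])
    actions.extend([
        "Replace physical computer",
        "Repair",
        "Check virtual disk repair status",
        "Validate that active alerts are closed",
    ])
    return actions


if __name__ == "__main__":
    # Example: the Shutdown action failed, so the fallback path is planned.
    for number, action in enumerate(
        plan_node_replacement(shutdown_succeeded=False), start=1
    ):
        print(f"{number}. {action}")
```

The sketch deliberately plans for a single node only, which matches the note that only one node can be disabled and powered off at a time.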

Next steps

  • For information about replacing a physical disk while the system is powered on, see Replace a disk.
  • For information about replacing a hardware component that requires the system to be powered off, see Replace a hardware component.