Troubleshoot cluster validation reporting

Applies to: Azure Stack HCI, version 20H2; Windows Server 2019

This topic helps you troubleshoot cluster validation reporting for network and storage QoS (quality of service) settings across servers in an Azure Stack HCI cluster, and verify that important rules are defined. For optimal connectivity and performance, the cluster validation process verifies that the Data Center Bridging (DCB) QoS configuration is consistent and, if defined, contains appropriate rules for the Failover Clustering and SMB/SMB Direct traffic classes.

Install Data Center Bridging

Data Center Bridging must be installed before you can use the QoS-specific cmdlets. To check whether the Data Center Bridging feature is already installed on a server, run the following cmdlet in PowerShell:

Get-WindowsFeature -Name Data-Center-Bridging -ComputerName Server1

If Data Center Bridging is not installed, install it by running the following cmdlet for each server in the cluster:

Install-WindowsFeature -Name Data-Center-Bridging -ComputerName Server1
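The check and install steps above can be combined into one loop over the cluster nodes. This is a minimal sketch, assuming the hypothetical server names Server1 and Server2 and that you have administrative rights on each; it uses only the two cmdlets already shown.

```powershell
# Hypothetical node names; replace with the servers in your cluster.
$servers = 'Server1', 'Server2'

foreach ($server in $servers) {
    # Query the feature state on the remote server.
    $feature = Get-WindowsFeature -Name Data-Center-Bridging -ComputerName $server

    if ($feature.Installed) {
        Write-Host "$server : Data Center Bridging is already installed."
    }
    else {
        # Install only where the feature is missing.
        Write-Host "$server : installing Data Center Bridging..."
        Install-WindowsFeature -Name Data-Center-Bridging -ComputerName $server
    }
}
```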

Run a cluster validation test

Either use the Validate feature in Windows Admin Center by selecting Tools > Servers > Inventory > Validate cluster, or run the following PowerShell command:

Test-Cluster -Node Server1, Server2

Among other things, the test validates that the DCB QoS configuration is consistent and that all servers in the cluster have the same number of traffic classes and QoS rules. It also verifies that all servers have QoS rules defined for the Failover Clustering and SMB/SMB Direct traffic classes.

You can view the validation report in Windows Admin Center, or by accessing a log file in the current working directory. For example: C:\Users\<username>\AppData\Local\Temp\

Near the bottom of the report, you will see "Validate QoS Settings Configuration" and a corresponding report for each server in the cluster.

To see which traffic classes are already set on a server, use the Get-NetQosTrafficClass cmdlet.

To learn more, see Validate an Azure Stack HCI cluster.
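Because validation compares the number of traffic classes and QoS rules on each server, it can help to list them side by side before rerunning the test. The following is a sketch only, assuming the hypothetical server names Server1 and Server2 and that each server accepts remote CIM connections; it is not the logic the validation test itself uses.

```powershell
# Hypothetical node names; replace with the servers in your cluster.
$servers = 'Server1', 'Server2'

# List the traffic classes defined on every server.
Get-NetQosTrafficClass -CimSession $servers |
    Sort-Object PSComputerName, Name |
    Format-Table PSComputerName, Name, Priority, BandwidthPercentage, Algorithm

# Count traffic classes and QoS policies per server; the counts should match.
# A server with no entries at all does not appear in the output.
Get-NetQosTrafficClass -CimSession $servers |
    Group-Object PSComputerName |
    Select-Object Name, Count

Get-NetQosPolicy -CimSession $servers |
    Group-Object PSComputerName |
    Select-Object Name, Count
```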

Validate networking QoS rules

Validate that the DCB willing status and priority flow control status settings are consistent across the servers in the cluster.

DCB willing status

Network adapters that support the Data Center Bridging Capability Exchange protocol (DCBX) can accept configurations from a remote device. To enable this capability, the DCB willing bit on the network adapter must be set to true. If the willing bit is set to false, the device rejects all configuration attempts from remote devices and enforces only the local configuration. If you're using RDMA over Converged Ethernet (RoCE) adapters, the willing bit should be set to false on all servers.

All servers in an Azure Stack HCI cluster should have the DCB willing bit set the same way.

Use the Set-NetQosDcbxSetting cmdlet to set the DCB willing bit to either true or false, as in the following example:

Set-NetQosDcbxSetting -Willing $false
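To confirm that the willing bit is set the same way everywhere, you can read it back from each server before changing anything. This is a sketch under the assumption that the servers are named Server1 and Server2 (hypothetical) and accept remote CIM connections.

```powershell
# Hypothetical node names; replace with the servers in your cluster.
$servers = 'Server1', 'Server2'

# Show the current willing bit on every server.
Get-NetQosDcbxSetting -CimSession $servers |
    Format-Table PSComputerName, Willing

# Apply the same value on all servers (false is the value called for with RoCE adapters).
Set-NetQosDcbxSetting -Willing $false -CimSession $servers -Confirm:$false
```

The -Confirm:$false switch suppresses the interactive confirmation prompt; leave it out if you prefer to confirm each change.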

DCB flow control status

Priority-based flow control is essential if the upper layer protocol, such as Fibre Channel, assumes a lossless underlying transport. DCB flow control can be enabled or disabled either globally or for individual network adapters. When enabled, it allows you to create QoS policies that prioritize certain application traffic.

For QoS policies to work seamlessly during failover, all servers in an Azure Stack HCI cluster should have the same flow control status settings. If you're using RoCE adapters, priority flow control must be enabled on all servers.

Use the Get-NetQosFlowControl cmdlet to get the current flow control configuration. All priorities are disabled by default.

Use the Enable-NetQosFlowControl and Disable-NetQosFlowControl cmdlets with the -Priority parameter to turn priority flow control on or off. For example, the following command enables flow control on traffic tagged with priority 3:

Enable-NetQosFlowControl -Priority 3
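Since every server should report the same flow control state, a side-by-side listing makes mismatches easy to spot. The sketch below assumes the hypothetical server names Server1 and Server2 and reuses priority 3 from the example above; choose the priority that matches your own QoS policies.

```powershell
# Hypothetical node names; replace with the servers in your cluster.
$servers = 'Server1', 'Server2'

# Show which priorities have flow control enabled on each server.
Get-NetQosFlowControl -CimSession $servers |
    Sort-Object PSComputerName, Priority |
    Format-Table PSComputerName, Priority, Enabled

# Enable flow control for priority 3 on all servers in one call.
Enable-NetQosFlowControl -Priority 3 -CimSession $servers
```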

Validate storage QoS rules

Validate that all nodes have a QoS rule for failover clustering and for SMB or SMB Direct. Otherwise, connectivity and performance problems may occur.

QoS rule for failover clustering

If any storage QoS rules are defined in a cluster, a QoS rule for failover clustering should also be present, or connectivity problems may occur. To add a new QoS rule for failover clustering, use the New-NetQosPolicy cmdlet as in the following example:

New-NetQosPolicy "Cluster" -IPDstPort 3343 -Priority 6

QoS rule for SMB

If some or all nodes have QoS rules defined but do not have a QoS rule for SMB, this may cause connectivity and performance problems for SMB. To add a new network QoS rule for SMB, use the New-NetQosPolicy cmdlet as in the following example:

New-NetQosPolicy -Name "SMB" -SMB -PriorityValue8021Action 3

QoS rule for SMB Direct

SMB Direct bypasses the networking stack and instead uses RDMA methods to transfer data. If some or all nodes have QoS rules defined but do not have a QoS rule for SMB Direct, this may cause connectivity and performance problems for SMB Direct. To create a new QoS policy for SMB Direct, issue the following command:

New-NetQosPolicy "SMB Direct" -NetDirectPort 445 -Priority 3
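To find servers that are missing one of the rules described in this section, you can compare the policy names on each server against an expected list. This is a rough sketch, assuming the hypothetical server names Server1 and Server2 and that your policies use the names from the examples above ("Cluster", "SMB", "SMB Direct"). It matches by name only, so it is a quick heuristic and not a substitute for the validation test, which may evaluate the rules differently.

```powershell
# Hypothetical node names; replace with the servers in your cluster.
$servers = 'Server1', 'Server2'

# Policy names used in the examples in this topic; adjust if yours differ.
$expected = 'Cluster', 'SMB', 'SMB Direct'

# Collect the QoS policies from every server in one call.
$policies = Get-NetQosPolicy -CimSession $servers

foreach ($server in $servers) {
    # Names of the policies defined on this server (empty if none).
    $names = @($policies | Where-Object PSComputerName -eq $server).Name

    foreach ($name in $expected) {
        if ($names -notcontains $name) {
            Write-Warning "$server has no QoS policy named '$name'."
        }
    }
}
```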

Next steps

For related information, see also: