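Troubleshoot Azure Stack HCI issues with the Support Diagnostic Tool

Applies to: Azure Stack HCI, version 22H2

This article explains how to download and use the Azure Stack HCI Support Diagnostic Tool. The tool is a set of PowerShell commands that simplify data collection, troubleshooting, and resolution of common issues.

The tool isn't a substitute for expert knowledge. If you run into any issues, contact Azure Support for assistance.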

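Benefits

The Azure Stack HCI Support Diagnostic Tool uses simple commands to identify issues, so you don't need expert-level product knowledge.

The tool provides:

  • Easy installation and updates: Installs and updates natively through the PowerShell Gallery, with no additional requirements.

  • Diagnostic checks: Offers diagnostic checks based on common issues, incidents, and telemetry data.

  • Automatic data collection: Automatically collects the important data to hand over to Azure Support.

  • Regular updates: Receives new checks and useful commands for managing, troubleshooting, and diagnosing issues on Azure Stack HCI.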

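Prerequisites

Before you use the PowerShell module, make sure that you meet the following conditions:

  • Download the Azure Stack HCI Support Diagnostic Tool from the PowerShell Gallery.

  • Import the module into an elevated PowerShell window, using an account with administrator privileges on the local system. For more information, see Import a PowerShell module.

  • Install the module on every node of the Azure Stack HCI system.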

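Install and use the Azure Stack HCI Support Diagnostic Tool

Run PowerShell as an administrator, then use the commands in this section.

To install the tool, run the following command: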

Install-Module -Name Microsoft.AzureStack.HCI.CSSTools
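Because the prerequisites call for the module on every node, the installation can be repeated remotely from one node. The following is a minimal sketch, not part of the tool itself. It assumes that PowerShell remoting is enabled between the nodes, that the FailoverClusters module (which provides Get-ClusterNode) is available, and that every node can reach the PowerShell Gallery.

```powershell
# Assumptions: elevated session on one cluster node, PowerShell remoting
# enabled, and PowerShell Gallery reachable from every node.
$moduleName = 'Microsoft.AzureStack.HCI.CSSTools'

# Get-ClusterNode comes from the FailoverClusters module.
$nodes = (Get-ClusterNode).Name

Invoke-Command -ComputerName $nodes -ArgumentList $moduleName -ScriptBlock {
    param($Name)

    # Install only if the module is not already present on this node.
    if (-not (Get-Module -ListAvailable -Name $Name)) {
        Install-Module -Name $Name -Force
    }

    # Report the version(s) found on this node; Invoke-Command adds
    # a PSComputerName property to each returned object.
    Get-Module -ListAvailable -Name $Name | Select-Object Name, Version
}
```

If remoting isn't available in your environment, running Install-Module locally on each node achieves the same result.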

To list all available diagnostic checks, run the following command:

Invoke-AzsSupportDiagnosticCheck -ProductName <BaseSystem, Registration>

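To see the values that you can pass to ProductName, press CTRL+SPACE after the parameter; PowerShell then lists the available completions, one for each set of diagnostic checks.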
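In a non-interactive session where CTRL+SPACE isn't available, the standard PowerShell discovery cmdlets can serve the same purpose. This is a small sketch that assumes the module is already installed; whether the help text lists the accepted ProductName values depends on how the module documents the parameter.

```powershell
# List every command that the module exports.
Get-Command -Module Microsoft.AzureStack.HCI.CSSTools |
    Sort-Object Name |
    Select-Object Name, CommandType

# Show the help entry for the ProductName parameter.
Get-Help Invoke-AzsSupportDiagnosticCheck -Parameter ProductName
```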

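To collect data by using one of the predefined collection sets, run the following command:

New-AzsSupportDataBundle -Component <Component>

To see all data collection sets, press CTRL+SPACE after the Component parameter.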

To collect your own data set, run the following commands:

$clusterCommands = @(<clusterCommand1>,<clusterCommand2>)
$nodeCommands = @(<nodeCommand1>,<nodeCommand2>)
$nodeEvents = @(<eventLogName1>,<eventLogName2>)
$nodeRegistry = @(<registryPath1>,<registryPath2>)
$nodeFolders = @(<folderPath1>,<folderPath2>)


New-AzsSupportDataBundle -ClusterCommands $clusterCommands `
-NodeCommands $nodeCommands `
-NodeEvents $nodeEvents `
-NodeRegistry $nodeRegistry `
-NodeFolders $nodeFolders `
-ComputerName @(<computerName1>,<computerName2>)

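Example scenarios

To troubleshoot Azure Stack HCI, run the command for the scenario that matches your issue.

Deployment issues

To generate a detailed deployment report that includes the steps that succeeded, the steps that were skipped, and error details, run the following command: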

Get-AzsSupportEceDeploymentDetails

Update or upgrade issues

Get-AzsSupportEceUpdateDetails

Registration issues

Invoke-AzsSupportDiagnosticCheck -ProductName Registration

The following is sample output for a registration issue:

PS C:\temp> Invoke-AzsSupportDiagnosticCheck -ProductName Registration
Starting known issue check for Azure Stack HCI: Registration.
Starting Azure Stack HCI base system validation.
Gathering information from all clustered nodes.
We are preparing to collect diagnostic information from your environment
We started the diagnostic data collection! This might take some time.
Finished collecting diagnostic information.
====[ Validating registration state on node: HCI-N-1 ]====
[Pass] [Azure Stack HCI - General registration state]
Validate that the cluster is registered
Details: Validation successfull

[Fail] [Azure Stack HCI - Azure Connection state]
Validate that the cluster is in a connected state
Details: This Azure Stack HCI node does not seem to be connected to azure. Ensure that this node is in a connected state.
Documentation: https://docs.azure.cn/azure-stack/hci/deploy/troubleshoot-hci-registration

[Pass] [Azure Arc Agent - Connection state]
Validate that the azure arc agent is connected
Details: Validation successfull

[Pass] [Azure Arc Agent - Service state]
Validate that all azure arc services are running
Details: Validation successfull

[Pass] [Azure Arc Agent - Heartbeat state]
Validate that the azure arc agent has sent out a heartbeat at least a day ago
Details: Validation successfull

[Pass] [Azure Stack HCI - Arc Agent onboarded]
Validate that all arc agent checks are passed
Details: Validation successfull

[Fail] [Validation summary]

Details: At least one node reported an invalid registration state.

We will collect log information from your envirorment.
Creating local storage container for diagnostic data.
Gathering cluster data ... this might take a while.
Cluster data collection complete.
We are preparing to collect diagnostic information from your environment
We started the diagnostic data collection! This might take some time.
Waiting for all diagnostic output to be generated and compressed ... this might take a while.
Finished collecting diagnostic information.
Starting copy of items ... this might take a while.
All items copied.
Successfully created archive C:\temp\6c5a4685-6e32-4b68-aeec-05475f8d6c6f\log-collection-RegistrationInformation07-22_06-03-2024.zip. Removing raw data C:\temp\6c5a4685-6e32-4b68-aeec-05475f8d6c6f\container.
Data collection done . Please upload the file to the Microsoft Workspace.

Base Azure Stack HCI system issues

Invoke-AzsSupportDiagnosticCheck -ProductName BaseSystem

The following is sample output for a base system issue:

PS C:\temp> Invoke-AzsSupportDiagnosticCheck -ProductName BaseSystem
Starting known issue check for Azure Stack HCI: BaseSystem.
Gathering information from all clustered nodes.
We are preparing to collect diagnostic information from your environment
We started the diagnostic data collection! This might take some time.
Starting to validate cluster settings.
[Pass] [Failover Clustering - Cluster validation report contains no errors]
Validate that there are no critical errors in the cluster validation report
Details: Validation successfull

[Pass] [Failover Clustering - Cluster Networks have redundancy]
Validate that we have redundancy in clustered networks
Details: Validation successfull

[Pass] [Failover Clustering - Validation Summary]
Validate that there are no critical issues in our cluster validation report.
Details: Validation successfull

Collecting node data.
Finished collecting diagnostic information.
====[ Validating data from node: HCI-N-1 ]====
[Pass] [Windows Features - All windows features installed]
Verify that all features required for Azure Stack HCI are installed.
Details: Validation successfull

[Pass] [Validation summary]
Ensure that no other check has returned a failed state
Details: Validation successfull

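When the checks finish, the tool produces a comprehensive overview of the components that a properly connected Azure Stack HCI system requires. Based on this overview, you can follow the troubleshooting guide or contact Azure Support for assistance.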
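When an overview is long, it can help to pull out only the failed checks. The sketch below parses the text format shown in the sample outputs above: a "[Pass]" or "[Fail]" header line, followed later by a "Details:" line. Get-FailedCheck is a hypothetical helper written for this illustration and isn't part of the tool. It assumes that you saved the console output as text first, for example with Start-Transcript.

```powershell
# Hypothetical helper: extracts failed checks from saved console output.
function Get-FailedCheck {
    param(
        # Lines of saved output, for example from Get-Content.
        [string[]]$Line
    )

    $current = $null
    foreach ($l in $Line) {
        if ($l -match '^\[(Pass|Fail)\]\s+\[(.+)\]\s*$') {
            # A new check header starts: emit the pending failed check, if any.
            if ($current) { $current }
            $current = $null
            if ($Matches[1] -eq 'Fail') {
                $current = [pscustomobject]@{ Check = $Matches[2]; Details = '' }
            }
        }
        elseif ($current -and $l -match '^Details:\s*(.*)$') {
            # Attach the details text to the failed check being tracked.
            $current.Details = $Matches[1].Trim()
        }
    }
    # Emit the last failed check if the output ended while tracking one.
    if ($current) { $current }
}

# Sample lines copied from the registration output shown earlier.
$sample = @(
    '[Pass] [Azure Stack HCI - General registration state]'
    'Validate that the cluster is registered'
    'Details: Validation successfull'
    ''
    '[Fail] [Azure Stack HCI - Azure Connection state]'
    'Validate that the cluster is in a connected state'
    'Details: This Azure Stack HCI node does not seem to be connected to azure. Ensure that this node is in a connected state.'
)

Get-FailedCheck -Line $sample | Format-List
```

To run the helper against a saved file, pass the file content instead of the sample, for example Get-FailedCheck -Line (Get-Content -Path <savedOutputFile>).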

To collect data, see the following two example scenarios:

Automatic data collection

New-AzsSupportDataBundle -Component OS
==== CUT ==================== CUT =======
Data collection done C:\temp\Azs.Support\XXXXXXX\SupportDataBundle-XX-XX_XX-XX-XXXX.zip . Please upload the file to the Microsoft Workspace

Manual data collection

$clusterCommands = @()
$nodeCommands = @('Get-AzureStackHci','Get-AzureStackHCIArcIntegration','Get-ClusteredScheduledTask | fl *','systeminfo.exe')
$nodeEvents = @('system','application','Microsoft-AzureStack-HCI/Admin')
$nodeRegistry = @('HKLM:\Cluster\ArcForServers')
$nodeFolders = @('C:\Windows\Tasks\ArcforServers\','C:\ProgramData\AzureConnectedMachineAgent\Log\')

New-AzsSupportDataBundle -ClusterCommands $clusterCommands `
-NodeCommands $nodeCommands `
-NodeEvents $nodeEvents `
-NodeRegistry $nodeRegistry `
-NodeFolders $nodeFolders `
-ComputerName (Get-ClusterNode)

==== CUT ==================== CUT =======
Data collection done C:\temp\Azs.Support\XXXXXXX\SupportDataBundle-XX-XX_XX-XX-XXXX.zip . Please upload the file to the Microsoft Workspace.
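After a collection finishes, you need the path of the archive to upload it. The following sketch finds the most recent .zip file under a root folder. Get-LatestSupportBundle is a hypothetical helper for this illustration, and the default root C:\temp\Azs.Support is an assumption taken from the sample output above; adjust it if your bundles land elsewhere.

```powershell
# Hypothetical helper: returns the newest .zip file below a root folder.
function Get-LatestSupportBundle {
    param(
        # Assumption: default root taken from the sample output above.
        [string]$Root = 'C:\temp\Azs.Support'
    )

    Get-ChildItem -Path $Root -Filter '*.zip' -File -Recurse -ErrorAction SilentlyContinue |
        Sort-Object LastWriteTime -Descending |
        Select-Object -First 1
}

$bundle = Get-LatestSupportBundle
if ($bundle) {
    # Print the full path and the size in megabytes.
    '{0}  ({1:N1} MB)' -f $bundle.FullName, ($bundle.Length / 1MB)
}
else {
    'No support data bundle found.'
}
```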