Troubleshoot Windows stop error - directory service initialization failure

This article provides steps to resolve an issue in which an Active Directory domain controller virtual machine (VM) in Azure is stuck in a restart loop and reports that it needs to restart.

Symptom

When you use Boot diagnostics to view a screenshot of the VM, the screenshot shows that the VM needs to restart because of an error. The stop code is 0xC00002E1 in Windows Server 2008 R2, or 0xC00002E2 in Windows Server 2012 or later.

Screenshot: the Windows Server 2012 startup screen states "Your PC ran into a problem and needs to restart."

Cause

Error code 0xC00002E2 represents STATUS_DS_INIT_FAILURE, and error code 0xC00002E1 represents STATUS_DS_CANT_START. Both errors occur when there's an issue with the directory service.

As the OS boots, the Local Security Authority Subsystem Service (LSASS.exe), which authenticates user logins, forces it to restart automatically. Authentication can't happen when the operating system on the VM is a domain controller that doesn't have read/write access to its local Active Directory database. Because it can't access Active Directory (AD), LSASS.exe can't authenticate, and it forces the OS to restart.

This error can be caused by any of the following conditions:

  • There's no access to the disk holding the local AD database (NTDS.DIT).
  • The disk holding the local AD database (NTDS.DIT) has run out of free space.
  • The local AD database (NTDS.DIT) file is missing.
  • The VM has multiple disks and the Storage Area Network (SAN) policy is configured improperly. The SAN policy isn't set to ONLINEALL, and the non-OS disks appear as offline in Disk Management.
  • The local AD database (NTDS.DIT) file is corrupt.

Solution

Process overview:

  1. Create and access a repair VM.
  2. Free up space on disk.
  3. Check that the drive containing the AD database is attached.
  4. Enable Directory Services Restore Mode.
  5. Recommended: Before you rebuild the VM, enable serial console and memory dump collection.
  6. Rebuild the VM.
  7. Reconfigure the SAN policy.

Note

When this error occurs, the guest OS isn't operational. You'll need to troubleshoot in offline mode to resolve the issue.

Create and access a repair VM

  1. Use steps 1-3 of the VM Repair Commands to prepare a repair VM.
  2. Use Remote Desktop Connection to connect to the repair VM.

Free up space on disk

Now that the disk is attached to a repair VM, verify that the disk holding the Active Directory internal database has enough free space for the database to operate correctly.

  1. Check whether the disk is full by right-clicking the drive and selecting Properties.

  2. If the disk has less than 300 MB of free space, expand it to a maximum of 1 TB using PowerShell.

  3. If the disk has reached 1 TB of used space, perform a disk cleanup.

    1. Use PowerShell to detach the data disk from the broken VM.
    2. Once it's detached from the broken VM, attach the data disk to a functioning VM.
    3. Use the Disk Cleanup tool to free up additional space.
  4. Optional: If more space is needed, open a CMD instance and enter the defrag <LETTER ASSIGNED TO THE OS DISK>: /u /x /g command to defragment the drive:

    • In the command, replace <LETTER ASSIGNED TO THE OS DISK> with the OS disk's letter. For example, if the disk letter is F:, the command is defrag F: /u /x /g.

    • Depending on the level of fragmentation, defragmentation could take hours.

If there's enough space on the disk, continue to the next task.
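
The free-space check in step 1 can also be scripted. The following is a minimal Python sketch, not part of the original procedure: it assumes Python is available on the repair VM, and the function names and the drive letter F: are illustrative. It compares the free space reported for a volume against the 300 MB threshold mentioned above.

```python
import shutil

# 300 MB threshold taken from the step above.
MIN_FREE_BYTES = 300 * 1024 * 1024


def needs_more_space(free_bytes: int, min_free_bytes: int = MIN_FREE_BYTES) -> bool:
    """Return True when the free space is below the threshold."""
    return free_bytes < min_free_bytes


def check_drive(path: str) -> bool:
    """Query the volume that contains `path` and report whether it is low on space."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return needs_more_space(usage.free)


if __name__ == "__main__":
    # "F:\\" is an assumed drive letter for the attached disk; adjust as needed.
    drive = "F:\\"
    try:
        low = check_drive(drive)
    except OSError as exc:
        print(f"Could not query {drive}: {exc}")
    else:
        print(f"{drive} low on space: {low}")
```

If the script reports that the drive is low on space, continue with steps 2 to 4 above.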

Check that the drive containing the Active Directory database is attached

  1. Open an elevated CMD instance and run the following commands:

    1. Load the registry file:

      REG LOAD HKLM\BROKENSYSTEM f:\windows\system32\config\SYSTEM

      The designation f: assumes that the disk is drive F:. Use the drive letter assigned to the attached OS disk.

    2. Determine the drive letter and folder of NTDS.DIT:

      REG QUERY "HKLM\BROKENSYSTEM\ControlSet001\Services\NTDS\parameters" /v "DSA Working Directory"
      REG QUERY "HKLM\BROKENSYSTEM\ControlSet001\Services\NTDS\parameters" /v "DSA Database file"
      REG QUERY "HKLM\BROKENSYSTEM\ControlSet001\Services\NTDS\parameters" /v "Database backup path"
      REG QUERY "HKLM\BROKENSYSTEM\ControlSet001\Services\NTDS\parameters" /v "Database log files path"
      
    3. Unload the registry file:

      REG UNLOAD HKLM\BROKENSYSTEM

  2. Using the Azure portal, verify that the drive where NTDS.DIT is located is attached to the VM.

  3. Using the Disk Management console from the guest OS, verify that the disk containing NTDS.DIT is online.

    1. The Disk Management tool can be found in Administrative Tools > Computer Management > Storage, or can be opened by running the diskmgmt.msc command in a CMD instance.
  4. If the disk isn't attached to the VM, reattach the data disk to fix the issue.

    If the disk was attached normally, continue with the next task.
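
To read the drive letter out of the REG QUERY results, a small parser can help. The following is a hedged Python sketch, assuming the usual REG QUERY layout of value name, type, and data separated by runs of spaces; the function names and the sample output are illustrative and don't come from a real system. The drive letter stored in the registry is the one the broken VM used, which may differ from the letter assigned on the repair VM.

```python
import re
from typing import Optional

# Matches a value line such as:
#     DSA Database file    REG_SZ    E:\Windows\NTDS\ntds.dit
_VALUE_LINE = re.compile(
    r"^\s+(?P<name>.+?)\s{2,}(?P<type>REG_\w+)\s{2,}(?P<data>.*)$"
)


def parse_reg_value(output: str, value_name: str) -> Optional[str]:
    """Return the data of `value_name` from REG QUERY output, or None if absent."""
    for line in output.splitlines():
        match = _VALUE_LINE.match(line)
        if match and match.group("name").lower() == value_name.lower():
            return match.group("data").strip()
    return None


def drive_letter(path: str) -> Optional[str]:
    """Return the upper-case drive letter of a Windows path, or None."""
    match = re.match(r"^([A-Za-z]):", path)
    return match.group(1).upper() if match else None


if __name__ == "__main__":
    # Illustrative sample only; real output depends on the broken VM.
    sample = (
        "\n"
        "HKEY_LOCAL_MACHINE\\BROKENSYSTEM\\ControlSet001\\Services\\NTDS\\parameters\n"
        "    DSA Database file    REG_SZ    E:\\Windows\\NTDS\\ntds.dit\n"
    )
    database = parse_reg_value(sample, "DSA Database file")
    print(database, drive_letter(database or ""))
```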

Enable Directory Services Restore Mode

Set up the VM to boot in Directory Services Restore Mode (DSRM) so that the check for the NTDS.DIT file is bypassed during boot.

  1. Before you continue, verify that you've completed the previous tasks to attach the disk to a repair VM, and that you've determined which disk contains the NTDS.DIT file.

  2. Using an elevated CMD instance, list the boot partition information in that store to find the identifier of the active partition:

    bcdedit /store <Drive Letter>:\boot\bcd /enum

    Replace <Drive Letter> with the letter determined in the previous steps.

    Screenshot: the elevated CMD instance after the "bcdedit /store <Drive Letter>:\boot\bcd /enum" command is entered, displaying the Windows Boot Manager with its identifier.

  3. Enable the safeboot DsRepair flag on the booting partition:

    bcdedit /store <Drive Letter>:\boot\bcd /set {<Identifier>} safeboot dsrepair

    Replace <Drive Letter> and <Identifier> with the values determined in the previous steps.

  4. Query the boot options again to ensure that your change was properly set.

    Screenshot: the elevated CMD instance after the safeboot DsRepair flag is enabled.
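
Picking the identifier out of the /enum listing by eye is error-prone, so the lookup can be sketched in code. The following Python sketch rests on several assumptions: bcdedit prints English section titles underlined with dashes, the relevant entry is the one titled "Windows Boot Loader" (consistent with the <BOOT LOADER IDENTIFIER> placeholder used later for the serial console), and the GUID in the sample is invented. The function names are illustrative.

```python
from typing import List


def boot_loader_identifiers(enum_output: str) -> List[str]:
    """Collect identifiers listed under "Windows Boot Loader" sections."""
    identifiers: List[str] = []
    section = ""
    previous = ""
    for raw_line in enum_output.splitlines():
        line = raw_line.strip()
        if line and set(line) == {"-"}:
            # A row of dashes underlines the section title on the previous line.
            section = previous
        elif line.lower().startswith("identifier") and section == "Windows Boot Loader":
            parts = line.split(None, 1)
            if len(parts) == 2:
                identifiers.append(parts[1])
        if line:
            previous = line
    return identifiers


def safeboot_command(drive_letter: str, identifier: str) -> str:
    """Build the command from step 3; `identifier` already includes its braces."""
    return f"bcdedit /store {drive_letter}:\\boot\\bcd /set {identifier} safeboot dsrepair"


if __name__ == "__main__":
    # Invented sample output for illustration only.
    sample = (
        "Windows Boot Manager\n"
        "--------------------\n"
        "identifier              {bootmgr}\n"
        "\n"
        "Windows Boot Loader\n"
        "-------------------\n"
        "identifier              {11111111-2222-3333-4444-555555555555}\n"
        "device                  partition=F:\n"
    )
    for entry in boot_loader_identifiers(sample):
        print(safeboot_command("F", entry))
```

Review the generated command before you run it in the elevated CMD instance.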

Recommended: Before you rebuild the VM, enable serial console and memory dump collection

To enable memory dump collection and Serial Console, open an elevated command prompt session (Run as administrator) and run the following commands.

  1. Enable the Serial Console:

    bcdedit /store <VOLUME LETTER WHERE THE BCD FOLDER IS>:\boot\bcd /ems {<BOOT LOADER IDENTIFIER>} ON
    bcdedit /store <VOLUME LETTER WHERE THE BCD FOLDER IS>:\boot\bcd /emssettings EMSPORT:1 EMSBAUDRATE:115200
    
  2. Verify that the free space on the OS disk is at least equal to the memory size (RAM) of the VM.

    1. If there's not enough space on the OS disk, change the location where the memory dump file will be created, and point it to any data disk attached to the VM that has enough free space.

      To change the location, replace %SystemRoot% with the drive letter (such as F:) of the data disk in the following commands.

    The following configuration is suggested to enable the OS dump:

    Load the broken OS disk:

    REG LOAD HKLM\BROKENSYSTEM <VOLUME LETTER OF BROKEN OS DISK>:\windows\system32\config\SYSTEM

    Enable on ControlSet001:

    REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v CrashDumpEnabled /t REG_DWORD /d 1 /f
    REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v DumpFile /t REG_EXPAND_SZ /d "%SystemRoot%\MEMORY.DMP" /f
    REG ADD "HKLM\BROKENSYSTEM\ControlSet001\Control\CrashControl" /v NMICrashDump /t REG_DWORD /d 1 /f
    

    Enable on ControlSet002:

    REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v CrashDumpEnabled /t REG_DWORD /d 1 /f
    REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v DumpFile /t REG_EXPAND_SZ /d "%SystemRoot%\MEMORY.DMP" /f
    REG ADD "HKLM\BROKENSYSTEM\ControlSet002\Control\CrashControl" /v NMICrashDump /t REG_DWORD /d 1 /f
    

    Unload the broken OS disk:

    REG UNLOAD HKLM\BROKENSYSTEM
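
The choice of dump location and the %SystemRoot% substitution can be sketched as follows. This is a hedged Python example under stated assumptions: the function names, the drive letters, and the sizes are hypothetical, and the script only builds the REG ADD command strings shown above without running them.

```python
from typing import Dict, List, Optional, Sequence


def pick_dump_root(ram_bytes: int, os_free_bytes: int,
                   data_drives: Dict[str, int]) -> Optional[str]:
    """Return %SystemRoot% if the OS disk can hold a full dump, otherwise the
    first data drive (for example "F:") with enough free space, or None."""
    if os_free_bytes >= ram_bytes:
        return "%SystemRoot%"
    for drive, free_bytes in data_drives.items():
        if free_bytes >= ram_bytes:
            return drive
    return None


def crash_dump_commands(dump_root: str = "%SystemRoot%",
                        control_sets: Sequence[str] = ("ControlSet001", "ControlSet002"),
                        hive: str = "HKLM\\BROKENSYSTEM") -> List[str]:
    """Build the REG ADD commands from this section for each control set."""
    commands: List[str] = []
    for control_set in control_sets:
        key = f"{hive}\\{control_set}\\Control\\CrashControl"
        commands.append(f'REG ADD "{key}" /v CrashDumpEnabled /t REG_DWORD /d 1 /f')
        commands.append(
            f'REG ADD "{key}" /v DumpFile /t REG_EXPAND_SZ /d "{dump_root}\\MEMORY.DMP" /f'
        )
        commands.append(f'REG ADD "{key}" /v NMICrashDump /t REG_DWORD /d 1 /f')
    return commands


if __name__ == "__main__":
    gib = 1024 ** 3
    # Hypothetical sizes: 16 GiB of RAM, 4 GiB free on the OS disk, 64 GiB free on F:.
    root = pick_dump_root(16 * gib, 4 * gib, {"F:": 64 * gib})
    if root is None:
        print("No attached disk has enough free space for a full memory dump.")
    else:
        for command in crash_dump_commands(root):
            print(command)
```

Run the printed commands between the REG LOAD and REG UNLOAD steps above.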

Rebuild the VM

  1. Use step 5 of the VM Repair Commands to reassemble the VM.

Reconfigure the Storage Area Network policy

  1. When the VM boots in DSRM, the only user that can log in is the recovery administrator, which was set up when the VM was promoted to a domain controller. All other users get an authentication error.

    1. If no other DC is available, you must log in locally using .\administrator or machinename\administrator and the DSRM password.
  2. Set up the SAN policy so that all the disks are online.

    1. Open an elevated CMD instance and enter DISKPART.

    2. Query for the list of disks.

      DISKPART> list disk

    3. Enter the following commands to select the disk that needs to be brought online and change the SAN policy:

      DISKPART> select disk 1
      Disk 1 is now the selected disk.
      
      DISKPART> attributes disk clear readonly
      Disk attributes cleared successfully.
      
      DISKPART> attributes disk
      Current Read-only State : No
      Read-only : No
      Boot Disk : No
      Pagefile Disk : No
      Hibernation File Disk : No
      Crashdump Disk : No
      Clustered Disk : No
      
      DISKPART> online disk
      DiskPart successfully onlined the selected disk.
      
      DISKPART> san
      SAN Policy : Online All
      
  3. Once the issue is fixed, make sure that you remove the safeboot DsRepair flag:

    bcdedit /deletevalue {default} safeboot

  4. Restart your VM.

    Note

    If your VM was just migrated from on-premises and you want to migrate more domain controllers from on-premises to Azure, consider following the steps in the article below to prevent this issue in future migrations:

    How to upload existing on-premises Hyper-V domain controllers to Azure by using Azure PowerShell