Scenario: Azure HDInsight clusters with disk encryption lose Key Vault access

This article describes troubleshooting steps and possible resolutions for issues when interacting with Azure HDInsight clusters.

Issue

The Resource Health Center (RHC) alert "The HDInsight cluster is unable to access the key for BYOK encryption at rest" is shown for Bring Your Own Key (BYOK) clusters whose nodes have lost access to the customer's Key Vault (KV). Similar alerts can also appear in the Apache Ambari UI.

Cause

The alert verifies that KV is accessible from the cluster nodes, which depends on the network connection, KV health, and the access policy for the user-assigned managed identity. It is only a warning of impending broker shutdown on subsequent node reboots; the cluster continues to function until the nodes reboot.

Navigate to the Apache Ambari UI and open Disk Encryption Key Vault Status to find more information about the alert. The alert details the reason for the verification failure.

Resolution

KV/AAD outage

See Azure Key Vault availability and redundancy and the Azure status page for more details: https://status.azure.com/

KV accidental deletion

  • Restore the deleted key on KV to recover automatically. For more information, see Recover Deleted Key.
  • Reach out to the KV team to recover from accidental deletions.

KV access policy changed

Restore the access policies that allow the user-assigned managed identity assigned to the HDI cluster to access the KV.

Key permitted operations

For each key in KV, you can choose the set of permitted operations. Ensure that the wrap and unwrap operations are enabled for the BYOK key.
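
As a rough illustration of this check, the sketch below tests whether a key's permitted operations include both wrap and unwrap. It assumes the operations are available as a list of strings using the Key Vault operation names wrapKey and unwrapKey. The function name missing_byok_operations and the sample list are hypothetical, and how the list is fetched (portal, CLI, or SDK) is left out.

```python
# Operations the BYOK key needs, using Key Vault's key operation names.
REQUIRED_OPERATIONS = ("wrapKey", "unwrapKey")


def missing_byok_operations(permitted_operations):
    """Return the required operations absent from a key's permitted list."""
    permitted = set(permitted_operations)
    return [op for op in REQUIRED_OPERATIONS if op not in permitted]


if __name__ == "__main__":
    # Hypothetical permitted-operation list for a key (assumption, not real data).
    sample = ["encrypt", "decrypt", "wrapKey"]
    missing = missing_byok_operations(sample)
    if missing:
        print("Enable these operations on the BYOK key:", ", ".join(missing))
    else:
        print("Wrap and unwrap are both enabled.")
```

An empty result means both operations are enabled; any returned name is an operation to re-enable on the key.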

Expired key

If the expiry date has passed and the key hasn't been rotated, restore the key from the backup HSM or contact the KV team to clear the expiry date.

KV firewall blocking access

Fix the KV firewall settings to allow BYOK cluster nodes to access the KV.

NSG rules on virtual network blocking access

Check the NSG rules associated with the virtual network attached to the cluster.
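
One way to narrow down a network-level block is to probe the vault's HTTPS port from a cluster node. The following is a minimal sketch under that assumption; can_connect and the host name contoso-vault.vault.azure.net are hypothetical placeholders, not values from this article. A failed connection points toward NSG rules or routing. A successful one only shows that the network path is open, not that the KV firewall or access policy will accept the request.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical vault host name; replace with your own vault's DNS name.
    vault_host = "contoso-vault.vault.azure.net"
    print(f"{vault_host}:443 reachable: {can_connect(vault_host)}")
```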

Mitigation and prevention steps

KV accidental deletion

  • Configure Key Vault with a resource lock set.
  • Back up keys to their Hardware Security Module.

Key deletion

Delete the cluster before deleting the key.

KV access policy changed

Regularly audit and test access policies.

Expired key

  • Back up keys to your HSM.
  • Use a key without any expiry set.
  • If an expiry needs to be set, rotate the keys before the expiration date.
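
Rotating before expiry is easier to keep up with if something flags keys that are close to their expiration date. The sketch below shows one possible check, assuming the key's expiry is available as a timezone-aware datetime (or None when no expiry is set). The function name rotation_due, the 30-day warning window, and the sample date are all hypothetical choices.

```python
from datetime import datetime, timedelta, timezone


def rotation_due(expires_on, now=None, warn_days=30):
    """Return True if the key has expired or expires within warn_days.

    expires_on is a timezone-aware datetime, or None when no expiry is set.
    """
    if expires_on is None:
        return False  # No expiry set, so there is no deadline to rotate for.
    now = now or datetime.now(timezone.utc)
    return expires_on - now <= timedelta(days=warn_days)


if __name__ == "__main__":
    # Hypothetical expiry date for illustration only.
    expiry = datetime(2030, 1, 1, tzinfo=timezone.utc)
    print("Rotate now:", rotation_due(expiry))
```

Passing now explicitly keeps the check deterministic, which helps when it runs as part of a scheduled audit.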

Next steps

If you didn't see your problem or are unable to solve your issue, visit the following channel for more support:

  • If you need more help, you can submit a support request from the Azure portal.