Managed identities in Azure HDInsight

A managed identity is an identity registered in Azure Active Directory (Azure AD) whose credentials are managed by Azure. With managed identities, you don't need to register service principals in Azure AD or maintain credentials such as certificates.

Azure HDInsight uses managed identities to access Azure AD Domain Services or files in Azure Data Lake Storage Gen2 when needed.

There are two types of managed identities: user-assigned and system-assigned. Azure HDInsight supports only user-assigned managed identities; system-assigned managed identities are not supported. A user-assigned managed identity is created as a standalone Azure resource, which you can then assign to one or more Azure service instances. In contrast, a system-assigned managed identity is created in Azure AD and enabled automatically on a particular Azure service instance, and its lifetime is tied to the lifetime of that service instance.

HDInsight managed identity implementation

In Azure HDInsight, managed identities are usable only by the HDInsight service for internal components. There is currently no supported method to generate access tokens for external services using the managed identities installed on HDInsight cluster nodes. For some Azure services, such as compute VMs, managed identities are implemented with an endpoint that you can use to acquire access tokens. This endpoint is currently not available on HDInsight nodes.
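For context, the following is a minimal sketch of what the token request looks like on an Azure VM, where the Instance Metadata Service (IMDS) endpoint is available. The helper names `build_imds_token_request` and `imds_token_endpoint_available` are hypothetical, and the default resource URI is only an illustration. On an HDInsight node the probe is expected to return False, consistent with the limitation described above.

```python
import urllib.error
import urllib.parse
import urllib.request

# IMDS token endpoint as exposed on Azure VMs (not on HDInsight nodes).
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_imds_token_request(resource, client_id=None):
    """Build (but do not send) the IMDS token request used on Azure VMs."""
    params = {"api-version": "2018-02-01", "resource": resource}
    if client_id:
        # Selects one user-assigned identity when several are attached.
        params["client_id"] = client_id
    url = IMDS_TOKEN_URL + "?" + urllib.parse.urlencode(params)
    # IMDS rejects requests that lack the "Metadata: true" header.
    return urllib.request.Request(url, headers={"Metadata": "true"})


def imds_token_endpoint_available(resource="https://vault.azure.net", timeout=2.0):
    """Return True only if the endpoint answers the token request with HTTP 200."""
    request = build_imds_token_request(resource)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Unreachable endpoint, timeout, or HTTP error: treat as unavailable.
        return False
```

Separating request construction from the network call keeps the request shape easy to inspect without contacting the endpoint.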

If you need to bootstrap your applications without putting secrets or passwords in analytics jobs (for example, Scala jobs), you can distribute your own certificates to the cluster nodes using script actions and then use a certificate to acquire an access token (for example, to access Azure Key Vault).
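The certificate-based approach could look roughly like the sketch below. It assumes the `azure-identity` package is installed on the node and that the certificate belongs to an Azure AD application you have registered yourself. The tenant ID, client ID, and certificate path are placeholders, and the function names are hypothetical. This is one possible way to do it, not an HDInsight-provided mechanism.

```python
def key_vault_scope(vault_resource="https://vault.azure.net"):
    """Return the OAuth scope string for the Azure Key Vault resource."""
    return vault_resource.rstrip("/") + "/.default"


def acquire_key_vault_token(tenant_id, client_id, certificate_path):
    """Exchange a certificate stored on the node for a Key Vault access token.

    All three arguments are placeholders supplied by the caller; the
    certificate file is assumed to have been copied to the node by a
    script action.
    """
    # Imported lazily so this module loads even where azure-identity is absent.
    from azure.identity import CertificateCredential

    credential = CertificateCredential(
        tenant_id=tenant_id,
        client_id=client_id,
        certificate_path=certificate_path,  # PEM or PKCS12 file on the node
    )
    # get_token returns an AccessToken; .token is the bearer token string.
    return credential.get_token(key_vault_scope()).token
```

A job would call `acquire_key_vault_token(...)` at startup and pass the resulting token to its Key Vault client, so that no secret or password is embedded in the job itself.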

Create a managed identity

Managed identities can be created with any of the following methods:

The remaining steps for configuring the managed identity depend on the scenario in which it will be used.

Managed identity scenarios in Azure HDInsight

Managed identities are used in Azure HDInsight in multiple scenarios. See the related documents for detailed setup and configuration instructions:

HDInsight automatically renews the certificates for the managed identities you use for these scenarios. However, there is a limitation: when multiple different managed identities are used on a long-running cluster, certificate renewal may not work as expected for all of them. Because of this limitation, if you plan to use long-running clusters (for example, clusters that run for more than 60 days), we recommend using the same managed identity for all of the above scenarios.

If you have already created a long-running cluster with multiple different managed identities and are running into one of these issues:

  • In ESP clusters, cluster services start failing, or scale-up and other operations start failing with authentication errors.
  • In ESP clusters, when the AAD-DS LDAPS certificate is changed, the LDAPS certificate is not updated automatically, so LDAP sync and scale-up operations start failing.
  • MSI access to ADLS Gen2 starts failing.
  • Encryption keys cannot be rotated in the CMK scenario.

then you should assign the required roles and permissions for the above scenarios to all of the managed identities used in the cluster. For example, if you used different managed identities for ADLS Gen2 and ESP clusters, both of them should have the "Storage Blob Data Owner" and "HDInsight Domain Services Contributor" roles assigned to avoid running into these issues.
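As an illustration of assigning every required role to every identity, the sketch below generates one Azure CLI `az role assignment create` command per identity and role. The function name, the principal IDs, and the resource scopes are hypothetical placeholders. The sketch assumes each role is scoped to the resource it governs (the storage account for "Storage Blob Data Owner", the Azure AD Domain Services resource for "HDInsight Domain Services Contributor"). It only prints commands so you can review them before running anything.

```python
import shlex


def role_assignment_commands(principal_ids, role_scopes):
    """Return one `az role assignment create` command per identity and role.

    principal_ids: object (principal) IDs of the managed identities.
    role_scopes:   mapping of role name -> resource ID to scope the role to.
    """
    commands = []
    for principal_id in principal_ids:
        for role, scope in role_scopes.items():
            args = [
                "az", "role", "assignment", "create",
                "--assignee-object-id", principal_id,
                "--assignee-principal-type", "ServicePrincipal",
                "--role", role,
                "--scope", scope,
            ]
            # Quote each argument so role names with spaces stay intact.
            commands.append(" ".join(shlex.quote(arg) for arg in args))
    return commands


if __name__ == "__main__":
    # Placeholder values for illustration only.
    scopes = {
        "Storage Blob Data Owner": "<storage-account-resource-id>",
        "HDInsight Domain Services Contributor": "<aad-ds-resource-id>",
    }
    for command in role_assignment_commands(["<identity-1>", "<identity-2>"], scopes):
        print(command)
```

Generating the full cross product means each identity ends up with the same set of roles, which is the condition the paragraph above describes.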


What happens if I delete the managed identity after the cluster creation?

Your cluster will run into issues when the managed identity is needed. There is currently no way to update or change a managed identity after the cluster is created, so we recommend making sure that the managed identity isn't deleted while the cluster is running. Alternatively, you can re-create the cluster and assign a new managed identity.

Next steps