Configure HDInsight clusters for Azure Active Directory integration with Enterprise Security Package

This article summarizes the process of creating and configuring an HDInsight cluster integrated with Azure Active Directory. This integration relies on an HDInsight feature called Enterprise Security Package (ESP), Azure Active Directory Domain Services (Azure AD DS), and your existing on-premises Active Directory.

For a detailed, step-by-step tutorial on setting up and configuring a domain in Azure, creating an ESP-enabled cluster, and then syncing on-premises users, see Create and configure Enterprise Security Package clusters in Azure HDInsight.


Enterprise Security Package (ESP) provides Active Directory integration for Azure HDInsight. This integration allows domain users to use their domain credentials to authenticate with HDInsight clusters and run big data jobs.


ESP is generally available in HDInsight 3.6 and 4.0 for these cluster types: Apache Spark, Interactive, Hadoop, and HBase. ESP for the Apache Kafka cluster type is in preview with best-effort support only. ESP clusters created before the ESP GA date (October 1, 2018) are not supported.


There are a few prerequisites to complete before you can create an ESP-enabled HDInsight cluster:

  • An existing on-premises Active Directory and Azure Active Directory.
  • Enable Azure AD DS.
  • Check Azure AD DS health status to ensure synchronization completed.
  • Create and authorize a managed identity.
  • Complete networking setup for DNS and related issues.

Each of these items is discussed in detail below. For a walkthrough of completing all of these steps, see Create and configure Enterprise Security Package clusters in Azure HDInsight.

Enable Azure AD DS

Enabling Azure AD DS is a prerequisite before you can create an HDInsight cluster with ESP. For more information, see Enable Azure Active Directory Domain Services by using the Azure portal.

When Azure AD DS is enabled, all users and objects start synchronizing from Azure Active Directory (Azure AD) to Azure AD DS by default. The length of the sync operation depends on the number of objects in Azure AD. The sync might take a few days for hundreds of thousands of objects.

The domain name that you use with Azure AD DS must be 39 characters or fewer to work with HDInsight.
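The length limit is easy to check before you provision anything. The following is a minimal Python sketch; the function name and constant are illustrative and are not part of any Azure SDK.

```python
# Hypothetical helper: checks only the 39-character limit stated in this article.
MAX_DOMAIN_NAME_LENGTH = 39


def domain_name_fits_hdinsight(domain_name: str) -> bool:
    """Return True if the name is non-empty and 39 characters or fewer."""
    name = domain_name.strip()
    return 0 < len(name) <= MAX_DOMAIN_NAME_LENGTH


if __name__ == "__main__":
    print(domain_name_fits_hdinsight("contoso100.onmicrosoft.com"))
```

The check covers length only; it does not validate that the name is a well-formed DNS name.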

You can choose to sync only the groups that need access to the HDInsight clusters. This option of syncing only certain groups is called scoped synchronization. For instructions, see Configure scoped synchronization from Azure AD to your managed domain.

When you're enabling secure LDAP, put the domain name in both the subject name and the subject alternative name of the certificate. If your domain name is contoso100.onmicrosoft.com, ensure the exact name exists in your certificate subject name and subject alternative name. For more information, see Configure secure LDAP for an Azure AD DS managed domain.

The following example creates a self-signed certificate. The domain name contoso100.onmicrosoft.com is in both Subject (subject name) and DnsName (subject alternative name).

# Start the validity period from the current date; the certificate expires 365 days later.
$lifetime = Get-Date
New-SelfSignedCertificate -Subject contoso100.onmicrosoft.com `
  -NotAfter $lifetime.AddDays(365) -KeyUsage DigitalSignature, KeyEncipherment `
  -Type SSLServerAuthentication -DnsName *.contoso100.onmicrosoft.com, contoso100.onmicrosoft.com
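As a sanity check on the naming rule, the Python sketch below compares a managed domain name against a subject name and a list of subject alternative names that you have already read from the certificate. It does not parse certificates itself. The function and parameter names are hypothetical, and the case-insensitive comparison is an assumption based on DNS names being case-insensitive.

```python
from typing import Iterable


def certificate_names_match(domain: str, subject: str, san_names: Iterable[str]) -> bool:
    """Return True if the exact domain appears in both the subject and the SAN list."""
    wanted = domain.strip().lower()

    # Accept either a bare name or the common "CN=name" form for the subject.
    subject_name = subject.strip()
    if subject_name.lower().startswith("cn="):
        subject_name = subject_name[3:]
    subject_ok = subject_name.strip().lower() == wanted

    # A wildcard entry such as *.contoso100.onmicrosoft.com is not an exact match.
    san_ok = any(name.strip().lower() == wanted for name in san_names)
    return subject_ok and san_ok


if __name__ == "__main__":
    print(certificate_names_match(
        "contoso100.onmicrosoft.com",
        "CN=contoso100.onmicrosoft.com",
        ["*.contoso100.onmicrosoft.com", "contoso100.onmicrosoft.com"],
    ))
```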


Only tenant administrators have the privileges to enable Azure AD DS. If the cluster storage is Azure Data Lake Storage Gen1 or Gen2, you must disable Azure AD Multi-Factor Authentication only for users who will need to access the cluster by using basic Kerberos authentication.

You can use trusted IPs or Conditional Access to disable Multi-Factor Authentication for specific users only when they're accessing the IP range for the HDInsight cluster's virtual network. If you're using Conditional Access, make sure that the Active Directory service endpoint is enabled on the HDInsight virtual network.

If the cluster storage is Azure Blob storage, do not disable Multi-Factor Authentication.

Check Azure AD DS health status

View the health status of Azure Active Directory Domain Services by selecting Health in the Manage category. Make sure the status of Azure AD DS is green (running) and the synchronization is complete.

Figure: Azure AD DS health status

Create and authorize a managed identity

Use a user-assigned managed identity to simplify secure domain services operations. When you assign the HDInsight Domain Services Contributor role to the managed identity, it can perform read, create, modify, and delete domain services operations.

Certain domain services operations, such as creating OUs and service principals, are needed for HDInsight Enterprise Security Package. You can create managed identities in any subscription. For more information on managed identities in general, see Managed identities for Azure resources. For more information on how managed identities work in Azure HDInsight, see Managed identities in Azure HDInsight.

To set up ESP clusters, create a user-assigned managed identity if you don't have one already. See Create, list, delete, or assign a role to a user-assigned managed identity by using the Azure portal.

Next, assign the HDInsight Domain Services Contributor role to the managed identity in Access control for Azure AD DS. You need Azure AD DS admin privileges to make this role assignment.

Figure: Azure Active Directory Domain Services access control

Assigning the HDInsight Domain Services Contributor role ensures that this identity has proper (on behalf of) access to do domain services operations on the Azure AD DS domain. These operations include creating and deleting OUs.

After the managed identity is given the role, the Azure AD DS admin manages who uses it. First, the admin selects the managed identity in the portal and then selects Access Control (IAM) under Overview. The admin then assigns the Managed Identity Operator role to users or groups that want to create ESP clusters.

For example, the Azure AD DS admin can assign this role to the MarketingTeam group for the sjmsi managed identity. An example is shown in the following image. This assignment ensures the right people in the organization can use the managed identity to create ESP clusters.

Figure: HDInsight Managed Identity Operator role assignment

Network configuration

Azure AD DS must be deployed in an Azure Resource Manager-based virtual network. Classic virtual networks are not supported for Azure AD DS. For more information, see Enable Azure Active Directory Domain Services by using the Azure portal.

After you enable Azure AD DS, a local Domain Name System (DNS) server runs on the Active Directory virtual machines (VMs). Configure your Azure AD DS virtual network to use these custom DNS servers. To locate the right IP addresses, select Properties in the Manage category and look under IP ADDRESS ON VIRTUAL NETWORK.

Figure: Locate IP addresses for local DNS servers

Change the configuration of the DNS servers in the Azure AD DS virtual network. To use these custom IPs, select DNS servers in the Settings category. Then select the Custom option, enter the first IP address in the text box, and select Save. Add more IP addresses by using the same steps.

Figure: Updating the virtual network DNS configuration

It's easier to place both the Azure AD DS instance and the HDInsight cluster in the same Azure virtual network. If you plan to use different virtual networks, you must peer those virtual networks so that the domain controller is visible to HDInsight VMs. For more information, see Virtual network peering.

After the virtual networks are peered, configure the HDInsight virtual network to use a custom DNS server, and enter the Azure AD DS private IPs as the DNS server addresses. When both virtual networks use the same DNS servers, your custom domain name resolves to the right IP and is reachable from HDInsight. For example, if your domain name is contoso.com, then after this step, ping contoso.com should resolve to the right Azure AD DS IP.
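The resolution check can also be scripted from a VM in the HDInsight virtual network. The following Python sketch uses the system resolver through the standard library and compares the result with the Azure AD DS private IPs you expect. The helper names and the sample addresses are hypothetical; substitute the IPs shown for your managed domain.

```python
import socket
from typing import Callable, Iterable, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Resolve a hostname to IPv4 addresses with the system resolver."""
    _, _, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def resolves_to_expected(
    hostname: str,
    expected_ips: Iterable[str],
    resolver: Callable[[str], List[str]] = resolve_ipv4,
) -> bool:
    """Return True if the name resolves and every address is an expected one."""
    try:
        resolved = set(resolver(hostname))
    except OSError:  # covers socket.gaierror and socket.herror
        return False
    return bool(resolved) and resolved.issubset(set(expected_ips))


if __name__ == "__main__":
    # Hypothetical Azure AD DS private IPs; replace with your own values.
    print(resolves_to_expected("contoso.com", ["10.0.0.4", "10.0.0.5"]))
```

The resolver is passed as a parameter so the comparison logic can be exercised without network access.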

Figure: Configuring custom DNS servers for a peered virtual network

If you're using network security group (NSG) rules in your HDInsight subnet, you should allow the required IPs for both inbound and outbound traffic.

To test your network setup, join a Windows VM to the HDInsight virtual network/subnet and ping the domain name. (It should resolve to an IP.) Run ldp.exe to access the Azure AD DS domain. Then join this Windows VM to the domain to confirm that all the required RPC calls succeed between the client and server.

Use nslookup to confirm network access to your storage account and to any external database that you might use (for example, an external Hive metastore or Ranger DB). If an NSG secures Azure AD DS, ensure the required ports are allowed in the Azure AD DS subnet's NSG rules. If the domain joining of this Windows VM is successful, then you can continue to the next step and create ESP clusters.
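Name resolution alone does not show that a port is open through the NSG rules. As a complement to the checks above, the following Python sketch attempts a plain TCP connection to a host and port. The function name is hypothetical, and the host and port in the usage example (the domain on the LDAPS port 636) are placeholders for whichever endpoints your setup requires.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return False


if __name__ == "__main__":
    # Placeholder endpoint: the managed domain on the LDAPS port.
    print(can_connect("contoso.com", 636))
```

A successful connection shows only that the port is reachable; it does not verify the TLS certificate or the LDAP service itself.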

Create an HDInsight cluster with ESP

After you've set up the previous steps correctly, the next step is to create the HDInsight cluster with ESP enabled. When you create an HDInsight cluster, you can enable Enterprise Security Package on the Security + networking tab. To get an Azure Resource Manager template for deployment, go through the portal experience once, and then download the prefilled template from the Review + create page for future reuse.

You can also enable the HDInsight ID Broker feature during cluster creation. The ID Broker feature lets you sign in to Ambari by using Multi-Factor Authentication and get the required Kerberos tickets without needing password hashes in Azure AD DS.


The first six characters of the ESP cluster names must be unique in your environment. For example, if you have multiple ESP clusters in different virtual networks, choose a naming convention that ensures the first six characters of the cluster names are unique.
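A naming convention can be checked against a list of planned cluster names. The Python sketch below groups names by their first six characters and reports any prefix shared by more than one name. The function name and sample cluster names are hypothetical, and comparing prefixes case-insensitively is an assumption made to be conservative.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

PREFIX_LENGTH = 6  # number of leading characters that must be unique


def find_prefix_collisions(cluster_names: Iterable[str]) -> Dict[str, List[str]]:
    """Map each shared six-character prefix to the cluster names that use it."""
    by_prefix = defaultdict(list)
    for name in cluster_names:
        # Assumption: treat prefixes that differ only by case as colliding.
        by_prefix[name[:PREFIX_LENGTH].lower()].append(name)
    return {prefix: names for prefix, names in by_prefix.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical names: the first two share the prefix "mktspa".
    print(find_prefix_collisions(["mktspark01", "mktspark02", "finhbase01"]))
```

An empty result means every name in the list has a distinct six-character prefix.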

Figure: Azure HDInsight Enterprise Security Package domain validation

After you enable ESP, common misconfigurations related to Azure AD DS are automatically detected and validated. After you fix these errors, you can continue with the next step.

Figure: Azure HDInsight Enterprise Security Package failed domain validation

When you create an HDInsight cluster with ESP, you must supply the following parameters:

  • Cluster admin user: Choose an admin for your cluster from your synced Azure AD DS instance. This domain account must be already synced and available in Azure AD DS.

  • Cluster access groups: The security groups whose users you want to sync and give access to the cluster must be available in Azure AD DS. An example is the HiveUsers group. For more information, see Create a group and add members in Azure Active Directory.

  • LDAPS URL: An example is ldaps://contoso.com:636.
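Of the parameters above, the LDAPS URL is the easiest to mistype. The following Python sketch parses a URL with the standard library and checks that it uses the ldaps scheme and names a host. The function name is hypothetical, and falling back to port 636 (the standard LDAPS port) when none is given is an assumption of this sketch, not documented HDInsight behavior.

```python
from typing import Tuple
from urllib.parse import urlsplit

DEFAULT_LDAPS_PORT = 636  # standard LDAPS port, used when the URL omits one


def parse_ldaps_url(url: str) -> Tuple[str, int]:
    """Return (host, port) for an ldaps:// URL, or raise ValueError."""
    parts = urlsplit(url.strip())
    if parts.scheme != "ldaps":
        raise ValueError(f"expected an ldaps:// URL, got scheme {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("the LDAPS URL does not contain a host name")
    port = parts.port if parts.port is not None else DEFAULT_LDAPS_PORT
    return parts.hostname, port


if __name__ == "__main__":
    print(parse_ldaps_url("ldaps://contoso.com:636"))
```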

When you're creating a new cluster, you can choose the managed identity that you created from the User-assigned managed identity drop-down list.

Figure: Azure HDInsight ESP Active Directory Domain Services managed identity

Next steps