Cluster creation fails with InvalidNetworkConfigurationErrorCode in Azure HDInsight

This article describes troubleshooting steps and possible resolutions for issues that arise when interacting with Azure HDInsight clusters.

If you see the error code InvalidNetworkConfigurationErrorCode with the description "Virtual Network configuration is not compatible with HDInsight Requirement", it usually indicates a problem with the virtual network configuration for your cluster. Based on the rest of the error description, follow the matching section below to resolve the problem.

"HostName Resolution failed"

Issue

The error description contains "HostName Resolution failed".

Cause

This error points to a problem with the custom DNS configuration. DNS servers within a virtual network can forward DNS queries to Azure's recursive resolvers to resolve hostnames within that virtual network (see Name Resolution in Virtual Networks for details). Access to Azure's recursive resolvers is provided via the virtual IP 168.63.129.16. This IP is accessible only from Azure VMs, so forwarding to it does not work if you are using an on-premises DNS server, or if your DNS server is an Azure VM that is not part of the cluster's vNet.
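As a rough first check of the condition described above, the following Python sketch tests whether a custom DNS server's IP address falls inside the cluster vNet's address space. The function name `dns_server_in_vnet` and the sample addresses are illustrative assumptions, not part of any Azure tooling. An address inside the vNet range is only a necessary sign that the server is a VM in that vNet, not proof of it.

```python
import ipaddress
from typing import Sequence

# Virtual IP of Azure's recursive resolvers, as stated in this article.
AZURE_RECURSIVE_RESOLVER = "168.63.129.16"


def dns_server_in_vnet(dns_server_ip: str, vnet_prefixes: Sequence[str]) -> bool:
    """Return True if the DNS server's IP lies inside any vNet address prefix.

    Hypothetical helper: the prefixes are whatever address space you have
    configured on the cluster's vNet (for example "10.0.0.0/16").
    """
    address = ipaddress.ip_address(dns_server_ip)
    return any(address in ipaddress.ip_network(prefix) for prefix in vnet_prefixes)
```

For example, `dns_server_in_vnet("10.0.0.4", ["10.0.0.0/16"])` is true, while an on-premises address such as `192.168.1.10` checked against the same prefix is false, which suggests that server cannot forward queries to 168.63.129.16.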

Resolution

  1. Use SSH to connect to a VM that is part of the cluster, and run the command hostname -f. This returns the host's fully qualified domain name (referred to as <host_fqdn> in the instructions below).

  2. Then run the command nslookup <host_fqdn> (for example, nslookup hn1-hditest.5h6lujo4xvoe1kprq3azvzmwsd.hx.internal.chinacloudapp.cn). If this command resolves the name to an IP address, your DNS server is working correctly. In that case, raise a support case with HDInsight, and we will investigate your issue. Include the troubleshooting steps you executed in the support case; this helps us resolve the issue faster.

  3. If the above command does not return an IP address, run nslookup <host_fqdn> 168.63.129.16 (for example, nslookup hn1-hditest.5h6lujo4xvoe1kprq3azvzmwsd.hx.internal.chinacloudapp.cn 168.63.129.16). If this command resolves the name to an IP address, then either your DNS server is not forwarding the query to Azure's DNS, or it is not a VM in the same vNet as the cluster.

  4. If you do not have an Azure VM in the cluster's vNet that can act as a custom DNS server, add one first. Create a VM in the vNet to be configured as a DNS forwarder.

  5. Once the VM is deployed in your vNet, configure the DNS forwarding rules on it. Forward all iDNS name resolution requests to 168.63.129.16, and forward the rest to your DNS server. Here is an example of this setup for a custom DNS server.

  6. Add the IP address of this VM as the first DNS entry in the virtual network's DNS configuration.
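The decision logic in steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration rather than an official diagnostic tool: the function names and the returned labels are assumptions made for this example, and it assumes an nslookup build that exits with a nonzero status when a query fails (as the BIND version does). The case where neither lookup succeeds is not covered by the steps above and is only flagged as such.

```python
import subprocess

# Virtual IP of Azure's recursive resolvers, as stated in this article.
AZURE_RESOLVER = "168.63.129.16"


def nslookup_ok(name, server=None, timeout=10):
    """Run nslookup, optionally against a specific server; True on exit status 0."""
    cmd = ["nslookup", name] + ([server] if server else [])
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def diagnose(default_ok, azure_ok):
    """Map the two lookup outcomes to a next action (labels are illustrative)."""
    if default_ok:
        # Step 2: the configured DNS server resolves the name.
        return "dns_ok_raise_support_case"
    if azure_ok:
        # Step 3: only Azure's resolver works, so the custom DNS server is not
        # forwarding to it or is not a VM in the cluster's vNet.
        return "fix_dns_forwarding"
    # Not covered by the steps in this article.
    return "not_covered_by_these_steps"


def run_checks():
    """Steps 1 to 3 on a cluster VM: get the FQDN, then try both lookups."""
    fqdn = subprocess.run(
        ["hostname", "-f"], capture_output=True, text=True, check=True
    ).stdout.strip()
    default_ok = nslookup_ok(fqdn)
    # The second lookup is only needed when the first one fails.
    azure_ok = False if default_ok else nslookup_ok(fqdn, AZURE_RESOLVER)
    return diagnose(default_ok, azure_ok)
```

Calling `run_checks()` from an SSH session on a cluster VM returns one of the three labels. A result of `fix_dns_forwarding` corresponds to continuing with steps 4 to 6.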


"Failed to connect to Azure Storage Account"

Issue

The error description contains "Failed to connect to Azure Storage Account" or "Failed to connect to Azure SQL".

Cause

Azure Storage and SQL do not have fixed IP addresses, so outbound connections to all IPs must be allowed in order to reach these services. The exact resolution steps depend on whether you have set up a network security group (NSG) or user-defined routes (UDR). Refer to the section on controlling network traffic with HDInsight with network security groups and user-defined routes for details on these configurations.

Resolution

  • If your cluster uses a network security group (NSG):

    Go to the Azure portal and identify the NSG associated with the subnet where the cluster is being deployed. In the Outbound security rules section, allow unrestricted outbound access to the internet (note that a smaller priority number means a higher priority). Also, in the Subnets section, confirm that this NSG is applied to the cluster subnet.

  • If your cluster uses user-defined routes (UDR):

    Go to the Azure portal and identify the route table associated with the subnet where the cluster is being deployed. Once you find the route table for the subnet, inspect its routes section.

    If routes are defined, make sure that there are routes for the IP addresses for the region where the cluster was deployed, and that the NextHopType for each route is Internet. There should be a route defined for each required IP address documented in the aforementioned article.
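The two checks above can be illustrated with a small Python sketch that works on rule and route data you have copied out of the portal. The dictionary keys, function names, and sample values are assumptions made for this example and do not mirror any Azure SDK type. The NSG model is deliberately simplified: it matches only on destination, whereas real NSG rules also match on protocol, port, and source.

```python
import ipaddress


def effective_outbound_access(rules):
    """Return the access ("Allow" or "Deny") of the first matching rule, or None.

    Rules are evaluated in ascending priority number, so a smaller number wins.
    Simplification: a rule matches when its destination is "Internet" or "*".
    """
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if rule["destination"] in ("Internet", "*"):
            return rule["access"]
    return None


def uncovered_addresses(routes, required_ips):
    """Return the required IPs that no route with next hop type Internet covers."""
    internet_prefixes = [
        ipaddress.ip_network(route["address_prefix"])
        for route in routes
        if route["next_hop_type"] == "Internet"
    ]
    return [
        ip
        for ip in required_ips
        if not any(ipaddress.ip_address(ip) in prefix for prefix in internet_prefixes)
    ]
```

With a Deny rule at priority 200 and an Allow rule at priority 100, `effective_outbound_access` returns "Allow" because the smaller number takes precedence. `uncovered_addresses` returns every required address that still needs a route; the required addresses themselves must come from the HDInsight documentation for your region and are not supplied here.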


Next steps

If you didn't see your problem here or are unable to solve your issue, use the following channel for more support:

  • If you need more help, you can submit a support request from the Azure portal. Select Support from the menu bar or open the Help + support hub.