Configure outbound network traffic for Azure HDInsight clusters using Firewall

This article provides the steps to secure outbound traffic from your HDInsight cluster using Azure Firewall. The steps below assume you're configuring an Azure Firewall for an existing cluster. If you're deploying a new cluster behind a firewall, create your HDInsight cluster and subnet first, then follow the steps in this guide.

Background

HDInsight clusters are normally deployed in a virtual network. The cluster has dependencies on services outside of that virtual network.

Inbound management traffic can't be sent through a firewall. You can use network security group (NSG) service tags for the inbound traffic instead.

The HDInsight outbound traffic dependencies are almost entirely defined with FQDNs, which don't have static IP addresses behind them. The lack of static addresses means NSGs can't lock down outbound traffic from a cluster. The IP addresses change often enough that you can't set up rules based on the current name resolution and rely on them.

Secure outbound addresses with a firewall that can control outbound traffic based on FQDNs. Azure Firewall restricts outbound traffic based on the FQDN of the destination or FQDN tags.
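
To make the difference between IP-based and FQDN-based filtering concrete, here is a minimal conceptual sketch in Python. It is not how Azure Firewall is implemented. The allow-list, the wildcard storage pattern, and the function name `is_allowed` are all assumptions made for illustration. The point is that the decision depends only on the destination host name, so it stays valid when the IP addresses behind that name change.

```python
from fnmatch import fnmatch

# Hypothetical allow-list, loosely modeled on the target FQDNs used later
# in this article. The wildcard storage pattern is an assumption.
ALLOWED_FQDNS = [
    "login.chinacloudapi.cn",
    "login.microsoftonline.cn",
    "*.blob.core.chinacloudapi.cn",
]


def is_allowed(destination_fqdn, allowed=ALLOWED_FQDNS):
    """Return True if the destination host name matches any allowed pattern."""
    # Normalize case and drop a trailing root dot before matching.
    host = destination_fqdn.lower().rstrip(".")
    return any(fnmatch(host, pattern) for pattern in allowed)


if __name__ == "__main__":
    for name in ("login.chinacloudapi.cn", "example.com"):
        print(name, "->", "allow" if is_allowed(name) else "deny")
```

A rule keyed on a resolved IP address would need updating every time the name resolves differently; a rule keyed on the name does not.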

Configuring Azure Firewall with HDInsight

The steps to lock down egress from your existing HDInsight cluster with Azure Firewall are, in summary:

  1. Create a subnet.
  2. Create a firewall.
  3. Add application rules to the firewall.
  4. Add network rules to the firewall.
  5. Create a route table.

Create new subnet

Create a subnet named AzureFirewallSubnet in the virtual network where your cluster exists.

Create a new firewall for your cluster

Create a firewall named Test-FW01 using the steps in the "Deploy the firewall" section of Tutorial: Deploy and configure Azure Firewall using the Azure portal.

Configure the firewall with application rules

Create an application rule collection that allows the cluster to send and receive important communications.

  1. Select the new firewall Test-FW01 from the Azure portal.

  2. Navigate to Settings > Rules > Application rule collection > + Add application rule collection.

  3. On the Add application rule collection screen, provide the following information:

    Top section

    | Property | Value |
    | --- | --- |
    | Name | FwAppRule |
    | Priority | 200 |
    | Action | Allow |

    FQDN tags section

    | Name | Source address | FQDN tag | Notes |
    | --- | --- | --- | --- |
    | Rule_1 | * | WindowsUpdate and HDInsight | Required for HDI services |

    Target FQDNs section

    | Name | Source addresses | Protocol:Port | Target FQDNs | Notes |
    | --- | --- | --- | --- | --- |
    | Rule_2 | * | https:443 | login.chinacloudapi.cn | Allows Windows login activity |
    | Rule_3 | * | https:443 | login.microsoftonline.cn | Allows Windows login activity |
    | Rule_4 | * | https:443,http:80 | storage_account_name.blob.core.chinacloudapi.cn | Replace storage_account_name with your actual storage account name. To use only HTTPS connections, make sure "Secure transfer required" is enabled on the storage account. If you're using a private endpoint to access storage accounts, this rule isn't needed and storage traffic isn't forwarded to the firewall. |

  4. Select Add.
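
The application rule collection described in the steps above can also be written down as plain data, which is handy for review or for feeding into your own tooling. The following Python sketch is only an illustration: the dictionary layout and the function name `build_app_rule_collection` are hypothetical and do not correspond to the Azure Resource Manager schema or to any Azure SDK call. The values are the ones from the portal steps.

```python
import json


def build_app_rule_collection(storage_account_name):
    """Return the application rule collection from this article as a dict.

    The key names are a hypothetical layout chosen for readability.
    """
    storage_fqdn = f"{storage_account_name}.blob.core.chinacloudapi.cn"
    return {
        "name": "FwAppRule",
        "priority": 200,
        "action": "Allow",
        "fqdn_tag_rules": [
            {
                "name": "Rule_1",
                "source_addresses": ["*"],
                "fqdn_tags": ["WindowsUpdate", "HDInsight"],
            },
        ],
        "target_fqdn_rules": [
            {
                "name": "Rule_2",
                "source_addresses": ["*"],
                "protocols": ["https:443"],
                "target_fqdns": ["login.chinacloudapi.cn"],
            },
            {
                "name": "Rule_3",
                "source_addresses": ["*"],
                "protocols": ["https:443"],
                "target_fqdns": ["login.microsoftonline.cn"],
            },
            {
                # Drop "http:80" if the storage account requires secure transfer.
                "name": "Rule_4",
                "source_addresses": ["*"],
                "protocols": ["https:443", "http:80"],
                "target_fqdns": [storage_fqdn],
            },
        ],
    }


if __name__ == "__main__":
    # "mystorageaccount" is a placeholder, not a real account.
    print(json.dumps(build_app_rule_collection("mystorageaccount"), indent=2))
```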

Configure the firewall with network rules

Create the network rules to correctly configure your HDInsight cluster.

  1. Continuing from the prior step, navigate to Network rule collection > + Add network rule collection.

  2. On the Add network rule collection screen, provide the following information:

    Top section

    | Property | Value |
    | --- | --- |
    | Name | FwNetRule |
    | Priority | 200 |
    | Action | Allow |

    Service Tags section

    | Name | Protocol | Source addresses | Service tags | Destination ports | Notes |
    | --- | --- | --- | --- | --- | --- |
    | Rule_5 | TCP | * | SQL | 1433 | If you're using the default SQL servers provided by HDInsight, configure a network rule in the Service Tags section for SQL so that you can log and audit SQL traffic. The exception is when you configured service endpoints for SQL Server on the HDInsight subnet, in which case SQL traffic bypasses the firewall. If you're using custom SQL servers for the Ambari, Oozie, Ranger, and Hive metastores, you only need to allow traffic to your own custom SQL servers. |
    | Rule_6 | TCP | * | Azure Monitor | * | (Optional) Customers who plan to use the autoscale feature should add this rule. |

  3. Select Add.

Create and configure a route table

Create a route table with the following entries:

  • All IP addresses from Health and management services with a next hop type of Internet. This should include 4 IPs for the generic regions as well as 2 IPs for your specific region. This rule is only needed if ResourceProviderConnection is set to Inbound. If ResourceProviderConnection is set to Outbound, these IPs aren't needed in the user-defined route (UDR).

  • One Virtual Appliance route for IP address 0.0.0.0/0 with the next hop being your Azure Firewall private IP address.

For example, to configure the route table for a cluster created in the "China East" region, use the following steps:

  1. Select your Azure firewall Test-FW01. Copy the Private IP address listed on the Overview page. This example uses a sample address of 10.0.2.4.

  2. Then navigate to All services > Networking > Route tables and select Create Route Table.

  3. From your new route table, navigate to Settings > Routes > + Add. Add the following routes:

| Route name | Address prefix | Next hop type | Next hop address |
| --- | --- | --- | --- |
| 168.61.49.99 | 168.61.49.99/32 | Internet | N/A |
| 23.99.5.239 | 23.99.5.239/32 | Internet | N/A |
| 168.61.48.131 | 168.61.48.131/32 | Internet | N/A |
| 138.91.141.162 | 138.91.141.162/32 | Internet | N/A |
| 13.82.225.233 | 13.82.225.233/32 | Internet | N/A |
| 40.71.175.99 | 40.71.175.99/32 | Internet | N/A |
| 0.0.0.0 | 0.0.0.0/0 | Virtual appliance | 10.0.2.4 |
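
The route list above follows a simple pattern: one /32 route with next hop Internet per management IP, plus a default route to the firewall. As a hedged sketch, the Python below generates that list and validates the addresses with the standard `ipaddress` module. The function name `build_routes` and the dictionary keys are hypothetical and are not an Azure API; the IP addresses are the sample values from this article.

```python
import ipaddress

# Management IPs from the example table in this article.
MANAGEMENT_IPS = [
    "168.61.49.99",
    "23.99.5.239",
    "168.61.48.131",
    "138.91.141.162",
    "13.82.225.233",
    "40.71.175.99",
]


def build_routes(management_ips, firewall_private_ip):
    """Build route entries: /32 Internet routes plus a default route to the firewall."""
    routes = []
    for ip in management_ips:
        addr = ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        routes.append({
            "name": str(addr),
            "address_prefix": f"{addr}/32",
            "next_hop_type": "Internet",
            "next_hop_address": None,
        })

    firewall = ipaddress.ip_address(firewall_private_ip)
    if not firewall.is_private:
        raise ValueError("the firewall next hop should be a private IP address")
    routes.append({
        "name": "0.0.0.0",
        "address_prefix": "0.0.0.0/0",
        "next_hop_type": "VirtualAppliance",
        "next_hop_address": str(firewall),
    })
    return routes


if __name__ == "__main__":
    for route in build_routes(MANAGEMENT_IPS, "10.0.2.4"):
        print(route)
```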

Complete the route table configuration:

  1. Assign the route table you created to your HDInsight subnet by selecting Subnets under Settings.

  2. Select + Associate.

  3. On the Associate subnet screen, select the virtual network that your cluster was created in and the subnet you used for your HDInsight cluster.

  4. Select OK.

Edge-node or custom application traffic

The above steps allow the cluster to operate without issues. You still need to configure dependencies to accommodate any custom applications running on the edge nodes, if applicable.

Application dependencies must be identified and added to the Azure Firewall or the route table.

Routes must be created for the application traffic to avoid asymmetric routing issues.

If your applications have other dependencies, they need to be added to your Azure Firewall. Create application rules to allow HTTP/HTTPS traffic and network rules for everything else.

Logging and scale

Azure Firewall can send logs to a few different storage systems. For instructions on configuring logging for your firewall, follow the steps in Tutorial: Monitor Azure Firewall logs and metrics.

Once you've completed the logging setup, if you're using Log Analytics, you can view blocked traffic with a query such as:

AzureDiagnostics | where msg_s contains "Deny" | where TimeGenerated >= ago(1h)

Integrating Azure Firewall with Azure Monitor logs is useful when first getting an application working, especially when you aren't aware of all of the application dependencies. You can learn more about Azure Monitor logs from Analyze log data in Azure Monitor.

To learn about the scale limits of Azure Firewall and how to request increases, see the Azure Firewall service limits documentation or refer to the FAQ.

Access to the cluster

After the firewall is set up successfully, you can use the internal endpoint (https://CLUSTERNAME-int.azurehdinsight.cn) to access Ambari from inside the virtual network.

To use the public endpoint (https://CLUSTERNAME.azurehdinsight.cn) or the SSH endpoint (CLUSTERNAME-ssh.azurehdinsight.cn), make sure you have the right routes in the route table and the right NSG rules to avoid the asymmetric routing issue. Specifically, you need to allow the client IP address in the inbound NSG rules and also add it to the user-defined route table with the next hop set to Internet. If the routing isn't set up correctly, you'll see a timeout error.
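
The reason a client-specific route fixes the asymmetric routing issue is that Azure picks routes by longest prefix match, so a /32 route for the client wins over the 0.0.0.0/0 route to the firewall. The Python sketch below illustrates that selection rule only. The function name `select_route`, the dictionary layout, and the client address 203.0.113.10 (taken from a range reserved for documentation) are assumptions for the example.

```python
import ipaddress

# Hypothetical route table: default route to the firewall plus one
# client-specific route. 203.0.113.10 is a placeholder client address.
ROUTES = [
    {"address_prefix": "0.0.0.0/0", "next_hop_type": "VirtualAppliance",
     "next_hop_address": "10.0.2.4"},
    {"address_prefix": "203.0.113.10/32", "next_hop_type": "Internet",
     "next_hop_address": None},
]


def select_route(destination_ip, routes):
    """Return the matching route with the longest prefix, or None if nothing matches."""
    destination = ipaddress.ip_address(destination_ip)
    matches = [
        route for route in routes
        if destination in ipaddress.ip_network(route["address_prefix"])
    ]
    if not matches:
        return None
    return max(
        matches,
        key=lambda route: ipaddress.ip_network(route["address_prefix"]).prefixlen,
    )


if __name__ == "__main__":
    for ip in ("203.0.113.10", "198.51.100.7"):
        print(ip, "->", select_route(ip, ROUTES)["next_hop_type"])
```

With the /32 route in place, replies to that client leave directly to the Internet, the same way the request arrived, instead of being sent through the firewall. Any other destination still falls through to the default route.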

Next steps