Create an Azure Batch pool in a virtual network

When you create an Azure Batch pool, you can provision the pool in a subnet of an Azure virtual network (VNet) that you specify. This article explains how to set up a Batch pool in a VNet.

Why use a VNet?

Compute nodes in a pool can communicate with each other, for example to run multi-instance tasks, without requiring a separate VNet. By default, however, nodes in a pool can't communicate with virtual machines outside the pool, such as license servers or file servers.

To allow compute nodes to communicate securely with other virtual machines, or with an on-premises network, you can provision the pool in a subnet of an Azure VNet.

Prerequisites

  • Authentication. To use an Azure VNet, the Batch client API must use Azure Active Directory (AD) authentication. Azure Batch support for Azure AD is documented in Authenticate Batch service solutions with Active Directory.

  • An Azure VNet. See the following section for VNet requirements and configuration. To prepare a VNet with one or more subnets in advance, you can use the Azure portal, Azure PowerShell, the Azure Command-Line Interface (CLI), or other methods.

    • To create an Azure Resource Manager-based VNet, see Create a virtual network. A Resource Manager-based VNet is recommended for new deployments, and is supported only on pools that use Virtual Machine Configuration.
    • To create a classic VNet, see Create a virtual network (classic) with multiple subnets. A classic VNet is supported only on pools that use Cloud Services Configuration.

VNet requirements

General requirements

  • The VNet must be in the same subscription and region as the Batch account you use to create your pool.

  • A pool that uses the VNet can have a maximum of 4096 nodes.

  • The subnet specified for the pool must have enough unassigned IP addresses to accommodate the number of VMs targeted for the pool; that is, the sum of the targetDedicatedNodes and targetLowPriorityNodes properties of the pool. If the subnet doesn't have enough unassigned IP addresses, the pool allocates only some of the compute nodes, and a resize error occurs.

  • Any custom DNS servers that serve your VNet must be able to resolve your Azure Storage endpoints. Specifically, URLs of the form <account>.table.core.chinacloudapi.cn, <account>.queue.core.chinacloudapi.cn, and <account>.blob.core.chinacloudapi.cn should be resolvable.
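The subnet sizing requirement above can be checked before you create or resize a pool. The following is a minimal sketch, not an official tool: the function name and the `addresses_in_use` parameter are hypothetical, and it assumes that Azure reserves five addresses in every subnet, which you should confirm against current Azure documentation.

```python
import ipaddress

# Assumption: Azure reserves 5 addresses in every subnet.
# Confirm this figure against current Azure networking documentation.
AZURE_RESERVED_PER_SUBNET = 5


def subnet_can_hold_pool(subnet_cidr, target_dedicated, target_low_priority,
                         addresses_in_use=0):
    """Return (fits, free_addresses) for a pool targeted at the subnet.

    addresses_in_use is the number of addresses already assigned to
    other resources in the subnet (hypothetical parameter).
    """
    subnet = ipaddress.ip_network(subnet_cidr, strict=True)
    free = subnet.num_addresses - AZURE_RESERVED_PER_SUBNET - addresses_in_use
    free = max(free, 0)
    # The pool needs one address per targeted node.
    needed = target_dedicated + target_low_priority
    return needed <= free, free


if __name__ == "__main__":
    fits, free = subnet_can_hold_pool("10.0.1.0/24", 200, 40)
    print(f"fits={fits}, free addresses={free}")
```

Under these assumptions, a /20 subnet is slightly too small for a pool at the 4096-node maximum, so a pool of that size would need a larger prefix.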

Additional VNet requirements differ, depending on whether the Batch pool is in the Virtual Machine configuration or the Cloud Services configuration. For new pool deployments into a VNet, the Virtual Machine configuration is recommended.

Pools in the Virtual Machine configuration

Supported VNets - Azure Resource Manager-based VNets only

Subnet ID - When specifying the subnet using the Batch APIs, use the resource identifier of the subnet. The subnet identifier is of the form:

/subscriptions/{subscription}/resourceGroups/{group}/providers/Microsoft.Network/virtualNetworks/{network}/subnets/{subnet}
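As an illustration of the identifier format, the sketch below assembles a subnet resource identifier from its parts. The helper name and its `classic` flag are hypothetical; the flag switches to the Microsoft.ClassicNetwork provider that this article lists for classic VNets.

```python
def subnet_resource_id(subscription, group, network, subnet, classic=False):
    """Build a subnet resource identifier in the form shown above.

    Hypothetical helper: it only formats the string and does not check
    that the resources exist.
    """
    parts = {"subscription": subscription, "group": group,
             "network": network, "subnet": subnet}
    for name, value in parts.items():
        # Reject empty segments and embedded slashes, which would
        # produce a malformed identifier.
        if not value or "/" in value:
            raise ValueError(f"invalid {name}: {value!r}")
    provider = "Microsoft.ClassicNetwork" if classic else "Microsoft.Network"
    return (f"/subscriptions/{subscription}/resourceGroups/{group}"
            f"/providers/{provider}/virtualNetworks/{network}"
            f"/subnets/{subnet}")


if __name__ == "__main__":
    # All four values are placeholders.
    print(subnet_resource_id("my-subscription-id", "my-group",
                             "my-vnet", "batch-subnet"))
```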

Permissions - Check whether your security policies or locks on the VNet's subscription or resource group restrict a user's permissions to manage the VNet.

Additional networking resources - Batch automatically allocates additional networking resources in the resource group containing the VNet.

Important

For each 100 dedicated or low-priority nodes, Batch allocates one network security group (NSG), one public IP address, and one load balancer. These resources are limited by the subscription's resource quotas. For large pools, you might need to request a quota increase for one or more of these resources.
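To judge whether a large pool is likely to hit a quota, the allocation rule above can be turned into simple arithmetic. This sketch assumes that dedicated and low-priority nodes are counted together and that a partial group of 100 still consumes a full set of resources; the article does not spell out either detail, and the function name is hypothetical.

```python
import math


def batch_network_resources(dedicated, low_priority, nodes_per_set=100):
    """Estimate the networking resources Batch allocates for a pool.

    Assumptions (not confirmed by the article): node counts are summed
    across both node types, and the result is rounded up.
    """
    total_nodes = dedicated + low_priority
    sets = math.ceil(total_nodes / nodes_per_set)
    return {
        "network_security_groups": sets,
        "public_ip_addresses": sets,
        "load_balancers": sets,
    }


if __name__ == "__main__":
    print(batch_network_resources(dedicated=250, low_priority=0))
```

Comparing these estimates with the subscription's current quotas shows which quota increases to request before scaling the pool.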

Network security groups: Batch default

The subnet must allow inbound communication from the Batch service so that it can schedule tasks on the compute nodes, and outbound communication to Azure Storage or other resources as needed by your workload. For pools in the Virtual Machine configuration, Batch adds NSGs at the level of the network interfaces (NICs) attached to compute nodes. These NSGs are configured with the following additional rules:

  • Inbound TCP traffic on ports 29876 and 29877 from Batch service IP addresses that correspond to the BatchNodeManagement service tag.
  • Inbound TCP traffic on port 22 (Linux nodes) or port 3389 (Windows nodes) to permit remote access. For certain types of multi-instance tasks on Linux (such as MPI), you also need to allow SSH port 22 traffic for IPs in the subnet containing the Batch compute nodes. This traffic may be blocked by subnet-level NSG rules (see below).
  • Outbound traffic on any port to the virtual network. This may be amended by subnet-level NSG rules (see below).
  • Outbound traffic on any port to the Internet. This may be amended by subnet-level NSG rules (see below).

Important

Use caution if you modify or add inbound or outbound rules in Batch-configured NSGs. If communication to the compute nodes in the specified subnet is denied by an NSG, the Batch service sets the state of the compute nodes to unusable. Additionally, don't apply resource locks to any resource created by Batch, because a lock can prevent the cleanup of resources that follows user-initiated actions such as deleting a pool.

Network security groups: Specifying subnet-level rules

You don't have to specify NSGs at the virtual network subnet level, because Batch configures its own NSGs (see above). If you have an NSG associated with the subnet where Batch compute nodes are deployed, or if you would like to apply custom NSG rules to override the defaults applied, you must configure this NSG with at least the inbound and outbound security rules shown in the following tables.

Configure inbound traffic on port 3389 (Windows) or 22 (Linux) only if you need to permit remote access to the compute nodes from outside sources. You may need to enable port 22 rules on Linux if you require support for multi-instance tasks with certain MPI runtimes. Allowing traffic on these ports is not strictly required for the pool compute nodes to be usable.

Inbound security rules

Source IP addresses | Source service tag | Source ports | Destination | Destination ports | Protocol | Action
N/A | BatchNodeManagement service tag (if using regional variant, in the same region as your Batch account) | * | Any | 29876-29877 | TCP | Allow
User source IPs for remotely accessing compute nodes and/or compute node subnet for Linux multi-instance tasks, if required | N/A | * | Any | 3389 (Windows), 22 (Linux) | TCP | Allow

Warning

Batch service IP addresses can change over time. Therefore, it is highly recommended to use the BatchNodeManagement service tag (or regional variant) for NSG rules. Avoid populating NSG rules with specific Batch service IP addresses.

Outbound security rules

Source | Source ports | Destination | Destination service tag | Destination ports | Protocol | Action
Any | * | Service tag | Storage (if using regional variant, in the same region as your Batch account) | 443 | TCP | Allow
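The minimum subnet-level rules from the two tables above can be written down as data and compared against an NSG's rule list. The sketch below is illustrative only: the rule names are hypothetical, the property names are loosely modeled on Azure Resource Manager security rule properties and should be verified, and the matching is a naive exact comparison that does not recognize regional service tag variants or broader port ranges.

```python
# Minimum rules taken from the inbound and outbound tables above.
# Rule names are hypothetical; property names are assumptions loosely
# modeled on Azure Resource Manager security rule properties.
REQUIRED_RULES = [
    {
        "name": "AllowBatchNodeManagementInbound",
        "direction": "Inbound",
        "access": "Allow",
        "protocol": "Tcp",
        "sourceAddressPrefix": "BatchNodeManagement",
        "sourcePortRange": "*",
        "destinationAddressPrefix": "*",
        "destinationPortRange": "29876-29877",
    },
    {
        "name": "AllowStorageOutbound",
        "direction": "Outbound",
        "access": "Allow",
        "protocol": "Tcp",
        "sourceAddressPrefix": "*",
        "sourcePortRange": "*",
        "destinationAddressPrefix": "Storage",
        "destinationPortRange": "443",
    },
]

# Properties compared when looking for an equivalent rule.
MATCH_KEYS = ("direction", "access", "protocol", "sourceAddressPrefix",
              "destinationAddressPrefix", "destinationPortRange")


def missing_required_rules(nsg_rules):
    """Return the names of required rules with no exact match in nsg_rules."""
    missing = []
    for required in REQUIRED_RULES:
        found = any(
            all(rule.get(key) == required[key] for key in MATCH_KEYS)
            for rule in nsg_rules
        )
        if not found:
            missing.append(required["name"])
    return missing


if __name__ == "__main__":
    # An NSG that only has the inbound rule is missing the outbound one.
    print(missing_required_rules([REQUIRED_RULES[0]]))
```

Expressing both rules with service tags rather than IP addresses keeps the check consistent with the warning above about Batch service IP addresses changing over time.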

Pools in the Cloud Services configuration

Supported VNets - Classic VNets only

Subnet ID - When specifying the subnet using the Batch APIs, use the resource identifier of the subnet. The subnet identifier is of the form:

/subscriptions/{subscription}/resourceGroups/{group}/providers/Microsoft.ClassicNetwork/virtualNetworks/{network}/subnets/{subnet}

Permissions - The Microsoft Azure Batch service principal must have the Classic Virtual Machine Contributor Azure role for the specified VNet.

Network security groups

The subnet must allow inbound communication from the Batch service so that it can schedule tasks on the compute nodes, and outbound communication to Azure Storage or other resources.

You do not need to specify an NSG, because Batch configures inbound communication only from Batch IP addresses to the pool nodes. However, if the specified subnet has associated NSGs and/or a firewall, configure the inbound and outbound security rules as shown in the following tables. If communication to the compute nodes in the specified subnet is denied by an NSG, the Batch service sets the state of the compute nodes to unusable.

Configure inbound traffic on port 3389 for Windows if you need to permit RDP access to the pool nodes. This is not required for the pool nodes to be usable.

Inbound security rules

Source IP addresses | Source ports | Destination | Destination ports | Protocol | Action
Any. Although this requires effectively "allow all", the Batch service applies an ACL rule at the level of each node that filters out all non-Batch service IP addresses. | * | Any | 10100, 20100, 30100 | TCP | Allow
Optional, to allow RDP access to compute nodes. | * | Any | 3389 | TCP | Allow

Outbound security rules

Source | Source ports | Destination | Destination ports | Protocol | Action
Any | * | Any | 443 | Any | Allow

Create a pool with a VNet in the Azure portal

Once you have created your VNet and assigned a subnet to it, you can create a Batch pool with that VNet. Follow these steps to create a pool from the Azure portal:

  1. Navigate to your Batch account in the Azure portal. This account must be in the same subscription and region as the resource group containing the VNet you intend to use.

  2. In the Settings window on the left, select the Pools menu item.

  3. In the Pools window, select Add.

  4. In the Add Pool window, select the option you intend to use from the Image Type dropdown.

  5. Select the correct Publisher/Offer/SKU for your custom image.

  6. Specify the remaining required settings, including the Node size and Target dedicated nodes, as well as any desired optional settings.

  7. In Virtual Network, select the virtual network and subnet you wish to use.

    (Figure: Add pool with virtual network)
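The portal steps above have a programmatic counterpart: when a pool is created through the Batch APIs, the subnet is passed as part of the pool definition. The sketch below only assembles such a definition as JSON and sends nothing. The property names follow the Batch REST API pool creation body as best understood here and should be checked against the API reference; the function name and every value in the example are placeholders.

```python
import json


def pool_request_body(pool_id, vm_size, subnet_id, image_reference,
                      node_agent_sku_id, target_dedicated=0,
                      target_low_priority=0):
    """Assemble a pool definition that places the pool in a VNet subnet.

    Property names are assumptions based on the Batch REST API; verify
    them against the API version you target.
    """
    return {
        "id": pool_id,
        "vmSize": vm_size,
        "virtualMachineConfiguration": {
            "imageReference": image_reference,
            "nodeAgentSKUId": node_agent_sku_id,
        },
        "targetDedicatedNodes": target_dedicated,
        "targetLowPriorityNodes": target_low_priority,
        # The subnet resource identifier ties the pool to the VNet.
        "networkConfiguration": {"subnetId": subnet_id},
    }


if __name__ == "__main__":
    # Every value below is a placeholder, not a recommendation.
    body = pool_request_body(
        pool_id="vnet-pool",
        vm_size="<node-size>",
        subnet_id=("/subscriptions/<subscription>/resourceGroups/<group>"
                   "/providers/Microsoft.Network/virtualNetworks/<network>"
                   "/subnets/<subnet>"),
        image_reference={"publisher": "<publisher>", "offer": "<offer>",
                         "sku": "<sku>"},
        node_agent_sku_id="<node-agent-sku>",
        target_dedicated=4,
    )
    print(json.dumps(body, indent=2))
```

Because a VNet requires Azure AD authentication, a request built from this body would need to be sent with an Azure AD token rather than a shared key.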

User-defined routes for forced tunneling

Your organization might require you to redirect (force) internet-bound traffic from the subnet back to your on-premises location for inspection and logging. Additionally, you may have enabled forced tunneling for the subnets in your VNet.

To ensure that the nodes in your pool work in a VNet that has forced tunneling enabled, you must add the following user-defined routes (UDRs) for that subnet:

  • The Batch service needs to communicate with nodes to schedule tasks. To enable this communication, add a UDR for each IP address used by the Batch service in the region where your Batch account exists. To obtain the list of IP addresses of the Batch service, see Service tags on-premises.

  • Ensure that outbound traffic to Azure Storage (specifically, URLs of the form <account>.table.core.chinacloudapi.cn, <account>.queue.core.chinacloudapi.cn, and <account>.blob.core.chinacloudapi.cn) is not blocked by your on-premises network.
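One way to confirm that the Azure Storage endpoints are reachable is to attempt a TCP connection on port 443 from a machine in the subnet. The sketch below is a rough probe under that assumption, with hypothetical function names: a successful connection shows that the name resolves and the port is open, but it does not prove that an inspecting proxy will pass real storage traffic.

```python
import socket

# Endpoint suffix taken from the URL forms listed above.
STORAGE_SUFFIX = "core.chinacloudapi.cn"


def storage_endpoints(account, suffix=STORAGE_SUFFIX):
    """Return the table, queue, and blob host names for a storage account."""
    return [f"{account}.{service}.{suffix}"
            for service in ("table", "queue", "blob")]


def blocked_endpoints(hostnames, port=443, timeout=5.0,
                      connect=socket.create_connection):
    """Return the host names that could not be reached on the given port.

    The connect parameter exists so the probe can be replaced in tests.
    """
    blocked = []
    for host in hostnames:
        try:
            connection = connect((host, port), timeout=timeout)
        except OSError:
            # Covers DNS failures, refusals, and timeouts alike.
            blocked.append(host)
        else:
            connection.close()
    return blocked


if __name__ == "__main__":
    # "mystorageaccount" is a placeholder account name.
    print(blocked_endpoints(storage_endpoints("mystorageaccount")))
```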

When you add a UDR, define the route for each related Batch IP address prefix, and set Next hop type to Internet.

(Figure: User-defined route)

Warning

Batch service IP addresses can change over time. To prevent outages due to an IP address change, create a process to refresh Batch service IP addresses automatically and keep them up to date in your route table.
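The refresh process described above amounts to comparing the prefixes currently in the route table with the latest published Batch prefixes. The sketch below shows only that comparison and the shape of the routes to add. The function names and the route naming scheme are hypothetical, the `addressPrefix` and `nextHopType` property names are assumed to match Azure route properties, and the prefixes in the example come from documentation-only address ranges rather than real Batch addresses.

```python
import ipaddress


def plan_route_refresh(current_prefixes, latest_prefixes):
    """Return which prefixes to add to and remove from the route table."""
    # Normalize through ip_network so malformed prefixes fail early.
    current = {str(ipaddress.ip_network(p)) for p in current_prefixes}
    latest = {str(ipaddress.ip_network(p)) for p in latest_prefixes}
    return {"add": sorted(latest - current),
            "remove": sorted(current - latest)}


def build_routes(prefixes):
    """Build one route per prefix with the next hop type set to Internet."""
    routes = []
    for prefix in prefixes:
        # Hypothetical naming scheme derived from the prefix itself.
        name = "batch-" + prefix.replace(".", "-").replace("/", "_")
        routes.append({"name": name,
                       "addressPrefix": prefix,
                       "nextHopType": "Internet"})
    return routes


if __name__ == "__main__":
    # Placeholder prefixes from documentation-only address ranges.
    plan = plan_route_refresh(
        current_prefixes=["192.0.2.0/24", "198.51.100.0/24"],
        latest_prefixes=["198.51.100.0/24", "203.0.113.0/24"],
    )
    print(plan)
    print(build_routes(plan["add"]))
```

Running a comparison like this on a schedule, and applying the additions before the removals, avoids a window in which a Batch prefix that is still in use has no route.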

Next steps