Run tasks concurrently to maximize usage of Batch compute nodes

By running more than one task simultaneously on each compute node in your Azure Batch pool, you can maximize resource usage on a smaller number of nodes. For some workloads, this results in shorter job times and lower cost.

While some scenarios benefit from dedicating all of a node's resources to a single task, several situations benefit from allowing multiple tasks to share those resources:

  • Minimizing data transfer when tasks are able to share data. In this scenario, you can dramatically reduce data transfer charges by copying shared data to a smaller number of nodes and executing tasks in parallel on each node. This especially applies if the data to be copied to each node must be transferred between geographic regions.
  • Maximizing memory usage when tasks require a large amount of memory, but only for short periods and at variable times during execution. You can employ fewer, but larger, compute nodes with more memory to handle such spikes efficiently. Each of these nodes runs multiple tasks in parallel, and each task draws on the node's plentiful memory at a different time.
  • Mitigating node number limits when inter-node communication is required within a pool. Currently, pools configured for inter-node communication are limited to 50 compute nodes. If each node in such a pool can execute tasks in parallel, a greater number of tasks can run simultaneously.
  • Replicating an on-premises compute cluster, such as when you first move a compute environment to Azure. If your current on-premises solution executes multiple tasks per compute node, you can increase the maximum number of tasks per node to mirror that configuration more closely.

Example scenario

To illustrate the benefits of parallel task execution, suppose your task application has CPU and memory requirements that Standard_D1 nodes can satisfy, but finishing the job in the required time takes 1,000 of these nodes.

Instead of using Standard_D1 nodes, which have 1 CPU core, you could use Standard_D14 nodes, which have 16 cores each, and enable parallel task execution. You would then need one-sixteenth as many nodes: 63 instead of 1,000 (1,000 / 16 = 62.5, rounded up). Additionally, if each node requires large application files or reference data, job duration and efficiency improve further because the data is copied to only 63 nodes.
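The node-count arithmetic in this scenario can be sketched as follows. This is only an illustration in Python; the helper name `nodes_needed` is hypothetical and is not part of any Batch API, and the sketch assumes one task per core.

```python
def nodes_needed(parallel_tasks: int, tasks_per_node: int) -> int:
    """Smallest number of nodes that can run `parallel_tasks` tasks at once."""
    if parallel_tasks < 0 or tasks_per_node < 1:
        raise ValueError("parallel_tasks must be >= 0 and tasks_per_node >= 1")
    # Integer ceiling division: a partially filled node still counts as a node.
    return -(-parallel_tasks // tasks_per_node)


# Standard_D1: 1 core, so one task per node.
single_core_nodes = nodes_needed(1000, 1)
# Standard_D14: 16 cores, so (by assumption) 16 tasks per node.
sixteen_core_nodes = nodes_needed(1000, 16)

print(single_core_nodes, sixteen_core_nodes)
```

The rounding matters: 1,000 tasks do not divide evenly by 16, so the last node runs only 8 tasks.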

Enable parallel task execution

You configure compute nodes for parallel task execution at the pool level. With the Batch .NET library, set the CloudPool.MaxTasksPerComputeNode property when you create a pool. If you are using the Batch REST API, set the maxTasksPerNode element in the request body during pool creation.

Azure Batch allows you to set the number of tasks per node to up to four times (4x) the number of cores on the node. For example, if the pool is configured with nodes of size "Large" (four cores), then maxTasksPerNode may be set to as high as 16. However, regardless of how many cores the node has, you can't have more than 256 tasks per node. For details on the number of cores for each node size, see Sizes for Cloud Services. For more information on service limits, see Quotas and limits for the Azure Batch service.
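The two limits stated above combine into a single upper bound. The following Python sketch only restates that rule; the function name and constants are hypothetical and do not come from the Batch SDK.

```python
TASKS_PER_CORE_LIMIT = 4   # from the text: up to four times the core count
ABSOLUTE_LIMIT = 256       # from the text: hard cap per node


def max_tasks_per_node_limit(cores: int) -> int:
    """Largest maxTasksPerNode value the stated rules allow for a node size."""
    if cores < 1:
        raise ValueError("a node has at least one core")
    return min(TASKS_PER_CORE_LIMIT * cores, ABSOLUTE_LIMIT)


for cores in (1, 4, 16, 128):
    print(cores, "cores ->", max_tasks_per_node_limit(cores), "tasks at most")
```

Under these rules the 256-task cap only starts to bind at 64 cores; below that, the four-times-cores rule is the tighter limit.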

Tip

Be sure to take the maxTasksPerNode value into account when you construct an autoscale formula for your pool. For example, a formula that evaluates $RunningTasks could be dramatically affected by an increase in tasks per node. For more information, see Automatically scale compute nodes in an Azure Batch pool.

Distribution of tasks

When the compute nodes in a pool can execute tasks concurrently, it's important to specify how you want the tasks to be distributed across the nodes in the pool.

By using the CloudPool.TaskSchedulingPolicy property, you can specify that tasks should be assigned evenly across all nodes in the pool ("spreading"). Alternatively, you can specify that as many tasks as possible should be assigned to each node before tasks are assigned to another node in the pool ("packing").

As an example of how this feature is valuable, consider the pool of Standard_D14 nodes from the example above, configured with a CloudPool.MaxTasksPerComputeNode value of 16. If CloudPool.TaskSchedulingPolicy is configured with a ComputeNodeFillType of Pack, all 16 cores of a node are put to use before the next node receives any tasks, and an autoscaling pool can prune unused nodes (nodes without any tasks assigned). This minimizes resource usage and saves money.
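The difference between the two fill types can be sketched with a toy model. The Python below is a simplified illustration, not the Batch scheduler's actual algorithm; the function `assign_tasks` and its parameters are hypothetical.

```python
def assign_tasks(task_count: int, node_count: int,
                 slots_per_node: int, fill_type: str) -> list[int]:
    """Return per-node task counts under a simplified 'pack' or 'spread' model."""
    if fill_type not in ("pack", "spread"):
        raise ValueError("fill_type must be 'pack' or 'spread'")
    # Tasks beyond the pool's total capacity would wait in the queue.
    runnable = min(task_count, node_count * slots_per_node)
    if fill_type == "pack":
        counts = []
        remaining = runnable
        for _ in range(node_count):
            taken = min(slots_per_node, remaining)  # fill this node first
            counts.append(taken)
            remaining -= taken
        return counts
    # Spread: divide the tasks as evenly as possible across all nodes.
    base, extra = divmod(runnable, node_count)
    return [base + (1 if i < extra else 0) for i in range(node_count)]


# 32 tasks on four nodes with 16 task slots each.
packed = assign_tasks(32, 4, 16, "pack")
spread = assign_tasks(32, 4, 16, "spread")
print("pack:  ", packed, "idle nodes:", packed.count(0))
print("spread:", spread, "idle nodes:", spread.count(0))
```

In this model, packing leaves two of the four nodes with no tasks, which is what gives an autoscaling pool something to remove; spreading keeps every node busy at half capacity.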

Batch .NET example

This Batch .NET API code snippet shows a request to create a pool that contains four nodes with a maximum of four tasks per node. It specifies a task scheduling policy that fills each node with tasks before assigning tasks to another node in the pool. For more information on adding pools by using the Batch .NET API, see BatchClient.PoolOperations.CreatePool.

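// Assumes batchClient is an authenticated BatchClient created elsewhere.
CloudPool pool =
    batchClient.PoolOperations.CreatePool(
        poolId: "mypool",
        targetDedicatedComputeNodes: 4,
        virtualMachineSize: "standard_d1_v2",
        cloudServiceConfiguration: new CloudServiceConfiguration(osFamily: "5"));

// Allow up to four tasks to run concurrently on each node.
pool.MaxTasksPerComputeNode = 4;
// Fill each node with tasks before assigning tasks to the next node.
pool.TaskSchedulingPolicy = new TaskSchedulingPolicy(ComputeNodeFillType.Pack);
// Submit the pool definition to the Batch service.
pool.Commit();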

Batch REST example

This Batch REST API snippet shows a request to create a pool that contains two large nodes with a maximum of four tasks per node. For more information on adding pools by using the REST API, see Add a pool to an account.

{
  "odata.metadata":"https://myaccount.myregion.batch.chinacloudapi.cn/$metadata#pools/@Element",
  "id":"mypool",
  "vmSize":"large",
  "cloudServiceConfiguration": {
    "osFamily":"4",
    "targetOSVersion":"*"
  },
  "targetDedicatedComputeNodes":2,
  "maxTasksPerNode":4,
  "enableInterNodeCommunication":true
}

Note

You can set the maxTasksPerNode element and the MaxTasksPerComputeNode property only at pool creation time. They cannot be modified after the pool has been created.

Code sample

The ParallelNodeTasks project on GitHub illustrates the use of the CloudPool.MaxTasksPerComputeNode property.

This C# console application uses the Batch .NET library to create a pool with one or more compute nodes. It executes a configurable number of tasks on those nodes to simulate variable load. Output from the application shows which node executed each task, followed by a summary of the job parameters and duration. The summary portion of the output from two different runs of the sample application appears below.

Nodes: 1
Node size: large
Max tasks per node: 1
Tasks: 32
Duration: 00:30:01.4638023

The first run of the sample application shows that with a single node in the pool and the default setting of one task per node, the job duration is just over 30 minutes.

Nodes: 1
Node size: large
Max tasks per node: 4
Tasks: 32
Duration: 00:08:48.2423500

The second run of the sample shows a significant decrease in job duration. Because the pool was configured with four tasks per node, tasks execute in parallel and the job completes in under 9 minutes, a little more than a quarter of the original time.
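The speedup can be checked directly from the two Duration values in the summaries. The following Python sketch does only that arithmetic; the helper `parse_duration` is hypothetical, and the sketch assumes the durations are printed as hh:mm:ss with fractional seconds.

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse an 'hh:mm:ss.fffffff' duration string into a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes),
                     seconds=float(seconds))


one_task_per_node = parse_duration("00:30:01.4638023")
four_tasks_per_node = parse_duration("00:08:48.2423500")

# Dividing one timedelta by another yields a plain float ratio.
speedup = one_task_per_node / four_tasks_per_node
fraction = four_tasks_per_node / one_task_per_node

print(f"speedup: {speedup:.2f}x")
print(f"fraction of original duration: {fraction:.0%}")
```

The measured speedup comes out somewhat below the ideal factor of four, which is expected, since the parallel tasks share the resources of a single node.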

Note

The job durations in the summaries above do not include pool creation time. Each job was submitted to a previously created pool whose compute nodes were in the Idle state at submission time.

Next steps

Batch Explorer Heat Map

Batch Explorer is a free, feature-rich, standalone client tool that helps you create, debug, and monitor Azure Batch applications. It includes a Heat Map feature that visualizes task execution. When you run the ParallelTasks sample application, you can use the Heat Map to see the parallel tasks executing on each node.