Use Visual Studio project templates to jump-start Batch solutions

The Job Manager and Task Processor Visual Studio templates for Batch provide code that helps you implement and run compute-intensive workloads on Batch with minimal effort. This document describes these templates and provides guidance on how to use them.

Important

This article discusses only information applicable to these two templates, and assumes that you are familiar with the Batch service and its key concepts: pools, compute nodes, jobs and tasks, job manager tasks, environment variables, and other relevant information. You can find more information in Basics of Azure Batch and Batch feature overview for developers.

High-level overview

The Job Manager and Task Processor templates can be used to create two useful components:

  • A job manager task that implements a job splitter, which breaks a job down into multiple tasks that can run independently, in parallel.
  • A task processor that performs pre-processing and post-processing around an application command line.

For example, in a movie rendering scenario, the job splitter would turn a single movie job into hundreds or thousands of separate tasks, each processing an individual frame. Correspondingly, the task processor would invoke the rendering application and all dependent processes needed to render each frame, and perform any additional actions (for example, copying the rendered frame to a storage location).

Note

The Job Manager and Task Processor templates are independent of each other, so you can use both, or only one of them, depending on the requirements of your compute job and on your preferences.

As shown in the diagram below, a compute job that uses these templates goes through three stages:

  1. The client code (for example, an application or web service) submits a job to the Batch service on Azure, specifying the job manager program as its job manager task.
  2. The Batch service runs the job manager task on a compute node, and the job splitter launches the specified number of task processor tasks, on as many compute nodes as required, based on the parameters and specifications in the job splitter code.
  3. The task processor tasks run independently, in parallel, to process the input data and generate the output data.

[Figure: Diagram showing how client code interacts with the Batch service]

Prerequisites

To use the Batch templates, you need the following:

  • A computer with Visual Studio 2015 installed. Batch templates are currently supported only for Visual Studio 2015.

  • The Batch templates, which are available from the Visual Studio Gallery as Visual Studio extensions. There are two ways to get the templates:

    • Install the templates using the Extensions and Updates dialog box in Visual Studio (for more information, see Finding and Using Visual Studio Extensions). In the Extensions and Updates dialog box, search for and download the following two extensions:

      • Azure Batch Job Manager with Job Splitter
      • Azure Batch Task Processor
    • Download the templates from the online gallery for Visual Studio: Microsoft Azure Batch Project Templates

  • If you plan to use the Application Packages feature to deploy the job manager and task processor to the Batch compute nodes, you need to link a storage account to your Batch account.

Preparation

We recommend creating a single solution that contains both your job manager and your task processor, because this makes it easier to share code between the two programs. To create this solution, follow these steps:

  1. Open Visual Studio and select File > New > Project.
  2. Under Templates, expand Other Project Types, click Visual Studio Solutions, and then select Blank Solution.
  3. Type a name that describes your application and the purpose of this solution (for example, "LitwareBatchTaskPrograms").
  4. To create the new solution, click OK.

Job Manager template

The Job Manager template helps you implement a job manager task that can perform the following actions:

  • Split a job into multiple tasks.
  • Submit those tasks to run on Batch.

Note

For more information about job manager tasks, see Batch feature overview for developers.

Create a Job Manager using the template

To add a job manager to the solution that you created earlier, follow these steps:

  1. Open your existing solution in Visual Studio.
  2. In Solution Explorer, right-click the solution, and then click Add > New Project.
  3. Under Visual C#, click Cloud, and then click Azure Batch Job Manager with Job Splitter.
  4. Type a name that describes your application and identifies this project as the job manager (for example, "LitwareJobManager").
  5. To create the project, click OK.
  6. Finally, build the project to force Visual Studio to load all referenced NuGet packages and to verify that the project is valid before you start modifying it.

Job Manager template files and their purpose

When you create a project using the Job Manager template, it generates three groups of code files:

  • The main program file (Program.cs). This contains the program entry point and top-level exception handling. You shouldn't normally need to modify this file.
  • The Framework directory. This contains the files responsible for the 'boilerplate' work done by the job manager program, such as unpacking parameters and adding tasks to the Batch job. You shouldn't normally need to modify these files.
  • The job splitter file (JobSplitter.cs). This is where you put your application-specific logic for splitting a job into tasks.

You can add further files as required to support your job splitter code, depending on the complexity of the job splitting logic.

The template also generates standard .NET project files such as a .csproj file, app.config, and packages.config.

The rest of this section describes the different files and their code structure, and explains what each class does.

[Figure: Visual Studio Solution Explorer showing the Job Manager template solution]

Framework files

  • Configuration.cs: Encapsulates the loading of job configuration data such as Batch account details, linked storage account credentials, job and task information, and job parameters. It also provides access to Batch-defined environment variables (see Environment settings for tasks, in the Batch documentation) via the Configuration.EnvironmentVariable class.
  • IConfiguration.cs: Abstracts the implementation of the Configuration class, so that you can unit test your job splitter using a fake or mock configuration object.
  • JobManager.cs: Orchestrates the components of the job manager program. It is responsible for initializing the job splitter, invoking the job splitter, and dispatching the tasks returned by the job splitter to the task submitter.
  • JobManagerException.cs: Represents an error that requires the job manager to terminate. JobManagerException is used to wrap 'expected' errors where specific diagnostic information can be provided as part of termination.
  • TaskSubmitter.cs: This class is responsible for adding tasks returned by the job splitter to the Batch job. The JobManager class aggregates the sequence of tasks into batches for efficient but timely addition to the job, then calls TaskSubmitter.SubmitTasks on a background thread for each batch.
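
The batching behavior described for JobManager and TaskSubmitter can be pictured with a minimal, self-contained sketch. This is not the template's actual implementation: the BatchingSketch class, the InBatches helper, and the batch size are assumptions made for illustration, and any element type can stand in for CloudTask.

```csharp
using System;
using System.Collections.Generic;

public static class BatchingSketch
{
    // Hypothetical helper (not part of the template): groups a lazily produced
    // sequence into lists of at most batchSize items, yielding each list as
    // soon as it is full so that submission can start before splitting ends.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(batchSize));
        }

        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }

        // Emit the final, partially filled batch, if any.
        if (batch.Count > 0)
        {
            yield return batch;
        }
    }
}
```

In a sketch like this, each yielded list would be handed to whatever submits tasks to the job, which keeps the number of service calls low without waiting for the whole sequence.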

Job Splitter

JobSplitter.cs: This class contains application-specific logic for splitting the job into tasks. The framework invokes the JobSplitter.Split method to obtain a sequence of tasks, which it adds to the job as the method returns them. This is the class where you inject the logic of your job. Implement the Split method to return a sequence of CloudTask instances representing the tasks into which you want to partition the job.

Standard .NET command line project files

  • App.config: Standard .NET application configuration file.
  • Packages.config: Standard NuGet package dependency file.
  • Program.cs: Contains the program entry point and top-level exception handling.

Implementing the job splitter

When you open the Job Manager template project, the JobSplitter.cs file is open by default. You can implement the split logic for the tasks in your workload by using the Split() method shown below:

/// <summary>
/// Gets the tasks into which to split the job. This is where you inject
/// your application-specific logic for decomposing the job into tasks.
///
/// The job manager framework invokes the Split method for you; you need
/// only to implement it, not to call it yourself. Typically, your
/// implementation should return tasks lazily, for example using a C#
/// iterator and the "yield return" statement; this allows tasks to be added
/// and to start running while splitting is still in progress.
/// </summary>
/// <returns>The tasks to be added to the job. Tasks are added automatically
/// by the job manager framework as they are returned by this method.</returns>
public IEnumerable<CloudTask> Split()
{
    // Your code for the split logic goes here.
    int startFrame = Convert.ToInt32(_parameters["StartFrame"]);
    int endFrame = Convert.ToInt32(_parameters["EndFrame"]);

    for (int i = startFrame; i <= endFrame; i++)
    {
        yield return new CloudTask("myTask" + i, "cmd /c dir");
    }
}

Note

The annotated section in the Split() method is the only section of the Job Manager template code that is intended for you to modify, by adding the logic to split your jobs into different tasks. If you want to modify a different section of the template, make sure you are familiar with how Batch works, and try out a few of the Batch code samples first.

Your Split() implementation has access to:

  • The job parameters, via the _parameters field.
  • The CloudJob object representing the job, via the _job field.
  • The CloudTask object representing the job manager task, via the _jobManagerTask field.

Your Split() implementation does not need to add tasks to the job directly. Instead, your code should return a sequence of CloudTask objects, and these are added to the job automatically by the framework classes that invoke the job splitter. It's common to use C#'s iterator (yield return) feature to implement job splitters, because this allows tasks to start running as soon as possible rather than waiting for all tasks to be calculated.
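
To make the benefit of yield return concrete, here is a small self-contained sketch that records the order in which items are produced and consumed. It uses plain strings in place of CloudTask, and the LazySplitSketch class with its Split, Consume, and Log members is hypothetical and exists only for this illustration.

```csharp
using System.Collections.Generic;

public static class LazySplitSketch
{
    // Records the order of events so the interleaving is visible.
    public static readonly List<string> Log = new List<string>();

    // Stand-in for a job splitter: yields one task ID per frame.
    public static IEnumerable<string> Split(int startFrame, int endFrame)
    {
        for (int i = startFrame; i <= endFrame; i++)
        {
            Log.Add("produced myTask" + i);
            yield return "myTask" + i;
        }
    }

    // Stand-in for the calling framework: handles each task as it arrives.
    public static void Consume(IEnumerable<string> tasks)
    {
        foreach (var id in tasks)
        {
            Log.Add("submitted " + id);
        }
    }
}
```

Calling LazySplitSketch.Consume(LazySplitSketch.Split(1, 2)) leaves Log containing "produced myTask1", "submitted myTask1", "produced myTask2", "submitted myTask2", in that order: each item reaches the consumer before the next one is calculated.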

Job splitter failure

If your job splitter encounters an error, it should either:

  • Terminate the sequence using the C# yield break statement, in which case the job manager is treated as successful; or
  • Throw an exception, in which case the job manager is treated as failed and may be retried, depending on how the client has configured it.

In both cases, any tasks already returned by the job splitter and added to the Batch job are eligible to run. If you don't want this to happen, you could:

  • Terminate the job before returning from the job splitter.
  • Formulate the entire task collection before returning it (that is, return an ICollection<CloudTask> or IList<CloudTask> instead of implementing your job splitter using a C# iterator).
  • Use task dependencies to make all tasks depend on the successful completion of the job manager.
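
The difference between an iterator and a fully built collection when a failure occurs partway through can be sketched as follows. Strings stand in for CloudTask, and the SplitFailureSketch class, its methods, and the "negative frame numbers are invalid" rule are all assumptions introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SplitFailureSketch
{
    // Hypothetical validation rule: frame numbers must not be negative.
    private static string MakeTaskId(int frame)
    {
        if (frame < 0)
        {
            throw new InvalidOperationException("Invalid frame " + frame);
        }
        return "myTask" + frame;
    }

    // Iterator version: IDs yielded before the failure have already
    // reached the caller by the time the exception is thrown.
    public static IEnumerable<string> SplitLazily(IEnumerable<int> frames)
    {
        foreach (var frame in frames)
        {
            yield return MakeTaskId(frame);
        }
    }

    // Eager version: the list is returned only if every frame was valid,
    // so a failure means the caller receives nothing at all.
    public static IList<string> SplitEagerly(IEnumerable<int> frames)
    {
        var ids = new List<string>();
        foreach (var frame in frames)
        {
            ids.Add(MakeTaskId(frame));
        }
        return ids;
    }
}
```

The eager form trades the early start of tasks for all-or-nothing behavior, which is the trade-off the second option in the list above describes.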

Job manager retries

If the job manager fails, the Batch service may retry it, depending on the client retry settings. In general, this is safe, because when the framework adds tasks to the job, it ignores any tasks that already exist. However, if calculating tasks is expensive, you may not wish to incur the cost of recalculating tasks that have already been added to the job. Conversely, if the re-run is not guaranteed to generate the same task IDs, the 'ignore duplicates' behavior will not take effect. In these cases, design your job splitter to detect the work that has already been done and not repeat it, for example by performing a CloudJob.ListTasks before starting to yield tasks.
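
A retry-aware splitter of the kind described above might be shaped like the following sketch. It works on plain task ID strings so that it is self-contained; in a real job splitter the existing IDs would come from listing the job's tasks. The RetrySketch class and the SplitSkippingExisting method are hypothetical names, not part of the template.

```csharp
using System.Collections.Generic;

public static class RetrySketch
{
    // Yields only the task IDs that are not already present in the job.
    // existingTaskIds is assumed to have been collected before splitting starts.
    public static IEnumerable<string> SplitSkippingExisting(
        int startFrame, int endFrame, IEnumerable<string> existingTaskIds)
    {
        var existing = new HashSet<string>(existingTaskIds);

        for (int i = startFrame; i <= endFrame; i++)
        {
            string id = "myTask" + i;
            if (existing.Contains(id))
            {
                // Already added by an earlier attempt; skip the
                // (potentially expensive) work of building this task.
                continue;
            }
            yield return id;
        }
    }
}
```

This only helps when task IDs are deterministic across runs; if they are not, the splitter needs some other record of completed work.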

Exit codes and exceptions in the Job Manager template

Exit codes and exceptions provide a mechanism to determine the outcome of running a program, and they can help identify any problems with its execution. The Job Manager template implements the exit codes and exceptions described in this section.

A job manager task that is implemented with the Job Manager template can return three possible exit codes:

Code  Description
0     The job manager completed successfully. Your job splitter code ran to completion, and all tasks were added to the job.
1     The job manager task failed with an exception in an 'expected' part of the program. The exception was translated to a JobManagerException with diagnostic information and, where possible, suggestions for resolving the failure.
2     The job manager task failed with an 'unexpected' exception. The exception was logged to standard output, but the job manager was unable to add any additional diagnostic or remediation information.

In the case of job manager task failure, some tasks may still have been added to the service before the error occurred. These tasks will run as normal. See "Job splitter failure" above for discussion of this code path.

All the information returned by exceptions is written into the stdout.txt and stderr.txt files. For more information, see Error Handling.

Client considerations

This section describes some client implementation requirements when invoking a job manager based on this template. See How to pass parameters and environment variables from the client code for details on passing parameters and environment settings.

Mandatory credentials

To add tasks to the Azure Batch job, the job manager task requires your Azure Batch account URL and key. You must pass these in environment variables named YOUR_BATCH_URL and YOUR_BATCH_KEY. You can set these in the job manager task environment settings. For example, in a C# client:

job.JobManagerTask.EnvironmentSettings = new [] {
    new EnvironmentSetting("YOUR_BATCH_URL", "https://account.region.batch.chinacloudapi.cn"),
    new EnvironmentSetting("YOUR_BATCH_KEY", "{your_base64_encoded_account_key}"),
};

Storage credentials

Typically, the client does not need to provide the linked storage account credentials to the job manager task because (a) most job managers do not need to explicitly access the linked storage account and (b) the linked storage account is often provided to all tasks as a common environment setting for the job. If you are not providing the linked storage account via the common environment settings, and the job manager requires access to linked storage, supply the linked storage credentials as follows:

job.JobManagerTask.EnvironmentSettings = new [] {
    /* other environment settings */
    new EnvironmentSetting("LINKED_STORAGE_ACCOUNT", "{storageAccountName}"),
    new EnvironmentSetting("LINKED_STORAGE_KEY", "{storageAccountKey}"),
};

Job manager task settings

The client should set the job manager killJobOnCompletion flag to false.

It is usually safe for the client to set runExclusive to false.

The client should use the resourceFiles or applicationPackageReferences collection to have the job manager executable (and its required DLLs) deployed to the compute node.

By default, the job manager is not retried if it fails. Depending on your job manager logic, the client may want to enable retries via constraints/maxTaskRetryCount.

Job settings

If the job splitter emits tasks with dependencies, the client must set the job's usesTaskDependencies to true.

In the job splitter model, it is unusual for clients to add tasks to a job beyond those that the job splitter creates. The client should therefore normally set the job's onAllTasksComplete to terminatejob.

Task Processor template

The Task Processor template helps you implement a task processor that can perform the following actions:

  • Set up the information required by each Batch task to run.
  • Run all actions required by each Batch task.
  • Save task outputs to persistent storage.

Although a task processor is not required to run tasks on Batch, its key advantage is that it provides a wrapper for implementing all task execution actions in one location. This is useful, for example, if you need to run several applications in the context of each task, or if you need to copy data to persistent storage after completing each task.

The actions performed by the task processor can be as simple or complex, and as many or as few, as your workload requires. Additionally, by implementing all task actions in one task processor, you can readily update or add actions as applications or workload requirements change. However, in some cases a task processor might not be the optimal solution because it can add unnecessary complexity, for example when running jobs that can be quickly started from a simple command line.

Create a Task Processor using the template

To add a task processor to the solution that you created earlier, follow these steps:

  1. Open your existing solution in Visual Studio.
  2. In Solution Explorer, right-click the solution, click Add, and then click New Project.
  3. Under Visual C#, click Cloud, and then click Azure Batch Task Processor.
  4. Type a name that describes your application and identifies this project as the task processor (for example, "LitwareTaskProcessor").
  5. To create the project, click OK.
  6. Finally, build the project to force Visual Studio to load all referenced NuGet packages and to verify that the project is valid before you start modifying it.

Task Processor template files and their purpose

When you create a project using the Task Processor template, it generates three groups of code files:

  • The main program file (Program.cs). This contains the program entry point and top-level exception handling. You shouldn't normally need to modify this file.
  • The Framework directory. This contains the files responsible for the 'boilerplate' work done by the task processor program, such as unpacking parameters and loading configuration. You shouldn't normally need to modify these files.
  • The task processor file (TaskProcessor.cs). This is where you put your application-specific logic for executing a task (typically by calling out to an existing executable). Pre- and post-processing code, such as downloading additional data or uploading result files, also goes here.

You can add further files as required to support your task processor code, depending on the complexity of the task processing logic.

The template also generates standard .NET project files such as a .csproj file, app.config, and packages.config.

The rest of this section describes the different files and their code structure, and explains what each class does.

[Figure: Visual Studio Solution Explorer showing the Task Processor template solution]

Framework files

  • Configuration.cs: Encapsulates the loading of job configuration data such as Batch account details, linked storage account credentials, job and task information, and job parameters. It also provides access to Batch-defined environment variables (see Environment settings for tasks, in the Batch documentation) via the Configuration.EnvironmentVariable class.
  • IConfiguration.cs: Abstracts the implementation of the Configuration class, so that you can unit test your task processor using a fake or mock configuration object.
  • TaskProcessorException.cs: Represents an error that requires the task processor to terminate. TaskProcessorException is used to wrap 'expected' errors where specific diagnostic information can be provided as part of termination.

Task Processor

  • TaskProcessor.cs: Runs the task. The framework invokes the TaskProcessor.Run method. This is the class where you inject the application-specific logic of your task. Implement the Run method to:
    • Parse and validate any task parameters
    • Compose the command line for any external program you want to invoke
    • Log any diagnostic information you may require for debugging purposes
    • Start a process using that command line
    • Wait for the process to exit
    • Capture the exit code of the process to determine if it succeeded or failed
    • Save any output files you want to keep to persistent storage

Standard .NET command line project files

  • App.config: Standard .NET application configuration file.
  • Packages.config: Standard NuGet package dependency file.
  • Program.cs: Contains the program entry point and top-level exception handling.

Implementing the task processor

When you open the Task Processor template project, the TaskProcessor.cs file is open by default. You can implement the run logic for the tasks in your workload by using the Run() method shown below:

/// <summary>
/// Runs the task processing logic. This is where you inject
/// your application-specific logic for executing the task.
///
/// The task processor framework invokes the Run method for you; you need
/// only to implement it, not to call it yourself. Typically, your
/// implementation will execute an external program (from resource files or
/// an application package), check the exit code of that program and
/// save output files to persistent storage.
/// </summary>
/// <returns>The exit code of the invoked program, which becomes the task exit code.</returns>
public async Task<int> Run()
{
    try
    {
        // Your code for the task processor goes here.
        var command = $"compare {_parameters["Frame1"]} {_parameters["Frame2"]} compare.gif";

        // Process.Start(string) treats its whole argument as a file name, so
        // the executable and its arguments are passed separately.
        using (var process = Process.Start("cmd", $"/c {command}"))
        {
            process.WaitForExit();

            // Persist the task's standard output to the linked storage account.
            var taskOutputStorage = new TaskOutputStorage(
                _configuration.StorageAccount,
                _configuration.JobId,
                _configuration.TaskId
            );
            await taskOutputStorage.SaveAsync(
                TaskOutputKind.TaskOutput,
                @"..\stdout.txt",
                @"stdout.txt"
            );

            return process.ExitCode;
        }
    }
    catch (Exception ex)
    {
        // Wrap the failure so the top-level handler reports it as an 'expected' error.
        throw new TaskProcessorException(
            $"{ex.GetType().Name} exception in run task processor: {ex.Message}",
            ex
        );
    }
}

Note

The annotated section in the Run() method is the only section of the Task Processor template code that is intended for you to modify, by adding the run logic for the tasks in your workload. If you want to modify a different section of the template, first familiarize yourself with how Batch works by reviewing the Batch documentation and trying out a few of the Batch code samples.

The Run() method is responsible for launching the command line, starting one or more processes, waiting for all processes to complete, saving the results, and finally returning an exit code. The Run() method is where you implement the processing logic for your tasks. The task processor framework invokes the Run() method for you; you do not need to call it yourself.

Your Run() implementation has access to:

  • The task parameters, via the _parameters field.
  • The job and task IDs, via the _jobId and _taskId fields.
  • The task configuration, via the _configuration field.

Task failure

In case of failure, you can exit the Run() method by throwing an exception, but this leaves the top-level exception handler in control of the task exit code. If you need to control the exit code so that you can distinguish different types of failure, for example for diagnostic purposes or because some failure modes should terminate the job and others should not, exit the Run() method by returning a non-zero exit code. This becomes the task exit code.
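
As a rough illustration of returning distinct exit codes rather than throwing, the sketch below maps two failure conditions to separate non-zero codes. The ExitCodeSketch class, the specific code values, and the two inputs are assumptions chosen for this example; a real Run() implementation would derive them from its own parameters and from the invoked process.

```csharp
public static class ExitCodeSketch
{
    // Hypothetical exit codes chosen for illustration only.
    public const int Success = 0;
    public const int InputMissing = 10;
    public const int ToolFailed = 20;

    // Decides the task exit code from two simplified conditions:
    // whether the input was found, and what the invoked tool returned.
    public static int DecideExitCode(bool inputExists, int toolExitCode)
    {
        if (!inputExists)
        {
            return InputMissing;
        }
        if (toolExitCode != 0)
        {
            return ToolFailed;
        }
        return Success;
    }
}
```

A client inspecting task results could then treat, say, the missing-input code as a reason to terminate the job while treating the tool-failure code as retryable.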

任务处理器模板中的退出代码和异常Exit codes and exceptions in the Task Processor template

退出代码和异常提供了机制来确定程序的运行结果,可帮助找到任何程序执行问题。Exit codes and exceptions provide a mechanism to determine the outcome of running a program, and they can help identify any problems with the execution of the program. 任务处理器模板实现本部分所述的退出代码和异常。The Task Processor template implements the exit codes and exceptions described in this section.

使用任务处理器模板实现的任务处理器任务返回三个可能的退出代码:A task processor task that is implemented with the Task Processor template can return three possible exit codes:

Code | Description
Process.ExitCode | The task processor ran to completion. This does not imply that the program you invoked was successful, only that the task processor invoked it successfully and performed any post-processing without exceptions. The meaning of the exit code depends on the invoked program; typically exit code 0 means the program succeeded and any other exit code means the program failed.
1 | The task processor failed with an exception in an 'expected' part of the program. The exception was translated to a TaskProcessorException with diagnostic information and, where possible, suggestions for resolving the failure.
2 | The task processor failed with an 'unexpected' exception. The exception was logged to standard output, but the task processor was unable to add any additional diagnostic or remediation information.

Note

If the program you invoke uses exit codes 1 and 2 to indicate specific failure modes, then using exit codes 1 and 2 for task processor errors is ambiguous. You can change these task processor error codes to distinctive exit codes by editing the exception cases in the Program.cs file.
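The following sketch illustrates what mapping the two exception cases to distinctive exit codes could look like. It is not the template's Program.cs: the TaskProcessorException class here is a local stand-in so the example compiles on its own, and the values 101 and 102 and the Execute helper are hypothetical.

```csharp
using System;

// Local stand-in for the template's TaskProcessorException (assumption).
public class TaskProcessorException : Exception
{
    public TaskProcessorException(string message) : base(message) { }
}

public static class ExitCodeMapping
{
    // Hypothetical values chosen so they do not collide with a program
    // that already uses 1 and 2 for its own failure modes.
    public const int ExpectedFailure = 101;
    public const int UnexpectedFailure = 102;

    // Runs the task processor delegate and converts exceptions to exit codes.
    public static int Execute(Func<int> runTaskProcessor)
    {
        try
        {
            // Normal completion: pass the invoked program's exit code through.
            return runTaskProcessor();
        }
        catch (TaskProcessorException e)
        {
            // 'Expected' failure: the exception carries diagnostic information.
            Console.Error.WriteLine(e.Message);
            return ExpectedFailure;
        }
        catch (Exception e)
        {
            // 'Unexpected' failure: log the exception to standard output.
            Console.WriteLine(e);
            return UnexpectedFailure;
        }
    }
}
```

Keeping the task processor codes outside the range used by the invoked program lets a client tell from the exit code alone which layer failed.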

All the information returned by exceptions is written to the stdout.txt and stderr.txt files. For more information, see Error Handling in the Batch documentation.

Client considerations

Storage credentials

If your task processor uses Azure Blob storage to persist outputs, for example by using the file conventions helper library, then it needs access to either the cloud storage account credentials or a blob container URL that includes a shared access signature (SAS). The template includes support for providing credentials via common environment variables. Your client can pass the storage credentials as follows:

job.CommonEnvironmentSettings = new [] {
    new EnvironmentSetting("LINKED_STORAGE_ACCOUNT", "{storageAccountName}"),
    new EnvironmentSetting("LINKED_STORAGE_KEY", "{storageAccountKey}"),
};

The storage account is then available in the TaskProcessor class via the _configuration.StorageAccount property.

If you prefer to use a container URL with SAS, you can also pass it via a job common environment setting, but the Task Processor template does not currently include built-in support for this.

Storage setup

It is recommended that the client or job manager task create any containers required by tasks before adding the tasks to the job. This is mandatory if you use a container URL with SAS, because such a URL does not include permission to create the container. It is recommended even if you pass storage account credentials, because it saves every task from having to call CloudBlobContainer.CreateIfNotExistsAsync on the container.

Pass parameters and environment variables

Pass environment settings

A client can pass information to the job manager task in the form of environment settings. The job manager task can then use this information when generating the task processor tasks that run as part of the compute job. Examples of the information that you can pass as environment settings are:

  • Storage account name and account keys
  • Batch account URL
  • Batch account key

The Batch service has a simple mechanism to pass environment settings to a job manager task by using the EnvironmentSettings property in Microsoft.Azure.Batch.JobManagerTask.

For example, to get the BatchClient instance for a Batch account, you can pass the URL and shared key credentials for the Batch account from the client code as environment variables. Likewise, to access the storage account that is linked to the Batch account, you can pass the storage account name and the storage account key as environment variables.
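On the receiving side, a job manager task could read such settings from its process environment. The sketch below is one way to do that with a clear error when a setting is absent; the helper name GetRequiredSetting and the variable name in the usage note are hypothetical and are not defined by the template or the Batch service.

```csharp
using System;

public static class EnvironmentSettingsSketch
{
    // Reads an environment variable that the client is assumed to have passed
    // as an environment setting, and fails with a clear message if it is absent.
    public static string GetRequiredSetting(string name)
    {
        string value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrEmpty(value))
        {
            throw new InvalidOperationException(
                "Required environment setting '" + name + "' was not provided.");
        }
        return value;
    }
}
```

For instance, a job manager might call GetRequiredSetting("EXAMPLE_BATCH_ACCOUNT_URL") at startup, where EXAMPLE_BATCH_ACCOUNT_URL is whatever name the client chose when it populated EnvironmentSettings.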

Pass parameters to the Job Manager template

In many cases, it's useful to pass per-job parameters to the job manager task, either to control the job splitting process or to configure the tasks for the job. You can do this by uploading a JSON file named parameters.json as a resource file for the job manager task. The parameters are then available in the JobSplitter._parameters field in the Job Manager template.

Note

The built-in parameter handler supports only string-to-string dictionaries. If you want to pass complex JSON values as parameter values, you need to pass them as strings and parse them in the job splitter, or modify the framework's Configuration.GetJobParameters method.
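To illustrate the "pass as a string, then parse" approach, here is a minimal sketch that uses System.Text.Json, which is an assumption; the template itself may use a different JSON library. The parameter name resolution, the Resolution type, and both helper methods are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class ParameterSketch
{
    // Hypothetical shape of a complex value that is carried as a string.
    public class Resolution
    {
        public int Width { get; set; }
        public int Height { get; set; }
    }

    // Mirrors the string-to-string limitation: every value is a JSON string.
    public static Dictionary<string, string> LoadParameters(string json)
    {
        return JsonSerializer.Deserialize<Dictionary<string, string>>(json);
    }

    // Second pass: parse the string value of "resolution" as JSON in its own right.
    public static Resolution ParseResolution(IDictionary<string, string> parameters)
    {
        return JsonSerializer.Deserialize<Resolution>(parameters["resolution"]);
    }
}
```

A matching parameters.json would look like {"quality":"high","resolution":"{\"Width\":1920,\"Height\":1080}"}, where the nested object is escaped so that it remains a plain string value at the top level.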

Pass parameters to the Task Processor template

You can also pass parameters to individual tasks implemented using the Task Processor template. Just as with the Job Manager template, the Task Processor template looks for a resource file named parameters.json, and if it finds one, loads it as the parameters dictionary. There are a couple of options for how to pass parameters to the task processor tasks:

  • Reuse the job parameters JSON. This works well if the only parameters are job-wide ones (for example, a render height and width). To do this, when creating a CloudTask in the job splitter, add a reference to the parameters.json resource file object from the job manager task's ResourceFiles (JobSplitter._jobManagerTask.ResourceFiles) to the CloudTask's ResourceFiles collection.
  • Generate and upload a task-specific parameters.json document as part of job splitter execution, and reference that blob in the task's resource files collection. This is necessary if different tasks have different parameters. An example might be a 3D rendering scenario where the frame index is passed to the task as a parameter.
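For the second option, the content of a task-specific parameters.json can be built by copying the job-wide parameters and adding the per-task values. The sketch below covers only that step and assumes System.Text.Json; the helper name and the frameIndex key are hypothetical, and uploading the document and referencing it from the task's resource files are left out.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

public static class TaskParameterSketch
{
    // Builds the JSON text for one task: the job-wide parameters plus a
    // per-task frame index, all kept as strings to suit the built-in handler.
    public static string BuildTaskParametersJson(
        IDictionary<string, string> jobParameters, int frameIndex)
    {
        var taskParameters = new Dictionary<string, string>(jobParameters)
        {
            ["frameIndex"] = frameIndex.ToString(CultureInfo.InvariantCulture)
        };
        return JsonSerializer.Serialize(taskParameters);
    }
}
```

Storing the frame index as a string keeps the document loadable by a string-to-string parameter handler; the task processor would convert it back to a number when it reads the value.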

Note

The built-in parameter handler supports only string-to-string dictionaries. If you want to pass complex JSON values as parameter values, you need to pass them as strings and parse them in the task processor, or modify the framework's Configuration.GetTaskParameters method.

Next steps

Persist job and task output to Azure Storage

Another helpful tool in Batch solution development is Azure Batch File Conventions. Use this .NET class library (currently in preview) in your Batch .NET applications to easily store and retrieve task outputs to and from Azure Storage. Persist Azure Batch job and task output contains a full discussion of the library and its usage.