Zero-downtime deployment for Durable Functions

The reliable execution model of Durable Functions requires that orchestrations be deterministic, which creates an additional challenge when you deploy updates. When a deployment changes activity function signatures or orchestrator logic, in-flight orchestration instances fail. This is especially a problem for long-running orchestration instances, which might represent hours or days of work.

To prevent these failures, you have two options:

  • Delay your deployment until all running orchestration instances have completed.
  • Make sure that any running orchestration instances use the existing versions of your functions.

The following table compares the three main strategies to achieve a zero-downtime deployment for Durable Functions:

| Strategy | When to use | Pros | Cons |
| --- | --- | --- | --- |
| Versioning | Applications that don't experience frequent breaking changes. | Simple to implement. | Increased function app size in memory and number of functions. Code duplication. |
| Status check with slot | A system that doesn't have long-running orchestrations lasting more than 24 hours or frequently overlapping orchestrations. | Simple code base. Doesn't require additional function app management. | Requires additional storage account or task hub management. Requires periods of time when no orchestrations are running. |
| Application routing | A system that never has a period without running orchestrations, for example because orchestrations last more than 24 hours or frequently overlap. | Handles new versions of systems with continually running orchestrations that have breaking changes. | Requires an intelligent application router. Could max out the number of function apps allowed by your subscription. The default is 100. |

Versioning

Define new versions of your functions and leave the old versions in your function app. As the diagram shows, a function's version becomes part of its name. Because previous versions of functions are preserved, in-flight orchestration instances can continue to reference them. Meanwhile, requests for new orchestration instances call the latest version, which your orchestration client function can reference from an app setting.

[Figure: Versioning strategy]

In this strategy, every function must be copied, and its references to other functions must be updated. You can simplify this work by writing a script. Here's a sample project with a migration script.
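As a minimal sketch of how a version can become part of a function name, the helper below builds a versioned name and reads the latest version from an app setting. The setting name `OrchestratorVersion`, the `_V2`-style suffix, and the class itself are assumptions made for illustration; the article does not prescribe a naming scheme.

```csharp
using System;

public static class OrchestratorVersioning
{
    // Hypothetical app setting that holds the latest version label, for example "V2".
    public const string VersionSettingName = "OrchestratorVersion";

    // Builds a versioned function name such as "ProcessOrder_V2".
    public static string GetVersionedName(string baseName, string version)
    {
        if (string.IsNullOrWhiteSpace(baseName))
            throw new ArgumentException("A base function name is required.", nameof(baseName));
        if (string.IsNullOrWhiteSpace(version))
            throw new ArgumentException("A version label is required.", nameof(version));

        return $"{baseName}_{version}";
    }

    // Resolves the name to use for new instances. App settings surface as
    // environment variables, so the latest version is read from there.
    public static string GetLatestName(string baseName, string fallbackVersion = "V1")
    {
        var version = Environment.GetEnvironmentVariable(VersionSettingName);
        return GetVersionedName(
            baseName,
            string.IsNullOrWhiteSpace(version) ? fallbackVersion : version);
    }
}
```

An orchestration client function could then start new instances with the name returned by `GetLatestName("ProcessOrder")`, while instances already in flight keep calling the older, still-deployed names.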

Note

This strategy uses deployment slots to avoid downtime during deployment. For more detailed information about how to create and use new deployment slots, see Azure Functions deployment slots.

Status check with slot

While the current version of your function app is running in your production slot, deploy the new version of your function app to your staging slot. Before you swap your production and staging slots, check whether any orchestration instances are running. After all orchestration instances are complete, you can do the swap. This strategy works when you have predictable periods when no orchestration instances are in flight. It's the best approach when your orchestrations aren't long-running and your orchestration executions don't frequently overlap.

Function app configuration

Use the following procedure to set up this scenario.

  1. Add deployment slots to your function app for staging and production.

  2. For each slot, set the AzureWebJobsStorage application setting to the connection string of a shared storage account. The Azure Functions runtime uses this storage account, which also manages the function's keys.

  3. For each slot, create a new app setting, for example, DurableManagementStorage. Set its value to the connection string of a storage account that is different for each slot. The Durable Functions extension uses these storage accounts for reliable execution. Don't mark this setting as a deployment slot setting.

  4. In the durableTask section of your function app's host.json file, set azureStorageConnectionStringName to the name of the app setting you created in step 3.

The following diagram shows the described configuration of deployment slots and storage accounts. In this potential predeployment scenario, version 2 of a function app is running in the production slot, while version 1 remains in the staging slot.

[Figure: Deployment slots and storage accounts]

host.json examples

The following JSON fragments are examples of the connection string setting in the host.json file.

Functions 2.0

{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "azureStorageConnectionStringName": "DurableManagementStorage"
    }
  }
}

Functions 1.x

{
  "durableTask": {
    "azureStorageConnectionStringName": "DurableManagementStorage"
  }
}

CI/CD pipeline configuration

Configure your CI/CD pipeline to deploy only when your function app has no pending or running orchestration instances. When you're using Azure Pipelines, you can create a function that checks for these conditions, as in the following example:

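using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

[FunctionName("StatusCheck")]
public static async Task<IActionResult> StatusCheck(
    [HttpTrigger(AuthorizationLevel.Function, "get", "post")] HttpRequestMessage req,
    [DurableClient] IDurableOrchestrationClient client,
    ILogger log)
{
    // Query only instances that are waiting to start or currently executing.
    var runtimeStatus = new List<OrchestrationRuntimeStatus> { OrchestrationRuntimeStatus.Pending, OrchestrationRuntimeStatus.Running };

    var result = await client.ListInstancesAsync(new OrchestrationStatusQueryCondition() { RuntimeStatus = runtimeStatus }, CancellationToken.None);
    // HasRunning is true when at least one matching instance exists.
    return new OkObjectResult(new { HasRunning = result.DurableOrchestrationState.Any() });
}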

Next, configure the staging gate to wait until no orchestrations are running.

[Figure: Deployment gate]

Azure Pipelines checks your function app for running orchestration instances before your deployment starts.

[Figure: Deployment gate (running)]
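The decision the gate makes can be sketched as a small helper that interprets the JSON body returned by the StatusCheck function. The class and method names here are hypothetical, and the property lookup is case-insensitive because the exact casing of HasRunning in the response depends on the serializer settings of the host, which this sketch does not assume.

```csharp
using System;
using System.Text.Json;

public static class DeploymentGate
{
    // Returns true when the StatusCheck response reports that no pending
    // or running orchestration instances exist, so a deployment may proceed.
    public static bool CanDeploy(string responseBody)
    {
        using JsonDocument doc = JsonDocument.Parse(responseBody);

        foreach (JsonProperty property in doc.RootElement.EnumerateObject())
        {
            // Accept "HasRunning" or "hasRunning".
            if (string.Equals(property.Name, "HasRunning", StringComparison.OrdinalIgnoreCase))
            {
                return !property.Value.GetBoolean();
            }
        }

        throw new FormatException("The response does not contain a HasRunning property.");
    }
}
```

A pipeline step that polls the function could call `DeploymentGate.CanDeploy(body)` on each response and continue only once it returns true.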

Now the new version of your function app should be deployed to the staging slot.

[Figure: Staging slot]

Finally, swap slots.

Application settings that aren't marked as deployment slot settings are also swapped, so the version 2 app keeps its reference to storage account A. Because orchestration state is tracked in the storage account, any orchestrations running on the version 2 app continue to run in the new slot without interruption.

[Figure: Deployment slots]

To use the same storage account for both slots, you can change the names of your task hubs. In this case, you need to manage the state of your slots and your app's HubName settings. To learn more, see Task hubs in Durable Functions.
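As an illustration of the shared-storage-account variant, a host.json for one app version might name its own task hub as shown below. The hub name MyTaskHubV2 is hypothetical, and the fragment follows the Functions 2.0 layout used in the host.json examples earlier in this article; other extension versions may organize these settings differently.

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "hubName": "MyTaskHubV2"
    }
  }
}
```

Each version of the app would use a different hub name, so that the two slots don't process each other's orchestrations even though they share one storage account.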

Application routing

This strategy is the most complex. However, it can be used for function apps that never have a gap between running orchestrations.

For this strategy, you must create an application router in front of your Durable Functions. This router can be implemented with Durable Functions. The router is responsible for the following tasks:

  • Deploy the function app.
  • Manage the version of Durable Functions.
  • Route orchestration requests to function apps.

The first time an orchestration request is received, the router does the following tasks:

  1. Creates a new function app in Azure.
  2. Deploys your function app's code to the new function app in Azure.
  3. Forwards the orchestration request to the new app.

The router tracks which version of your app's code is deployed to which function app in Azure.

[Figure: Application routing (first time)]

The router directs deployment and orchestration requests to the appropriate function app based on the version sent with the request. It ignores the patch version.
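The routing rule described above can be sketched as a lookup keyed on the major and minor version only. This is an illustrative model, not the router itself: the `VersionRouter` class, its method names, and the `createApp` callback (which stands in for creating and deploying a function app in Azure) are all assumptions.

```csharp
using System;
using System.Collections.Generic;

public sealed class VersionRouter
{
    // Maps a "major.minor" key to the name of the function app that hosts it.
    private readonly Dictionary<string, string> _apps = new Dictionary<string, string>();

    // Drops the patch component: "1.0.1" and "1.0.7" both map to "1.0".
    public static string RoutingKey(string version)
    {
        Version parsed = Version.Parse(version);
        return $"{parsed.Major}.{parsed.Minor}";
    }

    // Returns the function app for the requested version. When no app exists
    // for that major.minor pair, createApp is invoked once and its result is stored.
    public string Resolve(string version, Func<string, string> createApp)
    {
        string key = RoutingKey(version);
        if (!_apps.TryGetValue(key, out var app))
        {
            app = createApp(key);
            _apps[key] = app;
        }
        return app;
    }
}
```

With this model, requests for versions 1.0.0 and 1.0.1 resolve to the same function app, while a request for version 1.1.0 triggers `createApp` and resolves to a new one.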

When you deploy a new version of your app without a breaking change, you can increment the patch version. The router deploys to your existing function app and routes requests for both the old and new versions of the code to that same app.

[Figure: Application routing (no breaking change)]

When you deploy a new version of your app with a breaking change, you can increment the major or minor version. Then the application router creates a new function app in Azure, deploys to it, and routes requests for the new version of your app to it. In the following diagram, running orchestrations on the 1.0.1 version of the app keep running, but requests for the 1.1.0 version are routed to the new function app.

[Figure: Application routing (breaking change)]

The router monitors the status of orchestrations on the 1.0.1 version and removes that app after all of its orchestrations are finished.

Tracking store settings

Each function app should use separate scheduling queues, possibly in separate storage accounts. If you want to query all orchestration instances across all versions of your application, you can share instance and history tables across your function apps. You can share tables by configuring the trackingStoreConnectionStringName and trackingStoreNamePrefix settings in the host.json settings file so that all apps use the same values.
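For illustration, a host.json along the following lines could be deployed with every version of the app so that they all write to the same instance and history tables. The app setting name SharedTrackingStorage and the prefix SharedDurableTask are hypothetical values, and the flat layout mirrors the Functions 2.0 example earlier in this article; some extension versions may group these settings differently.

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "azureStorageConnectionStringName": "DurableManagementStorage",
      "trackingStoreConnectionStringName": "SharedTrackingStorage",
      "trackingStoreNamePrefix": "SharedDurableTask"
    }
  }
}
```

In this sketch, azureStorageConnectionStringName can still point each function app at its own storage account for queues, while the two tracking store settings stay identical across apps.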

For more information, see Manage instances in Durable Functions in Azure.

[Figure: Tracking store settings]

Next steps