Versioning in Durable Functions (Azure Functions)

It is inevitable that functions will be added, removed, and changed over the lifetime of an application. Durable Functions allows chaining functions together in ways that weren't previously possible, and this chaining affects how you can handle versioning.

How to handle breaking changes

There are several kinds of breaking changes to be aware of. This article discusses the most common ones. The main theme behind all of them is that both new and existing function orchestrations are impacted by changes to function code.

Changing activity or entity function signatures

A signature change refers to a change in the name, input, or output of a function. If this kind of change is made to an activity or entity function, it could break any orchestrator function that depends on it. If you update the orchestrator function to accommodate this change, you could break existing in-flight instances.

As an example, suppose we have the following orchestrator function.

[FunctionName("FooBar")]
public static async Task Run([OrchestrationTrigger] IDurableOrchestrationContext context)
{
    bool result = await context.CallActivityAsync<bool>("Foo");
    await context.CallActivityAsync("Bar", result);
}

This simple function takes the result of Foo and passes it to Bar. Let's assume we need to change the return value of Foo from bool to int to support a wider variety of result values. The result looks like this:

[FunctionName("FooBar")]
public static async Task Run([OrchestrationTrigger] IDurableOrchestrationContext context)
{
    int result = await context.CallActivityAsync<int>("Foo");
    await context.CallActivityAsync("Bar", result);
}

Note

The previous C# examples target Durable Functions 2.x. For Durable Functions 1.x, you must use DurableOrchestrationContext instead of IDurableOrchestrationContext. For more information about the differences between versions, see the Durable Functions versions article.

This change works fine for all new instances of the orchestrator function but breaks any in-flight instances. For example, consider the case where an orchestration instance calls a function named Foo, gets back a boolean value, and then checkpoints. If the signature change is deployed at this point, the checkpointed instance will fail immediately when it resumes and replays the call to context.CallActivityAsync<int>("Foo"). This failure happens because the result in the history table is a bool but the new code tries to deserialize it into an int.

This example is just one of many different ways that a signature change can break existing instances. In general, if an orchestrator needs to change the way it calls a function, then the change is likely to be problematic.

Changing orchestrator logic

The other class of versioning problems comes from changing the orchestrator function code in a way that confuses the replay logic for in-flight instances.

Consider the following orchestrator function:

[FunctionName("FooBar")]
public static async Task Run([OrchestrationTrigger] IDurableOrchestrationContext context)
{
    bool result = await context.CallActivityAsync<bool>("Foo");
    await context.CallActivityAsync("Bar", result);
}

Now let's assume you want to make a seemingly innocent change to add another function call.

[FunctionName("FooBar")]
public static async Task Run([OrchestrationTrigger] IDurableOrchestrationContext context)
{
    bool result = await context.CallActivityAsync<bool>("Foo");
    if (result)
    {
        await context.CallActivityAsync("SendNotification");
    }

    await context.CallActivityAsync("Bar", result);
}

Note

The previous C# examples target Durable Functions 2.x. For Durable Functions 1.x, you must use DurableOrchestrationContext instead of IDurableOrchestrationContext. For more information about the differences between versions, see the Durable Functions versions article.

This change adds a new function call to SendNotification between Foo and Bar. There are no signature changes. The problem arises when an existing instance resumes from the call to Bar. During replay, if the original call to Foo returned true, then the orchestrator replay will call into SendNotification, which is not in its execution history. As a result, the Durable Task Framework fails with a NonDeterministicOrchestrationException because it encountered a call to SendNotification when it expected to see a call to Bar. The same type of problem can occur when adding any calls to "durable" APIs, such as CreateTimer and WaitForExternalEvent.
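The mismatch described above can be illustrated with a toy model. The sketch below is not the Durable Task Framework and uses none of its APIs; the ReplaySketch class and TryReplay method are hypothetical names invented for this illustration. It only compares the calls that the orchestrator code schedules, in order, against the calls already recorded in an instance's history.

```csharp
using System;
using System.Collections.Generic;

// Toy model of orchestration replay (an assumption for illustration only;
// the real framework tracks far more than function names).
public static class ReplaySketch
{
    // Returns true if every recorded call matches what the code schedules
    // at the same position; otherwise reports the first mismatch.
    public static bool TryReplay(
        IReadOnlyList<string> history,
        IReadOnlyList<string> scheduledCalls,
        out string mismatch)
    {
        int count = Math.Min(history.Count, scheduledCalls.Count);
        for (int i = 0; i < count; i++)
        {
            if (history[i] != scheduledCalls[i])
            {
                mismatch = $"history expected '{history[i]}' but code called '{scheduledCalls[i]}'";
                return false;
            }
        }

        mismatch = string.Empty;
        return true;
    }

    public static void Main()
    {
        // History written by the old code: Foo completed, Bar was scheduled.
        var history = new[] { "Foo", "Bar" };

        // Calls scheduled by the old code and by the new code (Foo returned true).
        var oldCode = new[] { "Foo", "Bar" };
        var newCode = new[] { "Foo", "SendNotification", "Bar" };

        Console.WriteLine(TryReplay(history, oldCode, out _) ? "old code: replay ok" : "old code: failed");
        if (!TryReplay(history, newCode, out string reason))
        {
            Console.WriteLine("new code: " + reason);
        }
    }
}
```

In this model the old code replays cleanly, while the new code diverges at the second position, which corresponds to the point where the real framework would raise NonDeterministicOrchestrationException.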

Mitigation strategies

Here are some of the strategies for dealing with versioning challenges:

  • Do nothing
  • Stop all in-flight instances
  • Side-by-side deployments

Do nothing

The easiest way to handle a breaking change is to let in-flight orchestration instances fail. New instances successfully run the changed code.

Whether this kind of failure is a problem depends on the importance of your in-flight instances. If you are in active development and don't care about in-flight instances, this might be good enough. However, you'll need to deal with exceptions and errors in your diagnostics pipeline. If you want to avoid those things, consider the other versioning options.

Stop all in-flight instances

Another option is to stop all in-flight instances. You can stop all instances by clearing the contents of the internal control-queue and workitem-queue queues. The instances will be stuck forever where they are, but they will not clutter your logs with failure messages. This approach is ideal in rapid prototype development.

Warning

The details of these queues may change over time, so don't rely on this technique for production workloads.

Side-by-side deployments

The most fail-proof way to ensure that breaking changes are deployed safely is to deploy them side-by-side with your older versions. This can be done using any of the following techniques:

  • Deploy all the updates as entirely new functions, leaving existing functions as-is. This can be tricky because the callers of the new function versions must be updated as well, following the same guidelines.
  • Deploy all the updates as a new function app with a different storage account.
  • Deploy a new copy of the function app with the same storage account but with an updated task hub name. Side-by-side deployment is the recommended technique.
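As a rough illustration of the first technique, the earlier bool-to-int change could be shipped as new functions next to the existing ones. This is a minimal sketch that assumes Durable Functions 2.x; the names FooBarV2, FooV2, and BarV2 are hypothetical. The original FooBar, Foo, and Bar functions stay deployed and unchanged so that in-flight instances can finish replaying against them.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class FooBarV2Functions
{
    // New orchestrator under a new name (hypothetical). Only instances
    // started as "FooBarV2" run this code, so no existing history is replayed here.
    [FunctionName("FooBarV2")]
    public static async Task RunV2([OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // "FooV2" is a hypothetical new activity that returns int instead of bool.
        int result = await context.CallActivityAsync<int>("FooV2", null);

        // "BarV2" is a hypothetical new activity that accepts the int result.
        await context.CallActivityAsync("BarV2", result);
    }
}
```

Any client code that starts the orchestration must then be switched to start FooBarV2 instead of FooBar, which is the caller update that the first bullet warns about.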

How to change task hub name

The task hub can be configured in the host.json file as follows:

Functions 1.x

{
    "durableTask": {
        "hubName": "MyTaskHubV2"
    }
}

Functions 2.0

{
    "extensions": {
        "durableTask": {
            "hubName": "MyTaskHubV2"
        }
    }
}

The default value for Durable Functions v1.x is DurableFunctionsHub. Starting in Durable Functions v2.0, the default task hub name is the same as the function app name in Azure, or TestHubName if running outside of Azure.

All Azure Storage entities are named based on the hubName configuration value. By giving the task hub a new name, you ensure that separate queues and a separate history table are created for the new version of your application. The function app, however, will stop processing events for orchestrations or entities created under the previous task hub name.

We recommend that you deploy the new version of the function app to a new deployment slot. Deployment slots allow you to run multiple copies of your function app side-by-side with only one of them as the active production slot. When you are ready to expose the new orchestration logic to your existing infrastructure, it can be as simple as swapping the new version into the production slot.

Note

This strategy works best when you use HTTP and webhook triggers for orchestrator functions. For non-HTTP triggers, such as queues or Event Hubs, the trigger definition should derive from an app setting that gets updated as part of the swap operation.
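For a queue trigger, deriving the trigger definition from an app setting might look like the sketch below. It assumes Durable Functions 2.x and the Azure Storage queue trigger extension; the function name FooBar_QueueStart and the app setting name OrchestrationStartQueue are hypothetical. The %...% syntax is the standard Azure Functions binding expression for reading an app setting.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class QueueStarter
{
    // The queue name is resolved from the app setting "OrchestrationStartQueue"
    // (hypothetical), so each deployment slot can listen on a different queue.
    [FunctionName("FooBar_QueueStart")]
    public static async Task Run(
        [QueueTrigger("%OrchestrationStartQueue%")] string message,
        [DurableClient] IDurableOrchestrationClient starter)
    {
        // Passing null as the instance ID lets the framework generate one;
        // the queue message is forwarded as the orchestration input.
        await starter.StartNewAsync("FooBar", null, message);
    }
}
```

With this shape, changing the value of the app setting as part of the swap changes which queue feeds the orchestrator, without touching the code.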

Next steps