Optimize the performance and reliability of Azure Functions

This article provides guidance to improve the performance and reliability of your serverless function apps.

General best practices

The following are best practices for building and architecting your serverless solutions using Azure Functions.

Avoid long-running functions

Large, long-running functions can cause unexpected timeout issues. To learn more about the timeouts for a given hosting plan, see function app timeout duration.

A function can become large because of many Node.js dependencies. Importing dependencies can also increase load times, which can result in unexpected timeouts. Dependencies are loaded both explicitly and implicitly: a single module loaded by your code may load its own additional modules.

Whenever possible, refactor large functions into smaller function sets that work together and return responses fast. For example, a webhook or HTTP trigger function might require an acknowledgment response within a certain time limit; it's common for webhooks to require an immediate response. You can pass the HTTP trigger payload into a queue to be processed by a queue trigger function. This approach lets you defer the actual work and return an immediate response.
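The pattern above can be sketched in a few lines. This is a minimal illustration, not the Azure Functions programming model: the in-process `queue.Queue` stands in for a real message queue, and the handler names `handle_http_request` and `handle_queue_message` are hypothetical.

```python
import json
import queue

# In-process stand-in for a real message queue (an assumption for this sketch).
work_queue = queue.Queue()


def handle_http_request(payload):
    """Hypothetical HTTP-triggered handler: enqueue the payload and acknowledge at once."""
    work_queue.put(json.dumps(payload))
    return 202, "Accepted"


def handle_queue_message(message):
    """Hypothetical queue-triggered handler: the deferred, slower work happens here."""
    order = json.loads(message)
    order["processed"] = True
    return order


# The HTTP handler returns immediately; the queue handler runs later.
status, body = handle_http_request({"order_id": 42})
result = handle_queue_message(work_queue.get_nowait())
```

The HTTP handler does nothing except serialize and enqueue, so its response time does not depend on how long the real work takes.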

Cross-function communication

Durable Functions and Azure Logic Apps are built to manage state transitions and communication between multiple functions.

If you aren't using Durable Functions or Logic Apps to integrate multiple functions, it's best to use storage queues for cross-function communication. The main reason is that storage queues are cheaper and much easier to provision than other storage options.

Individual messages in a storage queue are limited in size to 64 KB. If you need to pass larger messages between functions, you can use an Azure Service Bus queue, which supports message sizes up to 256 KB in the Standard tier.
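As a rough illustration of the size limits above, the following sketch picks a channel based on payload size. The function name `choose_channel` and the channel labels are hypothetical, 1 KB is assumed to be 1,024 bytes, and the check compares raw byte length only, ignoring any encoding overhead a real queue client might add.

```python
STORAGE_QUEUE_LIMIT = 64 * 1024          # 64 KB storage queue message limit
SERVICE_BUS_STANDARD_LIMIT = 256 * 1024  # 256 KB Service Bus Standard tier limit


def choose_channel(message):
    """Return a label for the smallest channel that can carry the message."""
    size = len(message)
    if size <= STORAGE_QUEUE_LIMIT:
        return "storage-queue"
    if size <= SERVICE_BUS_STANDARD_LIMIT:
        return "service-bus-queue"
    raise ValueError(f"message of {size} bytes exceeds both limits")


small = choose_channel(b"x" * 1000)
large = choose_channel(b"x" * (100 * 1024))
```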

Service Bus topics are useful if you require message filtering before processing.

Event hubs are useful to support high-volume communications.

Write functions to be stateless

Functions should be stateless and idempotent if possible. Associate any required state information with your data. For example, an order being processed would likely have an associated state member. A function could process an order based on that state while the function itself remains stateless.

Idempotent functions are especially recommended with timer triggers. For example, if you have something that absolutely must run once a day, write it so it can run anytime during the day with the same results. The function can exit when there's no work for a particular day. Also, if a previous run failed to complete, the next run should pick up where it left off.
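One way to picture an idempotent daily job is to key its output by date, so a second run on the same day finds the result already present and exits. This is only a sketch: the dictionary `daily_reports` stands in for durable storage, and `run_daily_report` is a hypothetical name.

```python
import datetime

# Hypothetical stand-in for durable storage, keyed by ISO date.
daily_reports = {}


def run_daily_report(today, orders):
    """Return True if work was done, False if this day's report already exists."""
    key = today.isoformat()
    if key in daily_reports:
        return False  # Nothing left to do for this day: exit without side effects.
    daily_reports[key] = sum(order["amount"] for order in orders)
    return True


day = datetime.date(2024, 1, 15)
orders = [{"amount": 10}, {"amount": 20}]
first_run = run_daily_report(day, orders)   # does the work
second_run = run_daily_report(day, orders)  # same day again: no-op
```

Because the state lives with the data rather than in the function, the job produces the same stored result no matter how many times it runs that day.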

Write defensive functions

Assume your function could encounter an exception at any time. Design your functions with the ability to continue from a previous fail point during the next execution. Consider a scenario that requires the following actions:

  1. Query for 10,000 rows in a database.
  2. Create a queue message for each of those rows to process further down the line.

Depending on how complex your system is, downstream services may misbehave, the network may go down, or quota limits may be reached. Any of these can affect your function at any time, so you need to design your functions to be prepared for it.

How does your code react if a failure occurs after inserting 5,000 of those items into a queue for processing? Track the items that you've completed in a set. Otherwise, you might insert them again next time. This double insertion can have a serious impact on your workflow, so make your functions idempotent.

If a queue item was already processed, allow your function to be a no-op.
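The 10,000-row scenario can be sketched as follows. Everything here is illustrative: `enqueue_rows` and `flaky_send` are hypothetical names, the `completed` set stands in for a durable record of finished items, and the outage after 5,000 sends is simulated.

```python
def enqueue_rows(rows, send, completed):
    """Send each row not yet recorded in `completed`; record it after a successful send."""
    for row_id in rows:
        if row_id in completed:
            continue  # Already queued on an earlier run: no-op.
        send(row_id)
        completed.add(row_id)


sent = []
completed = set()
state = {"failed_once": False}


def flaky_send(row_id):
    # Simulated outage: fail exactly once, after 5,000 successful sends.
    if len(sent) == 5000 and not state["failed_once"]:
        state["failed_once"] = True
        raise ConnectionError("simulated outage")
    sent.append(row_id)


rows = range(10_000)
try:
    enqueue_rows(rows, flaky_send, completed)  # first run fails partway through
except ConnectionError:
    pass
enqueue_rows(rows, flaky_send, completed)      # second run resumes where it stopped
```

A crash between `send` and `completed.add` could still produce one duplicate message, which is why the consumer should also treat an already-processed item as a no-op.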

Take advantage of defensive measures already provided for components you use in the Azure Functions platform. For example, see Handling poison queue messages in the documentation for Azure Storage Queue triggers and bindings.

Scalability best practices

There are a number of factors that impact how instances of your function app scale. The details are provided in the documentation for function scaling. The following are some best practices to ensure optimal scalability of a function app.

Share and manage connections

Reuse connections to external resources whenever possible. See how to manage connections in Azure Functions.
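A common way to reuse a connection is to create the client once at module level and share it across invocations in the same process. The sketch below uses a hypothetical `FakeClient` in place of a real HTTP or database client, and `handle_invocation` is an illustrative name.

```python
class FakeClient:
    """Hypothetical stand-in for an HTTP or database client."""

    instances_created = 0

    def __init__(self):
        FakeClient.instances_created += 1

    def fetch(self, key):
        return f"value-for-{key}"


_client = None  # Module-level, so it survives across invocations in one process.


def get_client():
    """Create the client on first use, then return the same instance."""
    global _client
    if _client is None:
        _client = FakeClient()
    return _client


def handle_invocation(key):
    return get_client().fetch(key)


results = [handle_invocation(k) for k in ("a", "b", "c")]
```

Three invocations share one client here; creating a new client inside `handle_invocation` would instead open a new connection on every call.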

Avoid sharing storage accounts

When you create a function app, you must associate it with a storage account. The storage account connection is maintained in the AzureWebJobsStorage application setting.

To maximize performance, use a separate storage account for each function app. This is particularly important when you have Durable Functions or Event Hub-triggered functions, which both generate a high volume of storage transactions. When your application logic interacts with Azure Storage, either directly (using the Storage SDK) or through one of the storage bindings, you should use a dedicated storage account. For example, if you have an Event Hub-triggered function writing some data to blob storage, use two storage accounts: one for the function app and another for the blobs being stored by the function.

Don't mix test and production code in the same function app

Functions within a function app share resources. For example, memory is shared. If you're using a function app in production, don't add test-related functions and resources to it. Doing so can cause unexpected overhead during production code execution.

Be careful what you load in your production function apps. Memory is averaged across each function in the app.

If you have a shared assembly referenced in multiple .NET functions, put it in a common shared folder. Otherwise, you could accidentally deploy multiple versions of the same binary that behave differently between functions.

Don't use verbose logging in production code, because it has a negative performance impact.

Use async code but avoid blocking calls

Asynchronous programming is a recommended best practice, especially when blocking I/O operations are involved.

In C#, always avoid referencing the Result property or calling the Wait method on a Task instance. This approach can lead to thread exhaustion.
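The same principle applies outside C#. As an analogy only, the following Python sketch awaits a non-blocking call so that several operations can make progress together; `fetch` is a hypothetical coroutine, and `asyncio.sleep(0)` stands in for real non-blocking I/O.

```python
import asyncio

events = []


async def fetch(name):
    events.append(f"start {name}")
    await asyncio.sleep(0)  # Yields to the event loop; stands in for non-blocking I/O.
    events.append(f"end {name}")
    return name.upper()


async def main():
    # All three coroutines are in flight at once instead of running back to back.
    return await asyncio.gather(fetch("a"), fetch("b"), fetch("c"))


results = asyncio.run(main())
```

Replacing the awaited call with a blocking one, such as `time.sleep`, would hold the event loop and force the three operations to run one after another.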


If you plan to use the HTTP or WebHook bindings, plan to avoid port exhaustion that can be caused by improper instantiation of HttpClient. For more information, see How to manage connections in Azure Functions.

Use multiple worker processes

By default, any host instance for Functions uses a single worker process. To improve performance, especially with single-threaded runtimes, use the FUNCTIONS_WORKER_PROCESS_COUNT setting to increase the number of worker processes per host (up to 10). Azure Functions then tries to evenly distribute simultaneous function invocations across these workers.

The FUNCTIONS_WORKER_PROCESS_COUNT setting applies to each host that Functions creates when scaling out your application to meet demand.

Receive messages in batch whenever possible

Some triggers, like Event Hub, enable receiving a batch of messages on a single invocation. Batching messages has much better performance. You can configure the max batch size in the host.json file, as detailed in the host.json reference documentation.

For C# functions, you can change the type to a strongly-typed array. For example, instead of EventData sensorEvent, the method signature could be EventData[] sensorEvent. For other languages, you'll need to explicitly set the cardinality property in your function.json to many in order to enable batching.
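To show the difference in shape between single-event and batched handlers, here is a small sketch. Plain dictionaries stand in for real event objects, and `handle_one`, `handle_many`, and the `temperature` field are hypothetical names chosen for illustration.

```python
invocations = {"single": 0, "batch": 0}


def handle_one(event):
    """Single-event handler: invoked once per event."""
    invocations["single"] += 1
    return event["temperature"]


def handle_many(events):
    """Batched handler: one invocation receives the whole list."""
    invocations["batch"] += 1
    return [event["temperature"] for event in events]


sensor_events = [{"temperature": t} for t in (20.5, 21.0, 19.5)]
single_results = [handle_one(e) for e in sensor_events]  # three invocations
batch_results = handle_many(sensor_events)               # one invocation
```

Both paths produce the same readings, but the batched handler pays the per-invocation overhead once instead of once per event.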

Configure host behaviors to better handle concurrency

The host.json file in the function app allows for configuration of host runtime and trigger behaviors. In addition to batching behaviors, you can manage concurrency for a number of triggers. Often, adjusting the values in these options can help each instance scale appropriately for the demands of the invoked functions.

Settings in the host.json file apply across all functions within the app, within a single instance of the function app. For example, if you had a function app with two HTTP functions and maxConcurrentRequests set to 25, a request to either HTTP trigger would count towards the shared 25 concurrent requests. When that function app is scaled to 10 instances, the two functions effectively allow 250 concurrent requests (10 instances * 25 concurrent requests per instance).
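The arithmetic in this example can be written out directly. The helper names below are hypothetical, and `HttpA` and `HttpB` are placeholder names for the two HTTP functions; the sketch only models the counting, not how the host enforces the limit.

```python
def effective_concurrency(instances, per_instance_limit):
    """Total concurrent requests allowed across all instances."""
    return instances * per_instance_limit


def remaining_slots(per_instance_limit, in_flight_by_function):
    """Slots left on one instance; all functions draw from the same budget."""
    return per_instance_limit - sum(in_flight_by_function.values())


total = effective_concurrency(instances=10, per_instance_limit=25)
# Both HTTP functions count against the shared limit of 25 on a single instance.
left = remaining_slots(25, {"HttpA": 10, "HttpB": 5})
```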

Other host configuration options are found in the host.json configuration article.

Next steps

For more information, see the following resources: