Efficiently scale out a custom skill

Custom skills are web APIs that implement a specific interface. A custom skill can be implemented on any publicly addressable resource. The most common implementations for custom skills are:

  • Azure Functions for custom logic skills
  • Azure Web Apps for simple containerized AI skills
  • Azure Kubernetes Service for more complex or larger skills

Prerequisites

  • Review the custom skill interface for an introduction to the input/output interface that a custom skill should implement.

  • Set up your environment. You could start with this end-to-end tutorial to set up a serverless Azure Function using Visual Studio Code and the Python extensions.

Skillset configuration

Configuring a custom skill to maximize the throughput of the indexing process requires an understanding of the skill, the indexer configuration, and how the skill relates to each document: for example, the number of times the skill is invoked per document and the expected duration of each invocation.

Skill settings

Set the following parameters on the custom skill and the indexer.

  1. Set batchSize on the custom skill to configure the number of records sent to the skill in a single invocation.

  2. Set degreeOfParallelism to calibrate the number of concurrent requests the indexer will make to your skill.

  3. Set timeout to a value sufficient for the skill to return a valid response.

  4. In the indexer definition, set batchSize to the number of documents that should be read from the data source and enriched concurrently.
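As an illustration of where these four settings live, the following sketch builds the relevant fragments of a skill definition and an indexer definition as Python dictionaries. The endpoint URL, the names, the input and output mappings, and all numeric values are hypothetical placeholders; check the property names against the Web API skill and indexer references before relying on them.

```python
import json

# Fragment of a skillset: one custom Web API skill.
# The URI, inputs, and outputs are placeholders, not a real service.
custom_skill = {
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "description": "Example custom skill (hypothetical)",
    "uri": "https://example.com/api/classify",
    "httpMethod": "POST",
    "timeout": "PT90S",          # ISO 8601 duration: 90 seconds per request
    "batchSize": 10,             # records sent in one invocation of the skill
    "degreeOfParallelism": 4,    # concurrent requests the indexer may issue
    "context": "/document",
    "inputs": [{"name": "text", "source": "/document/content"}],
    "outputs": [{"name": "label", "targetName": "label"}],
}

# Fragment of an indexer definition: documents read and enriched together.
indexer_fragment = {
    "name": "example-indexer",   # hypothetical name
    "parameters": {"batchSize": 20},
}

print(json.dumps({"skill": custom_skill, "indexer": indexer_fragment}, indent=2))
```

Note that the two batchSize values are independent: one belongs to the skill and counts records, the other belongs to the indexer and counts documents.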

Considerations

Setting these variables to optimize the indexer's performance requires determining whether your skill performs better with many concurrent small requests or with fewer large requests. A few questions to consider are:

  • What is the skill invocation cardinality? Does the skill execute once for each document (for instance, a document classification skill), or could it execute multiple times per document (for instance, a paragraph classification skill)?

  • On average, how many documents are read from the data source to fill a skill request, given the skill batch size? Ideally, this should be less than the indexer batch size. With batch sizes greater than 1, your skill can receive records from multiple source documents. For example, if the indexer batch size is 5, the skill batch size is 50, and each document generates only five records, one indexer batch yields just 25 records, so the indexer needs to fill a custom skill request across multiple indexer batches.

  • The average number of requests an indexer batch can generate should give you an optimal setting for the degree of parallelism. If the infrastructure hosting the skill cannot support that level of concurrency, consider lowering the degree of parallelism. As a best practice, test your configuration with a few documents to validate your choice of parameters.

  • Test with a smaller sample of documents and compare the execution time of your skill with the overall time taken to process that subset. Does your indexer spend more time building a batch or waiting for a response from your skill?

  • Consider the upstream implications of parallelism. If the input to a custom skill is the output of a prior skill, are all the skills in the skillset scaled out effectively to minimize latency?
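The batch arithmetic in these questions can be sketched with a small helper. This is a simplified model that assumes every document produces the same number of records; the function name and its return keys are made up for illustration and are not part of any SDK.

```python
import math

def estimate_skill_load(indexer_batch_size, records_per_document, skill_batch_size):
    """Estimate how one indexer batch maps onto skill requests.

    Simplifying assumption: every document yields the same number of records.
    """
    records = indexer_batch_size * records_per_document
    full_requests, leftover = divmod(records, skill_batch_size)
    return {
        "records_per_indexer_batch": records,
        # Requests completely filled by a single indexer batch.
        "full_requests": full_requests,
        # Records that do not fill a request on their own.
        "leftover_records": leftover,
        # Indexer batches needed to fill one skill request.
        "indexer_batches_per_request": math.ceil(skill_batch_size / records),
    }

# The example from the text: indexer batch 5, five records per document, skill batch 50.
print(estimate_skill_load(5, 5, 50))
# A larger indexer batch that fills several requests per batch.
print(estimate_skill_load(20, 5, 50))
```

For the example in the text, one indexer batch produces 25 records and no full request, so two indexer batches are needed per skill request. The count of full requests per indexer batch is only a starting point for the degree of parallelism; measure against a small document sample before settling on a value.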

Error handling in the custom skill

Custom skills should return the success status code HTTP 200 when the skill completes successfully. If one or more records in a batch result in errors, consider returning the multi-status code 207. The errors or warnings list for the affected record should contain the appropriate message.

Any item in a batch that returns an error causes the corresponding document to fail. If you need the document to succeed, return a warning instead.

Any status code over 299 is evaluated as an error; all the enrichments fail, resulting in a failed document.
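The per-record error handling described above might look like the following sketch. It is framework-agnostic and assumes the custom skill request and response shape of a values array whose items carry recordId, data, errors, and warnings. The helper names build_skill_response and word_count are hypothetical, and returning 207 only when at least one record has an error is one possible reading of the guidance, not the only one.

```python
def word_count(data):
    """Toy enrichment used for illustration; raises KeyError if 'text' is missing."""
    return {"wordCount": len(data["text"].split())}

def build_skill_response(records, process, as_warning=False):
    """Process each record independently so one bad record cannot fail the batch.

    Returns (http_status, response_body). With as_warning=True, failures are
    reported as warnings so the corresponding document can still succeed.
    """
    values = []
    any_errors = False
    for record in records:
        item = {"recordId": record["recordId"], "data": {}, "errors": [], "warnings": []}
        try:
            item["data"] = process(record["data"])
        except Exception as exc:  # catch everything: an uncaught exception breaks the response
            message = {"message": f"Could not process record: {exc!r}"}
            if as_warning:
                item["warnings"].append(message)
            else:
                item["errors"].append(message)
                any_errors = True
        values.append(item)
    status = 207 if any_errors else 200
    return status, {"values": values}

batch = [
    {"recordId": "1", "data": {"text": "hello custom skill"}},
    {"recordId": "2", "data": {}},  # missing 'text' triggers the error path
]
status, body = build_skill_response(batch, word_count)
print(status, body)
```

Catching exceptions per record keeps the response in the expected format even when an individual record is malformed, which also avoids the invalid-response failure mode.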

Common error messages

  • Could not execute skill because it did not execute within the time limit '00:00:30'. This is likely transient. Please try again later. For custom skills, consider increasing the 'timeout' parameter on your skill in the skillset. Set the timeout parameter on the skill to allow for a longer execution duration.

  • Could not execute skill because Web Api skill response is invalid. This indicates that the skill did not return a message in the custom skill response format. It could be the result of an uncaught exception in the skill.

  • Could not execute skill because the Web Api request failed. This is most likely caused by authorization errors or exceptions.

  • Could not execute skill. This is commonly the result of the skill response being mapped to an existing property in the document hierarchy.

Testing custom skills

Start by testing your custom skill with a REST API client to validate that:

  • The skill implements the custom skill interface for requests and responses

  • The skill returns valid JSON with the application/json MIME type

  • The skill returns a valid HTTP status code
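The three checks above can also be automated once you have captured a response from your REST client. The following sketch validates a status code, a Content-Type header value, and a raw response body; the function name and the exact set of checks are assumptions chosen for illustration, and it deliberately performs no network call so it can run against saved responses.

```python
import json

def validate_skill_response(status_code, content_type, body_text, expected_record_ids):
    """Return a list of problems found in a captured skill response (empty if none)."""
    problems = []
    if not 200 <= status_code <= 299:
        problems.append(f"status code {status_code} is treated as an error")
    if not content_type.lower().startswith("application/json"):
        problems.append(f"unexpected content type: {content_type}")
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError as exc:
        problems.append(f"body is not valid JSON: {exc}")
        return problems
    values = payload.get("values") if isinstance(payload, dict) else None
    if not isinstance(values, list):
        problems.append("response has no 'values' array")
        return problems
    returned_ids = {v.get("recordId") for v in values if isinstance(v, dict)}
    for record_id in sorted(set(expected_record_ids) - returned_ids):
        problems.append(f"no result returned for recordId {record_id}")
    return problems

good = '{"values": [{"recordId": "1", "data": {"label": "a"}, "errors": [], "warnings": []}]}'
print(validate_skill_response(200, "application/json; charset=utf-8", good, ["1"]))
print(validate_skill_response(500, "text/html", "<html>error</html>", ["1"]))
```

An empty list means the captured response passed all checks; each string in a non-empty list names one problem to fix before wiring the skill into a skillset.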

Create a debug session to add your skill to the skillset and make sure it produces a valid enrichment. While a debug session does not allow you to tune the performance of the skill, it lets you confirm that the skill is configured with valid values and returns the expected enriched objects.

Best practices

  • While skills can accept and return larger payloads, consider limiting the response to 150 MB or less when returning JSON.

  • Consider setting the batch size on the indexer and the skill so that each data source batch generates a full payload for your skill.

  • For long-running tasks, set the timeout to a value high enough that the indexer does not error out when processing documents concurrently.

  • Optimize the indexer batch size, skill batch size, and skill degree of parallelism to generate the load pattern your skill expects: fewer large requests or many small requests.

  • Monitor custom skills with detailed logs of failures, because data variability can cause specific requests to fail consistently.

Next steps

Congratulations! Your custom skill is now scaled to maximize throughput on the indexer.