Handle errors and exceptions in Azure Logic Apps

Handling downtime or issues caused by dependent systems is a challenge for any integration architecture. To help you create robust and resilient integrations that gracefully handle problems and failures, Logic Apps provides a first-class experience for handling errors and exceptions.

Retry policies

For the most basic exception and error handling, you can use a retry policy in any action or trigger where supported. A retry policy specifies whether and how the action or trigger retries a request when the original request times out or fails, that is, when the request results in a 408, 429, or 5xx response. If you don't specify a retry policy, the default policy is used.

Here are the retry policy types:

Type | Description
Default | This policy sends up to four retries at exponentially increasing intervals, which scale by 7.5 seconds but are capped between 5 and 45 seconds.
Exponential interval | This policy waits a random interval selected from an exponentially growing range before sending the next request.
Fixed interval | This policy waits the specified interval before sending the next request.
None | Don't resend the request.

For information about retry policy limits, see Logic Apps limits and configuration.

Change retry policy

To select a different retry policy, follow these steps:

  1. Open your logic app in Logic App Designer.

  2. Open the Settings for an action or trigger.

  3. If the action or trigger supports retry policies, under Retry Policy, select the type you want.

Or, you can manually specify the retry policy in the inputs section for an action or trigger that supports retry policies. If you don't specify a retry policy, the action uses the default policy.

"<action-name>": {
   "type": "<action-type>", 
   "inputs": {
      "<action-specific-inputs>",
      "retryPolicy": {
         "type": "<retry-policy-type>",
         "interval": "<retry-interval>",
         "count": <retry-attempts>,
         "minimumInterval": "<minimum-interval>",
         "maximumInterval": "<maximum-interval>"
      },
      "<other-action-specific-inputs>"
   },
   "runAfter": {}
}

Required

Value | Type | Description
<retry-policy-type> | String | The retry policy type you want to use: default, none, fixed, or exponential
<retry-interval> | String | The retry interval, where the value must use ISO 8601 format. The default minimum interval is PT5S and the maximum interval is PT1D. When you use the exponential interval policy, you can specify different minimum and maximum values.
<retry-attempts> | Integer | The number of retry attempts, which must be between 1 and 90

Optional

Value | Type | Description
<minimum-interval> | String | For the exponential interval policy, the smallest interval for the randomly selected interval, in ISO 8601 format
<maximum-interval> | String | For the exponential interval policy, the largest interval for the randomly selected interval, in ISO 8601 format

Here is more information about the different policy types.

Default

If you don't specify a retry policy, the action uses the default policy, which is actually an exponential interval policy that sends up to four retries at exponentially increasing intervals that are scaled by 7.5 seconds. The interval is capped between 5 and 45 seconds.

Though not explicitly defined in your action or trigger, here is how the default policy behaves in an example HTTP action:

"HTTP": {
   "type": "Http",
   "inputs": {
      "method": "GET",
      "uri": "http://myAPIendpoint/api/action",
      "retryPolicy" : {
         "type": "exponential",
         "interval": "PT7S",
         "count": 4,
         "minimumInterval": "PT5S",
         "maximumInterval": "PT1H"
      }
   },
   "runAfter": {}
}

None

To specify that the action or trigger doesn't retry failed requests, set the <retry-policy-type> to none.
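
As a minimal sketch of this setting, the following HTTP action turns off retries. The action name and URI are hypothetical placeholders, not values from this article:

```json
"Check_status_once": {
   "type": "Http",
   "inputs": {
      "method": "GET",
      "uri": "https://status.example.com/health",
      "retryPolicy": {
         "type": "none"
      }
   },
   "runAfter": {}
}
```

With this policy, the request isn't resent after a 408, 429, or 5xx response, so the action fails on the first unsuccessful attempt.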

Fixed interval

To specify that the action or trigger waits the specified interval before sending the next request, set the <retry-policy-type> to fixed.

Example

This retry policy attempts to get the latest news two more times after the first failed request, with a 30-second delay between each attempt:

"Get_latest_news": {
   "type": "Http",
   "inputs": {
      "method": "GET",
      "uri": "https://mynews.example.com/latest",
      "retryPolicy": {
         "type": "fixed",
         "interval": "PT30S",
         "count": 2
      }
   }
}

Exponential interval

To specify that the action or trigger waits a random interval before sending the next request, set the <retry-policy-type> to exponential. The random interval is selected from an exponentially growing range. Optionally, you can override the default minimum and maximum intervals by specifying your own minimum and maximum intervals.

Random variable ranges

This table shows how Logic Apps generates a uniform random variable in the specified range for each retry, up to and including the number of retries:

Retry number | Minimum interval | Maximum interval
1 | max(0, <minimum-interval>) | min(interval, <maximum-interval>)
2 | max(interval, <minimum-interval>) | min(2 * interval, <maximum-interval>)
3 | max(2 * interval, <minimum-interval>) | min(4 * interval, <maximum-interval>)
4 | max(4 * interval, <minimum-interval>) | min(8 * interval, <maximum-interval>)
... | ... | ...
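
To make the table concrete, here is a sketch of an exponential interval policy. The action name, URI, and the specific interval values are illustrative assumptions, not values from this article:

```json
"Get_latest_news_with_backoff": {
   "type": "Http",
   "inputs": {
      "method": "GET",
      "uri": "https://mynews.example.com/latest",
      "retryPolicy": {
         "type": "exponential",
         "interval": "PT10S",
         "count": 3,
         "minimumInterval": "PT5S",
         "maximumInterval": "PT1M"
      }
   }
}
```

Applying the formulas in the table to these values (interval of 10 seconds, minimum of 5 seconds, maximum of 60 seconds), the wait before each retry would be drawn from these ranges: retry 1 from 5 to 10 seconds, retry 2 from 10 to 20 seconds, and retry 3 from 20 to 40 seconds. A fourth retry, if the count allowed one, would fall between 40 and 60 seconds because the maximum interval caps the upper bound.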

Catch and handle failures by changing "run after" behavior

When you add actions in the Logic App Designer, you implicitly declare the order to use for running those actions. After an action finishes running, that action is marked with a status such as Succeeded, Failed, Skipped, or TimedOut. In each action definition, the runAfter property specifies the predecessor action that must first finish and the statuses permitted for that predecessor before the successor action can run. By default, an action that you add in the designer runs only after the predecessor completes with Succeeded status.

When an action throws an unhandled error or exception, the action is marked Failed, and any successor action is marked Skipped. If this behavior happens for an action that has parallel branches, the Logic Apps engine follows the other branches to determine their completion statuses. For example, if a branch ends with a Skipped action, that branch's completion status is based on that skipped action's predecessor status. After the logic app run completes, the engine determines the entire run's status by evaluating all the branch statuses. If any branch ends in failure, the entire logic app run is marked Failed.

[Figure: Example that shows how run statuses are evaluated]

To make sure that an action can still run despite its predecessor's status, customize the action's "run after" behavior to handle the predecessor's unsuccessful statuses.

Customize "run after" behavior

You can customize an action's "run after" behavior so that the action runs when the predecessor's status is Succeeded, Failed, Skipped, TimedOut, or any combination of these statuses. For example, to send an email after the Excel Online Add_a_row_into_a_table predecessor action is marked Failed, rather than Succeeded, change the "run after" behavior by using either of the following methods:

  • In the design view, select the ellipsis (...) button, and then select Configure run after.

    [Figure: Configure "run after" behavior for an action]

    The action shape shows the default status that's required for the predecessor action, which is Add a row into a table in this example:

    [Figure: Default "run after" behavior for an action]

    Change the "run after" behavior to the status that you want, which is has failed in this example:

    [Figure: Change "run after" behavior to "has failed"]

    To specify that the action runs whether the predecessor action is marked as Failed, Skipped, or TimedOut, select the other statuses:

    [Figure: Change "run after" behavior to have any other status]

  • In code view, in the action's JSON definition, edit the runAfter property, which follows this syntax:

    "<action-name>": {
       "inputs": {
          "<action-specific-inputs>"
       },
       "runAfter": {
          "<preceding-action>": [
             "Succeeded"
          ]
       },
       "type": "<action-type>"
    }
    

    For this example, change the runAfter property from Succeeded to Failed:

    "Send_an_email_(V2)": {
       "inputs": {
          "body": {
             "Body": "<p>Failed to&nbsp;add row to &nbsp;@{body('Add_a_row_into_a_table')?['Terms']}</p>",
             "Subject": "Add row to table failed: @{body('Add_a_row_into_a_table')?['Terms']}",
             "To": "Sophia.Owen@fabrikam.com"
          },
          "host": {
             "connection": {
                "name": "@parameters('$connections')['office365']['connectionId']"
             }
          },
          "method": "post",
          "path": "/v2/Mail"
       },
       "runAfter": {
          "Add_a_row_into_a_table": [
             "Failed"
          ]
       },
       "type": "ApiConnection"
    }
    

    To specify that the action runs whether the predecessor action is marked as Failed, Skipped, or TimedOut, add the other statuses:

    "runAfter": {
       "Add_a_row_into_a_table": [
          "Failed", "Skipped", "TimedOut"
       ]
    },
    

Evaluate actions with scopes and their results

Similar to running steps after individual actions with the runAfter property, you can group actions together inside a scope. You can use scopes when you want to logically group actions together, assess the scope's aggregate status, and perform actions based on that status. After all the actions in a scope finish running, the scope itself gets its own status.

To check a scope's status, you can use the same criteria that you use to check a logic app's run status, such as Succeeded, Failed, and so on.

By default, when all the scope's actions succeed, the scope's status is marked Succeeded. If the final action in a scope ends with a Failed or Aborted status, the scope's status is marked Failed.

To catch exceptions in a Failed scope and run actions that handle those errors, you can use the runAfter property for that Failed scope. That way, if any actions in the scope fail, and you use the runAfter property for that scope, you can create a single action to catch failures.
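
The following sketch shows one way this pattern might look in code view: a scope that wraps an HTTP call, followed by a single action that runs only when the scope fails or times out. The action names other than "My_Scope", both URIs, and the message body are hypothetical and only illustrate the shape of the definition:

```json
"My_Scope": {
   "type": "Scope",
   "actions": {
      "Call_backend": {
         "type": "Http",
         "inputs": {
            "method": "GET",
            "uri": "https://api.example.com/orders"
         },
         "runAfter": {}
      }
   },
   "runAfter": {}
},
"Handle_scope_failure": {
   "type": "Http",
   "inputs": {
      "method": "POST",
      "uri": "https://alerts.example.com/notify",
      "body": {
         "message": "My_Scope did not complete successfully"
      }
   },
   "runAfter": {
      "My_Scope": [
         "Failed",
         "TimedOut"
      ]
   }
}
```

Because the handler's runAfter property points at the scope rather than at an individual action, you don't need a separate handler for each action inside the scope.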

For limits on scopes, see Limits and configuration.

Get context and results for failures

Although catching failures from a scope is useful, you might also want context to help you understand exactly which actions failed plus any errors or status codes that were returned.

The result() function provides context about the results from all the actions in a scope. The result() function accepts a single parameter, which is the scope's name, and returns an array that contains all the action results from within that scope. These action objects include the same attributes as the actions() object, such as the action's start time, end time, status, inputs, correlation IDs, and outputs. To send context for any actions that failed within a scope, you can pair a @result() expression with the runAfter property.

To run an action for each action in a scope that has a Failed result, and to filter the array of results down to the failed actions, you can pair a @result() expression with a Filter Array action and a For each loop. You can take the filtered result array and perform an action for each failure by using the For each loop.

Here's an example, followed by a detailed explanation, that sends an HTTP POST request with the response body for any actions that failed within the scope "My_Scope":

"Filter_array": {
   "type": "Query",
   "inputs": {
      "from": "@result('My_Scope')",
      "where": "@equals(item()['status'], 'Failed')"
   },
   "runAfter": {
      "My_Scope": [
         "Failed"
      ]
   }
},
"For_each": {
   "type": "foreach",
   "actions": {
      "Log_exception": {
         "type": "Http",
         "inputs": {
            "method": "POST",
            "body": "@item()['outputs']['body']",
            "headers": {
               "x-failed-action-name": "@item()['name']",
               "x-failed-tracking-id": "@item()['clientTrackingId']"
            },
            "uri": "http://requestb.in/"
         },
         "runAfter": {}
      }
   },
   "foreach": "@body('Filter_array')",
   "runAfter": {
      "Filter_array": [
         "Succeeded"
      ]
   }
}

Here's a detailed walkthrough that describes what happens in this example:

  1. To get the results from all actions inside "My_Scope", the Filter Array action uses this expression: @result('My_Scope')

  2. The condition for Filter Array is any @result() item that has a status equal to Failed. This condition filters the array that has all the action results from "My_Scope" down to an array with only the failed action results.

  3. Perform a For each loop action on the filtered array outputs. This step performs an action for each failed action result that was previously filtered.

    If a single action in the scope failed, the actions in the For each loop run only once. If multiple actions failed, the loop runs once for each failure.

  4. Send an HTTP POST on the For each item response body, which is the @item()['outputs']['body'] expression.

    The @result() item shape is the same as the @actions() shape and can be parsed the same way.

  5. Include two custom headers with the failed action name (@item()['name']) and the failed run client tracking ID (@item()['clientTrackingId']).

For reference, here's an example of a single @result() item, showing the name, body, and clientTrackingId properties that are parsed in the previous example. Outside a For each action, @result() returns an array of these objects.

{
   "name": "Example_Action_That_Failed",
   "inputs": {
      "uri": "https://myfailedaction.azurewebsites.net",
      "method": "POST"
   },
   "outputs": {
      "statusCode": 404,
      "headers": {
         "Date": "Thu, 11 Aug 2016 03:18:18 GMT",
         "Server": "Microsoft-IIS/8.0",
         "X-Powered-By": "ASP.NET",
         "Content-Length": "68",
         "Content-Type": "application/json"
      },
      "body": {
         "code": "ResourceNotFound",
         "message": "/docs/folder-name/resource-name does not exist"
      }
   },
   "startTime": "2016-08-11T03:18:19.7755341Z",
   "endTime": "2016-08-11T03:18:20.2598835Z",
   "trackingId": "bdd82e28-ba2c-4160-a700-e3a8f1a38e22",
   "clientTrackingId": "08587307213861835591296330354",
   "code": "NotFound",
   "status": "Failed"
}

To perform different exception handling patterns, you can use the expressions previously described in this article. You might choose to execute a single exception handling action outside the scope that accepts the entire filtered array of failures, and remove the For each action. You can also include other useful properties from the @result() response as previously described.
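
As a sketch of that single-action variant, the following action could replace the For_each loop from the earlier example and post the whole filtered array in one request. The action name and URI are hypothetical, and the example assumes the Filter_array action defined earlier:

```json
"Log_all_exceptions": {
   "type": "Http",
   "inputs": {
      "method": "POST",
      "uri": "https://logging.example.com/failures",
      "body": "@body('Filter_array')"
   },
   "runAfter": {
      "Filter_array": [
         "Succeeded"
      ]
   }
}
```

This variant sends one request per run instead of one per failed action, which can be simpler when the receiving endpoint accepts an array.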

Azure Diagnostics and metrics

The previous patterns are a great way to handle errors and exceptions within a run, but you can also identify and respond to errors independent of the run itself. Azure Diagnostics provides a simple way to send all workflow events, including all run and action statuses, to an Azure Storage account or an event hub created with Azure Event Hubs.

To evaluate run statuses, you can monitor the logs and metrics, or publish them into any monitoring tool that you prefer. One potential option is to stream all the events through Event Hubs into Azure Stream Analytics. In Stream Analytics, you can write live queries based on any anomalies, averages, or failures from the diagnostic logs. You can use Stream Analytics to send information to other data sources, such as queues, topics, SQL, Azure Cosmos DB, or Power BI.

Next steps