Liveness probe

Starting with version 7.1, Azure Service Fabric supports a liveness probe mechanism for containerized applications. A liveness probe reports whether a containerized application is alive; a container that repeatedly fails to respond in time is restarted. This article provides an overview of how to define a liveness probe by using manifest files.

Before you proceed with this article, become familiar with the Service Fabric application model and the Service Fabric hosting model.

Note

Liveness probes are supported only for containers in NAT networking mode.

Semantics

You can specify only one liveness probe per container. Control its behavior by using these fields:

  • initialDelaySeconds: The initial delay, in seconds, between container start and the first execution of the probe. The value must be an integer. The default is 0 and the minimum is 0.

  • timeoutSeconds: The period, in seconds, after which the probe is considered failed if it hasn't finished successfully. The value must be an integer. The default is 1 and the minimum is 1.

  • periodSeconds: The interval, in seconds, between probe executions. The value must be an integer. The default is 10 and the minimum is 1.

  • failureThreshold: The number of failures after which the container restarts. The value must be an integer. The default is 3 and the minimum is 1.

  • successThreshold: After a failure, the number of consecutive successful runs required for the probe to be considered successful again. The value must be an integer. The default is 1 and the minimum is 1.

At most one probe runs against a container at any moment. If a probe doesn't finish within the time set in timeoutSeconds, Service Fabric waits and counts the timeout toward failureThreshold.
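The defaults and minimums listed above can be captured in a small validation sketch. This is only an illustration: the `ProbeSettings` class, the `MINIMUMS` table, and the `validate` function are hypothetical names, not part of Service Fabric, which performs its own validation of the manifest.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProbeSettings:
    """Hypothetical container for the five probe fields, using the documented defaults."""
    initial_delay_seconds: int = 0
    timeout_seconds: int = 1
    period_seconds: int = 10
    failure_threshold: int = 3
    success_threshold: int = 1


# Documented minimum for each field.
MINIMUMS = {
    "initial_delay_seconds": 0,
    "timeout_seconds": 1,
    "period_seconds": 1,
    "failure_threshold": 1,
    "success_threshold": 1,
}


def validate(settings: ProbeSettings) -> List[str]:
    """Return a list of human-readable problems; an empty list means the settings are valid."""
    problems = []
    for name, minimum in MINIMUMS.items():
        value = getattr(settings, name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            problems.append(f"{name} must be an integer")
        elif value < minimum:
            problems.append(f"{name} must be >= {minimum}, got {value}")
    return problems
```

For example, `validate(ProbeSettings(failure_threshold=5, success_threshold=2, initial_delay_seconds=10, period_seconds=30, timeout_seconds=20))` returns an empty list, while a `timeout_seconds` of 0 is reported as a problem.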

Additionally, Service Fabric raises the following probe health reports on DeployedServicePackage:

  • OK: The probe succeeds for the number of times set in successThreshold.

  • Error: failureCount equals failureThreshold. This report is raised before the container restarts.

  • Warning:

    • The probe fails and failureCount < failureThreshold. This health report stays until failureCount reaches failureThreshold or the number of consecutive successes reaches successThreshold.
    • When the probe succeeds after a failure, the warning remains but is updated with the count of consecutive successes.
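The transitions between OK, Warning, and Error can be sketched as a small state tracker. This is a simplified model, not Service Fabric's implementation. It rests on assumptions that the text above doesn't confirm: the failure count is cleared only after successThreshold consecutive successes, a new failure resets the success streak, and the counters start from zero after a restart. The `ProbeHealthTracker` class is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class ProbeHealthTracker:
    """Hypothetical model of the probe health report for one container."""
    failure_threshold: int = 3
    success_threshold: int = 1
    failure_count: int = 0
    success_count: int = 0

    def record(self, succeeded: bool) -> str:
        """Record one probe result and return the resulting health state."""
        if not succeeded:
            self.failure_count += 1
            self.success_count = 0  # assumption: a failure breaks the success streak
            if self.failure_count >= self.failure_threshold:
                # Error is reported before the container restarts.
                # Assumption: counters start from zero after the restart.
                self.failure_count = 0
                return "Error"
            return "Warning"

        if self.failure_count == 0:
            return "OK"  # no outstanding failures

        # Success after a failure: the warning remains until enough
        # consecutive successes have been observed.
        self.success_count += 1
        if self.success_count >= self.success_threshold:
            self.failure_count = 0
            self.success_count = 0
            return "OK"
        return "Warning"
```

With `failure_threshold=3` and `success_threshold=2`, a single failure followed by one success still reports Warning under this model; the second consecutive success returns the report to OK.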

Specifying a liveness probe

You can specify a probe in the ApplicationManifest.xml file under ServiceManifestImport.

The probe can be any of the following types:

  • HTTP
  • TCP
  • Exec

HTTP probe

For an HTTP probe, Service Fabric sends an HTTP request to the port and path that you specify. A return code that is greater than or equal to 200 and less than 400 indicates success.

Here is an example of how to specify an HTTP probe:

  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="Stateless1Pkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
    <Policies>
      <CodePackagePolicy CodePackageRef="Code">
        <Probes>
          <Probe Type="Liveness" FailureThreshold="5" SuccessThreshold="2" InitialDelaySeconds="10" PeriodSeconds="30" TimeoutSeconds="20">
            <HttpGet Path="/" Port="8081" Scheme="http">
              <HttpHeader Name="Foo" Value="Val"/>
              <HttpHeader Name="Bar" Value="val1"/>
            </HttpGet>
          </Probe>
        </Probes>
      </CodePackagePolicy>
    </Policies>
  </ServiceManifestImport>

The HTTP probe has additional properties that you can set:

  • path: The path to use in the HTTP request.

  • port: The port to use for probes. This property is mandatory. The range is 1 to 65535.

  • scheme: The scheme to use for connecting to the code package. If this property is set to HTTPS, certificate verification is skipped. The default is HTTP.

  • httpHeader: The headers to set in the request. You can specify multiple headers.

  • host: The host IP address to connect to.
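To make the HTTP success rule and the properties above concrete, here is a minimal Python sketch of an equivalent check. It is not Service Fabric's implementation; the function names `is_success_status` and `http_probe` are hypothetical, and using a GET request is an assumption based on the HttpGet element name. The sketch skips certificate verification for HTTPS, as the scheme property describes.

```python
import http.client
import ssl
from typing import Dict, Optional


def is_success_status(status: int) -> bool:
    """A return code of at least 200 and below 400 indicates success."""
    return 200 <= status < 400


def http_probe(host: str, port: int, path: str = "/", scheme: str = "http",
               headers: Optional[Dict[str, str]] = None,
               timeout_seconds: float = 1.0) -> bool:
    """Send one request and report whether it counts as a successful probe."""
    if scheme.lower() == "https":
        # Certificate verification is skipped for HTTPS probes.
        context = ssl.create_default_context()
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        conn = http.client.HTTPSConnection(host, port, timeout=timeout_seconds,
                                           context=context)
    else:
        conn = http.client.HTTPConnection(host, port, timeout=timeout_seconds)
    try:
        conn.request("GET", path, headers=headers or {})
        return is_success_status(conn.getresponse().status)
    except (OSError, http.client.HTTPException):
        # Connection errors and timeouts count as a failed probe.
        return False
    finally:
        conn.close()
```

A call such as `http_probe("127.0.0.1", 8081, path="/", headers={"Foo": "Val", "Bar": "val1"}, timeout_seconds=20)` mirrors the manifest example above; the loopback address is a placeholder for the container's address.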

TCP probe

For a TCP probe, Service Fabric tries to open a socket on the container by using the specified port. If it can establish a connection, the probe is considered successful. Here's an example of how to specify a probe that uses a TCP socket:

  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="Stateless1Pkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
    <Policies>
      <CodePackagePolicy CodePackageRef="Code">
        <Probes>
          <Probe Type="Liveness" FailureThreshold="5" SuccessThreshold="2" InitialDelaySeconds="10" PeriodSeconds="30" TimeoutSeconds="20">
            <TcpSocket Port="8081"/>
          </Probe>
        </Probes>
      </CodePackagePolicy>
    </Policies>
  </ServiceManifestImport>
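The connection check that a TCP probe performs can be approximated in a few lines of Python. This is a sketch for local testing of a container endpoint, not Service Fabric's implementation; `tcp_probe` is a hypothetical name.

```python
import socket


def tcp_probe(host: str, port: int, timeout_seconds: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established in time."""
    try:
        # create_connection raises OSError on refusal, unreachable host, or timeout.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        return False
```

For example, `tcp_probe("127.0.0.1", 8081, timeout_seconds=20)` checks the port used in the manifest above; the loopback address is a placeholder for the container's address.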

Exec probe

This probe issues an exec command in the container and waits for the command to finish.

Note

The Exec command takes a comma-separated string. The command in the following example works for a Linux container. To probe a Windows container, use cmd.

  <ServiceManifestImport>
    <ServiceManifestRef ServiceManifestName="Stateless1Pkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
    <Policies>
      <CodePackagePolicy CodePackageRef="Code">
        <Probes>
          <Probe Type="Liveness" FailureThreshold="5" SuccessThreshold="2" InitialDelaySeconds="10" PeriodSeconds="30" TimeoutSeconds="20">
            <Exec>
              <Command>ping,-c,2,localhost</Command>
            </Exec>
          </Probe>
        </Probes>
      </CodePackagePolicy>
    </Policies>
  </ServiceManifestImport>
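The comma-separated command format can be illustrated with a short Python sketch that splits the string into an argument list and runs it. This is not how Service Fabric executes the probe. Treating exit code 0 as success is an assumption that this article doesn't state, and `split_command` and `exec_probe` are hypothetical names.

```python
import subprocess
from typing import List


def split_command(command: str) -> List[str]:
    """Split a comma-separated Exec command into an argument list."""
    return [part.strip() for part in command.split(",")]


def exec_probe(argv: List[str], timeout_seconds: float = 1.0) -> bool:
    """Run the command and wait for it to finish, up to the timeout."""
    try:
        completed = subprocess.run(argv, timeout=timeout_seconds,
                                   stdout=subprocess.DEVNULL,
                                   stderr=subprocess.DEVNULL)
    except (OSError, subprocess.TimeoutExpired):
        # A missing executable or an overrun counts as a failed probe.
        return False
    # Assumption: exit code 0 means the probe succeeded.
    return completed.returncode == 0
```

Here `split_command("ping,-c,2,localhost")` yields `["ping", "-c", "2", "localhost"]`, which `exec_probe` passes to the operating system without involving a shell.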

Next steps

See the following article for related information: