Query EventStore APIs for cluster events

This article covers how to query the EventStore APIs that are available in Service Fabric version 6.2 and later. To learn more about the EventStore service, see the EventStore service overview. Currently, the EventStore service can only access data for the last 7 days (this is based on your cluster's diagnostics data retention policy).

Note

The EventStore APIs are generally available (GA) as of Service Fabric version 6.4, for Windows clusters running on Azure only.

The EventStore APIs can be accessed directly via a REST endpoint, or programmatically. Depending on the query, several parameters are required to gather the right data. These parameters typically include:

  • api-version: the version of the EventStore APIs you are using
  • StartTimeUtc: the start of the time period you are interested in
  • EndTimeUtc: the end of the time period

In addition to these parameters, optional parameters are available as well, such as:

  • timeout: overrides the default 60-second timeout for performing the request operation
  • eventstypesfilter: lets you filter for specific event types
  • ExcludeAnalysisEvents: do not return "Analysis" events. By default, EventStore queries return Analysis events where possible. Analysis events are richer operational channel events that contain additional context or information beyond a regular Service Fabric event and provide more depth.
  • SkipCorrelationLookup: do not look for potential correlated events in the cluster. By default, the EventStore attempts to correlate events across a cluster and links your events together when possible.
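As a rough illustration of how the required and optional parameters combine into a request URL, here is a minimal Python sketch. The helper `build_events_url` and its keyword arguments are hypothetical and not part of any Service Fabric SDK; only the query parameter names come from the lists above. Sending booleans as lowercase `true`/`false` is an assumption.

```python
from urllib.parse import urlencode


def build_events_url(cluster_address, entity_path, start_time_utc, end_time_utc,
                     api_version="6.4", timeout=None, events_types_filter=None,
                     exclude_analysis_events=None, skip_correlation_lookup=None):
    """Assemble an EventStore query URL (hypothetical helper, not an SDK call)."""
    # Required parameters.
    params = [
        ("api-version", api_version),
        ("StartTimeUtc", start_time_utc),
        ("EndTimeUtc", end_time_utc),
    ]
    # Optional parameters are added only when the caller sets them.
    if timeout is not None:
        params.append(("timeout", str(timeout)))
    if events_types_filter:
        params.append(("EventsTypesFilter", ",".join(events_types_filter)))
    if exclude_analysis_events is not None:
        # Assumption: booleans are sent as lowercase "true"/"false".
        params.append(("ExcludeAnalysisEvents", str(exclude_analysis_events).lower()))
    if skip_correlation_lookup is not None:
        params.append(("SkipCorrelationLookup", str(skip_correlation_lookup).lower()))
    # Leave ':' and ',' unescaped so timestamps and filters stay readable.
    query = urlencode(params, safe=":,")
    return f"{cluster_address.rstrip('/')}{entity_path}?{query}"


print(build_events_url(
    "http://mycluster:19080", "/EventsStore/Cluster/Events",
    "2018-04-03T18:00:00Z", "2018-04-04T18:00:00Z"))
```

Leaving the optional arguments at `None` keeps the service defaults described above, so a caller only names the behavior it wants to change.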

Each entity in a cluster can be queried for events. You can also query for events for all entities of a given type. For example, you can query for events for a specific node, or for all nodes in your cluster. The current set of entities for which you can query events, along with the structure of each query, is:

  • Cluster: /EventsStore/Cluster/Events
  • Nodes: /EventsStore/Nodes/Events
  • Node: /EventsStore/Nodes/<NodeName>/$/Events
  • Applications: /EventsStore/Applications/Events
  • Application: /EventsStore/Applications/<AppName>/$/Events
  • Services: /EventsStore/Services/Events
  • Service: /EventsStore/Services/<ServiceName>/$/Events
  • Partitions: /EventsStore/Partitions/Events
  • Partition: /EventsStore/Partitions/<PartitionID>/$/Events
  • Replicas: /EventsStore/Partitions/<PartitionID>/$/Replicas/Events
  • Replica: /EventsStore/Partitions/<PartitionID>/$/Replicas/<ReplicaID>/$/Events

Note

When referencing an application or service name, the query doesn't need to include the "fabric:/" prefix. Additionally, if your application or service names have a "/" in them, replace it with a "~" to keep the query working. For example, if your application shows up as "fabric:/App1/FrontendApp", your app-specific queries would be structured as /EventsStore/Applications/App1~FrontendApp/$/Events. Additionally, health reports for services currently show up under the corresponding application, so you would query for DeployedServiceHealthReportCreated events for the right application entity.
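The name conversion described in this note can be sketched in a few lines of Python. Both function names below are hypothetical helpers introduced for illustration; the conversion rule and the path layout are the ones stated above.

```python
def entity_name_for_query(fabric_name):
    """Drop the 'fabric:/' prefix and replace '/' with '~'."""
    prefix = "fabric:/"
    name = fabric_name[len(prefix):] if fabric_name.startswith(prefix) else fabric_name
    return name.replace("/", "~")


def application_events_path(fabric_name):
    """Build the events path for a single application."""
    return f"/EventsStore/Applications/{entity_name_for_query(fabric_name)}/$/Events"


print(application_events_path("fabric:/App1/FrontendApp"))
# prints /EventsStore/Applications/App1~FrontendApp/$/Events
```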

Query the EventStore via REST API endpoints

You can query the EventStore directly via a REST endpoint, by making GET requests to: <your cluster address>/EventsStore/<entity>/Events/.

For example, to query for all Cluster events between 2018-04-03T18:00:00Z and 2018-04-04T18:00:00Z, your request would look like:

Method: GET 
URL: http://mycluster:19080/EventsStore/Cluster/Events?api-version=6.4&StartTimeUtc=2018-04-03T18:00:00Z&EndTimeUtc=2018-04-04T18:00:00Z

This returns either no events or a list of events in JSON:

Response: 200
Body:
[
  {
    "Kind": "ClusterUpgradeStart",
    "CurrentClusterVersion": "0.0.0.0:",
    "TargetClusterVersion": "6.2:1.0",
    "UpgradeType": "Rolling",
    "RollingUpgradeMode": "UnmonitoredAuto",
    "FailureAction": "Manual",
    "EventInstanceId": "090add3c-8f56-4d35-8d57-a855745b6064",
    "TimeStamp": "2018-04-03T20:18:59.4313064Z",
    "HasCorrelatedEvents": false
  },
  {
    "Kind": "ClusterUpgradeDomainComplete",
    "TargetClusterVersion": "6.2:1.0",
    "UpgradeState": "RollingForward",
    "UpgradeDomains": "(0 1 2)",
    "UpgradeDomainElapsedTimeInMs": "78.5288",
    "EventInstanceId": "090add3c-8f56-4d35-8d57-a855745b6064",
    "TimeStamp": "2018-04-03T20:19:59.5729953Z",
    "HasCorrelatedEvents": false
  },
  {
    "Kind": "ClusterUpgradeDomainComplete",
    "TargetClusterVersion": "6.2:1.0",
    "UpgradeState": "RollingForward",
    "UpgradeDomains": "(3 4)",
    "UpgradeDomainElapsedTimeInMs": "0",
    "EventInstanceId": "090add3c-8f56-4d35-8d57-a855745b6064",
    "TimeStamp": "2018-04-03T20:20:59.6271949Z",
    "HasCorrelatedEvents": false
  },
  {
    "Kind": "ClusterUpgradeComplete",
    "TargetClusterVersion": "6.2:1.0",
    "OverallUpgradeElapsedTimeInMs": "120196.5212",
    "EventInstanceId": "090add3c-8f56-4d35-8d57-a855745b6064",
    "TimeStamp": "2018-04-03T20:20:59.8134457Z",
    "HasCorrelatedEvents": false
  }
]

Here we can see that between 2018-04-03T18:00:00Z and 2018-04-04T18:00:00Z, this cluster successfully completed its first upgrade when it was first stood up, from "CurrentClusterVersion": "0.0.0.0:" to "TargetClusterVersion": "6.2:1.0", in "OverallUpgradeElapsedTimeInMs": "120196.5212" (about 120 seconds).
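To show how a response like this could be read by a script rather than by eye, the following Python sketch parses a shortened copy of the sample body and extracts the same three facts. The function `summarize_upgrade` is a hypothetical helper; the field names are taken from the sample response, and treating the elapsed time as a string of milliseconds follows how it appears there.

```python
import json

# Shortened copy of the sample response body shown above.
body = """[
  {"Kind": "ClusterUpgradeStart", "CurrentClusterVersion": "0.0.0.0:",
   "TargetClusterVersion": "6.2:1.0", "TimeStamp": "2018-04-03T20:18:59.4313064Z"},
  {"Kind": "ClusterUpgradeDomainComplete", "UpgradeDomains": "(0 1 2)",
   "TimeStamp": "2018-04-03T20:19:59.5729953Z"},
  {"Kind": "ClusterUpgradeDomainComplete", "UpgradeDomains": "(3 4)",
   "TimeStamp": "2018-04-03T20:20:59.6271949Z"},
  {"Kind": "ClusterUpgradeComplete", "TargetClusterVersion": "6.2:1.0",
   "OverallUpgradeElapsedTimeInMs": "120196.5212",
   "TimeStamp": "2018-04-03T20:20:59.8134457Z"}
]"""


def summarize_upgrade(events):
    """Return (from_version, to_version, elapsed_seconds), or None if no upgrade completed."""
    start = next((e for e in events if e["Kind"] == "ClusterUpgradeStart"), None)
    done = next((e for e in events if e["Kind"] == "ClusterUpgradeComplete"), None)
    if done is None:
        return None
    from_version = start["CurrentClusterVersion"] if start else None
    # The elapsed time is reported as a string of milliseconds.
    elapsed_seconds = float(done["OverallUpgradeElapsedTimeInMs"]) / 1000.0
    return from_version, done["TargetClusterVersion"], elapsed_seconds


events = json.loads(body)
print(summarize_upgrade(events))
```

An empty list, which is what a query returns when no events match, yields `None` instead of raising an error.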

Query the EventStore programmatically

You can also query the EventStore programmatically, via the Service Fabric client library.

Once you have your Service Fabric client set up, you can query for events by accessing the EventStore like this: sfhttpClient.EventsStore.<request>

Here is an example request for all cluster events between 2018-04-03T18:00:00Z and 2018-04-04T18:00:00Z, via the GetClusterEventListAsync function.

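// clusterUrl (a Uri) and settings are assumed to be defined elsewhere.
var sfhttpClient = ServiceFabricClientFactory.Create(clusterUrl, settings);

// Request all cluster events in the given UTC time window.
var clstrEvents = sfhttpClient.EventsStore.GetClusterEventListAsync(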
    "2018-04-03T18:00:00Z",
    "2018-04-04T18:00:00Z")
    .GetAwaiter()
    .GetResult()
    .ToList();

Here is another example that queries for the cluster health and for all node events in September 2018, and prints them out.

  const int timeoutSecs = 60;
  var clusterUrl = new Uri(@"http://localhost:19080"); // This example is for a Local cluster
  var sfhttpClient = ServiceFabricClientFactory.Create(clusterUrl);

  var clusterHealth = sfhttpClient.Cluster.GetClusterHealthAsync().GetAwaiter().GetResult();
  Console.WriteLine("Cluster Health: {0}", clusterHealth.AggregatedHealthState.Value.ToString());

  Console.WriteLine("Querying for node events...");
  var nodesEvents = sfhttpClient.EventsStore.GetNodesEventListAsync(
      "2018-09-01T00:00:00Z",
      "2018-09-30T23:59:59Z",
      timeoutSecs,
      "NodeDown,NodeUp")
      .GetAwaiter()
      .GetResult()
      .ToList();
  Console.WriteLine("Result Count: {0}", nodesEvents.Count());

  foreach (var nodeEvent in nodesEvents)
  {
      Console.Write("Node event happened at {0}, Node name: {1} ", nodeEvent.TimeStamp, nodeEvent.NodeName);
      if (nodeEvent is NodeDownEvent nodeDownEvent)
      {
          Console.WriteLine("(Node is down, and it was last up at {0})", nodeDownEvent.LastNodeUpAt);
      }
      else if (nodeEvent is NodeUpEvent nodeUpEvent)
      {
          Console.WriteLine("(Node is up, and it was last down at {0})", nodeUpEvent.LastNodeDownAt);
      }
  }

Sample scenarios and queries

Here are a few examples of how you can call the EventStore REST APIs to understand the status of your cluster.

Cluster upgrades:

To see the last time your cluster was upgraded successfully, or an upgrade was attempted, in the past week, query the APIs for recently completed upgrades by filtering on the "ClusterUpgradeCompleted" events in the EventStore:

https://mycluster.cloudapp.chinacloudapi.cn:19080/EventsStore/Cluster/Events?api-version=6.4&starttimeutc=2017-04-22T17:01:51Z&endtimeutc=2018-04-29T17:02:51Z&EventsTypesFilter=ClusterUpgradeCompleted

Cluster upgrade issues:

Similarly, if there were issues with a recent cluster upgrade, you could query for all events for the cluster entity. You'll see various events, including the initiation of upgrades and each upgrade domain (UD) through which the upgrade rolled successfully. You will also see events for the point at which the rollback started, and the corresponding health events. Here's the query you would use for this:

https://mycluster.cloudapp.chinacloudapi.cn:19080/EventsStore/Cluster/Events?api-version=6.4&starttimeutc=2017-04-22T17:01:51Z&endtimeutc=2018-04-29T17:02:51Z

Node status changes:

To see your node status changes over the last few days (when nodes went up or down, or were activated or deactivated, whether by the platform, the Chaos service, or user input), use the following query:

https://mycluster.cloudapp.chinacloudapi.cn:19080/EventsStore/Nodes/Events?api-version=6.4&starttimeutc=2017-04-22T17:01:51Z&endtimeutc=2018-04-29T17:02:51Z
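For queries phrased as "the last few days", the start and end timestamps have to be computed relative to the current time. A minimal Python sketch follows; `utc_window` is a hypothetical helper, and the timestamp format simply mirrors the sample URLs in this article.

```python
from datetime import datetime, timedelta, timezone


def utc_window(days, now=None):
    """Return (start, end) timestamp strings covering the last `days` days, in UTC.

    `now` should be a timezone-aware datetime; it defaults to the current time.
    """
    end = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    start = end - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return start.strftime(fmt), end.strftime(fmt)


start, end = utc_window(3)
print(f"starttimeutc={start}&endtimeutc={end}")
```

Because the EventStore only retains recent data (7 days by default, as noted at the start of this article), a window longer than the retention period will not return older events.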

Application events:

You can also track your recent application deployments and upgrades. Use the following query to see all application events in your cluster:

https://mycluster.cloudapp.chinacloudapi.cn:19080/EventsStore/Applications/Events?api-version=6.4&starttimeutc=2017-04-22T17:01:51Z&endtimeutc=2018-04-29T17:02:51Z

Historical health for an application:

In addition to seeing application lifecycle events, you may also want to see historical data on the health of a specific application. You can do this by specifying the name of the application for which you want to gather the data. Use this query to get all the application health events:

https://mycluster.cloudapp.chinacloudapi.cn:19080/EventsStore/Applications/myApp/$/Events?api-version=6.4&starttimeutc=2018-03-24T17:01:51Z&endtimeutc=2018-03-29T17:02:51Z&EventsTypesFilter=ApplicationNewHealthReport

If you want to include health events that may have expired (gone past their time to live (TTL)), add ,ApplicationHealthReportExpired to the end of the query, to filter on two types of events.
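Appending a second event type to an existing query can also be done programmatically. The Python sketch below is one possible approach; `add_event_types` is a hypothetical helper, and matching the parameter name case-insensitively is an assumption based on the mixed casing used in this article's sample URLs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_event_types(url, *event_types):
    """Return `url` with `event_types` merged into its EventsTypesFilter parameter."""
    parts = urlsplit(url)
    updated, found = [], False
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key.lower() == "eventstypesfilter":
            # Merge new types into the existing comma-separated list, skipping duplicates.
            existing = [t for t in value.split(",") if t]
            for event_type in event_types:
                if event_type not in existing:
                    existing.append(event_type)
            value, found = ",".join(existing), True
        updated.append((key, value))
    if not found:
        updated.append(("EventsTypesFilter", ",".join(event_types)))
    # Leave ':' and ',' unescaped so the URL matches the style of the samples.
    return urlunsplit(parts._replace(query=urlencode(updated, safe=":,")))


base = ("https://mycluster.cloudapp.chinacloudapi.cn:19080/EventsStore/Applications/myApp/$/Events"
        "?api-version=6.4&starttimeutc=2018-03-24T17:01:51Z&endtimeutc=2018-03-29T17:02:51Z"
        "&EventsTypesFilter=ApplicationNewHealthReport")
print(add_event_types(base, "ApplicationHealthReportExpired"))
```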

Historical health for all services in "myApp":

Currently, health report events for services show up as DeployedServicePackageNewHealthReport events under the corresponding application entity. To see how your services have been doing for "myApp", use the following query:

https://winlrc-staging-10.chinaeast.cloudapp.chinacloudapi.cn:19080/EventsStore/Applications/myapp/$/Events?api-version=6.4&starttimeutc=2017-04-22T17:01:51Z&endtimeutc=2018-04-29T17:02:51Z&EventsTypesFilter=DeployedServicePackageNewHealthReport

Partition reconfiguration:

To see all the partition movements that happened in your cluster, query for the PartitionReconfigured event. This can help you figure out which workloads ran on which node at specific times when diagnosing issues in your cluster. Here's a sample query that does that:

https://mycluster.cloudapp.chinacloudapi.cn:19080/EventsStore/Partitions/Events?api-version=6.4&starttimeutc=2018-04-22T17:01:51Z&endtimeutc=2018-04-29T17:02:51Z&EventsTypesFilter=PartitionReconfigured

Chaos service:

An event is exposed at the cluster level whenever the Chaos service is started or stopped. To see your recent use of the Chaos service, use the following query:

https://mycluster.cloudapp.chinacloudapi.cn:19080/EventsStore/Cluster/Events?api-version=6.4&starttimeutc=2017-04-22T17:01:51Z&endtimeutc=2018-04-29T17:02:51Z&EventsTypesFilter=ChaosStarted,ChaosStopped