EventStore Overview

Note

As of Service Fabric version 6.4, the EventStore APIs are available only for Windows clusters running on Azure. We are working on porting this functionality to Linux as well as our standalone clusters.

Overview

Introduced in version 6.2, the EventStore service is a monitoring option in Service Fabric. EventStore provides a way to understand the state of your cluster or workloads at a given point in time. It is a stateful Service Fabric service that maintains events from the cluster. The events are exposed through Service Fabric Explorer, REST, and APIs. EventStore queries the cluster directly to get diagnostics data on any entity in your cluster and should be used to help:

  • Diagnose issues in development or testing, or where you might be using a monitoring pipeline
  • Confirm that management actions you are taking on your cluster are being processed correctly
  • Get a "snapshot" of how Service Fabric is interacting with a particular entity

To see a full list of events available in the EventStore, see Service Fabric events.

Note

As of Service Fabric version 6.4, the EventStore APIs and UX are generally available for Azure Windows clusters. We are working on porting this functionality to Linux as well as our standalone clusters.

The EventStore service can be queried for events that are available for each entity and entity type in your cluster. This means you can query for events on the following levels:

  • Cluster: events specific to the cluster itself (for example, a cluster upgrade)
  • Nodes: all node-level events
  • Node: events specific to one node, identified by nodeName
  • Applications: all application-level events
  • Application: events specific to one application, identified by applicationId
  • Services: events from all services in your cluster
  • Service: events from a specific service, identified by serviceId
  • Partitions: events from all partitions
  • Partition: events from a specific partition, identified by partitionId
  • Partition Replicas: events from all replicas or instances within a specific partition, identified by partitionId
  • Partition Replica: events from a specific replica or instance, identified by replicaId and partitionId

To learn more about the API, see the EventStore API reference.
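
As a rough illustration of how the query levels listed above could map onto REST requests, the following Python sketch builds a query URL for each level. The path templates, the `api-version`, `StartTimeUtc`, and `EndTimeUtc` parameter names, and the `http://localhost:19080` endpoint used in the usage note are assumptions based on the public REST reference and are not stated in this article; verify them against the EventStore API reference. `EVENT_PATHS` and `build_events_url` are hypothetical names.

```python
from urllib.parse import quote, urlencode

# Path templates per query level. These are assumptions drawn from the
# public REST reference; verify them before relying on them.
EVENT_PATHS = {
    "cluster": "/EventsStore/Cluster/Events",
    "nodes": "/EventsStore/Nodes/Events",
    "node": "/EventsStore/Nodes/{nodeName}/$/Events",
    "applications": "/EventsStore/Applications/Events",
    "application": "/EventsStore/Applications/{applicationId}/$/Events",
    "services": "/EventsStore/Services/Events",
    "service": "/EventsStore/Services/{serviceId}/$/Events",
    "partitions": "/EventsStore/Partitions/Events",
    "partition": "/EventsStore/Partitions/{partitionId}/$/Events",
    "replicas": "/EventsStore/Partitions/{partitionId}/$/Replicas/Events",
    "replica": "/EventsStore/Partitions/{partitionId}/$/Replicas/{replicaId}/$/Events",
}


def build_events_url(base_url, level, start_utc, end_utc, api_version="6.4", **ids):
    """Return the query URL for one level; `ids` fill the {placeholders}."""
    template = EVENT_PATHS[level]
    # Percent-encode identifiers so names with special characters stay valid.
    path = template.format(**{k: quote(str(v), safe="") for k, v in ids.items()})
    query = urlencode({
        "api-version": api_version,
        "StartTimeUtc": start_utc,
        "EndTimeUtc": end_utc,
    })
    return f"{base_url.rstrip('/')}{path}?{query}"
```

For example, `build_events_url("http://localhost:19080", "node", "2018-04-03T18:00:00Z", "2018-04-04T18:00:00Z", nodeName="_Node_0")` yields a URL for the events of a single node. A level that needs an identifier raises `KeyError` if the identifier is not supplied.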

The EventStore service also has the ability to correlate events in your cluster. By looking at events that were written at the same time from different entities that may have impacted each other, the EventStore service is able to link these events to help identify causes for activities in your cluster. For example, if one of your applications becomes unhealthy without any induced changes, the EventStore will also look at other events exposed by the platform and could correlate this with an Error or Warning event. This helps with faster failure detection and root cause analysis.
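
To make the idea of time-based correlation concrete, here is a toy Python sketch that pairs events from different entities when their timestamps fall within a short window. It is not how the EventStore service implements correlation, which this article does not describe. The event shape (`time`, `entity`, `kind`), the 30-second window, and the function name `correlate_by_time` are all assumptions made for illustration.

```python
from datetime import timedelta


def correlate_by_time(events, window=timedelta(seconds=30)):
    """Pair events from different entities whose timestamps lie within `window`.

    Each event is a dict with hypothetical keys: "time" (datetime),
    "entity" (str), and "kind" (str).
    """
    ordered = sorted(events, key=lambda e: e["time"])
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second["time"] - first["time"] > window:
                break  # events are sorted, so later ones are further away
            if second["entity"] != first["entity"]:
                pairs.append((first["kind"], second["kind"]))
    return pairs
```

A real correlation would also need to consider which entities can actually affect each other; closeness in time alone only produces candidates.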

Enable EventStore on your cluster

Local Cluster

In fabricSettings.json in your cluster, add EventStoreService as an addOn feature and perform a cluster upgrade.

    "addOnFeatures": [
        "EventStoreService"
    ],
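
If you prefer to script the change, the following is a minimal Python sketch that adds the feature to an already loaded configuration object without duplicating it. It assumes you pass in the JSON object that directly contains `addOnFeatures`; where that object sits inside the file is not shown in this article. The function name `enable_event_store` is hypothetical.

```python
def enable_event_store(config):
    """Return a copy of `config` with EventStoreService in addOnFeatures.

    `config` is assumed to be the object that directly holds "addOnFeatures".
    """
    updated = dict(config)
    features = list(updated.get("addOnFeatures", []))
    if "EventStoreService" not in features:
        features.append("EventStoreService")
    updated["addOnFeatures"] = features
    return updated
```

The function leaves the input untouched and is safe to run more than once. The cluster upgrade itself still has to be performed separately.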

Azure cluster version 6.5+

If your Azure cluster gets upgraded to version 6.5 or higher, EventStore will be automatically enabled on your cluster. To opt out, you need to update your cluster template with the following:

  • Use an API version of 2019-03-01 or newer
  • Add the following code to the properties section of your cluster
    "fabricSettings": [
      …
    ],
    "eventStoreServiceEnabled": false
    
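The two opt-out steps above can be sketched as a small Python helper that edits a loaded Resource Manager template. It assumes the cluster is declared as a resource of type `Microsoft.ServiceFabric/clusters` in the top-level `resources` array and that `apiVersion` is a literal date string; templates that nest resources or use expressions for the version need different handling. The function name `opt_out_of_event_store` is hypothetical.

```python
CLUSTER_TYPE = "Microsoft.ServiceFabric/clusters"
MIN_API_VERSION = "2019-03-01"


def opt_out_of_event_store(template):
    """Disable EventStore on every cluster resource, in place.

    Returns the number of cluster resources that were updated.
    """
    changed = 0
    for resource in template.get("resources", []):
        if resource.get("type") != CLUSTER_TYPE:
            continue
        # API versions are ISO dates, so string comparison orders them.
        if resource.get("apiVersion", "") < MIN_API_VERSION:
            resource["apiVersion"] = MIN_API_VERSION
        resource.setdefault("properties", {})["eventStoreServiceEnabled"] = False
        changed += 1
    return changed
```

A return value of 0 means no cluster resource was found at the top level, which is a hint that the template is structured differently from what this sketch assumes.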

Azure cluster version 6.4

If you are using version 6.4, you can edit your Azure Resource Manager template to turn on the EventStore service. To do so, perform a cluster config upgrade and add the following code. You can use PlacementConstraints to put the replicas of the EventStore service on a specific NodeType, for example a NodeType dedicated to system services. The upgradeDescription section configures the config upgrade to trigger a restart on the nodes. You can remove the section in another update.

    "fabricSettings": [
          …
          …
          …,
         {
            "name": "EventStoreService",
            "parameters": [
              {
                "name": "TargetReplicaSetSize",
                "value": "3"
              },
              {
                "name": "MinReplicaSetSize",
                "value": "1"
              },
              {
                "name": "PlacementConstraints",
                "value": "(NodeType==<node_type_name_here>)"
              }
            ]
          }
        ],
        "upgradeDescription": {
          "forceRestart": true,
          "upgradeReplicaSetCheckTimeout": "10675199.02:48:05.4775807",
          "healthCheckWaitDuration": "00:01:00",
          "healthCheckStableDuration": "00:01:00",
          "healthCheckRetryTimeout": "00:5:00",
          "upgradeTimeout": "1:00:00",
          "upgradeDomainTimeout": "00:10:00",
          "healthPolicy": {
            "maxPercentUnhealthyNodes": 100,
            "maxPercentUnhealthyApplications": 100
          },
          "deltaHealthPolicy": {
            "maxPercentDeltaUnhealthyNodes": 0,
            "maxPercentUpgradeDomainDeltaUnhealthyNodes": 0,
            "maxPercentDeltaUnhealthyApplications": 0
          }
        }
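
When several templates need the same change, the EventStoreService entry shown above can be generated instead of pasted. The Python sketch below builds that entry and merges it into an existing `fabricSettings` list. The function names `event_store_section` and `upsert_section` are hypothetical, and the replica-count check is a sanity check added for this sketch, not a rule stated in this article.

```python
def event_store_section(node_type, target_replicas=3, min_replicas=1):
    """Build the EventStoreService entry for the fabricSettings array."""
    if not 1 <= min_replicas <= target_replicas:
        raise ValueError("need 1 <= min_replicas <= target_replicas")
    return {
        "name": "EventStoreService",
        "parameters": [
            # Values are strings, matching the template fragment above.
            {"name": "TargetReplicaSetSize", "value": str(target_replicas)},
            {"name": "MinReplicaSetSize", "value": str(min_replicas)},
            {"name": "PlacementConstraints", "value": f"(NodeType=={node_type})"},
        ],
    }


def upsert_section(fabric_settings, section):
    """Return a new list where `section` replaces any entry with the same name."""
    kept = [s for s in fabric_settings if s.get("name") != section["name"]]
    return kept + [section]
```

Replacing an existing entry by name, rather than appending blindly, keeps the template from ending up with two EventStoreService sections after repeated runs.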

Next steps