Diagnostics and performance monitoring for Reliable Actors

The Reliable Actors runtime emits EventSource events and performance counters. These provide insight into how the runtime is operating and help with troubleshooting and performance monitoring.

EventSource events

The EventSource provider name for the Reliable Actors runtime is "Microsoft-ServiceFabric-Actors". Events from this event source appear in the Diagnostics Events window when the actor application is being debugged in Visual Studio.

Examples of tools and technologies that help in collecting or viewing EventSource events are PerfView, Azure Diagnostics, Semantic Logging, and the Microsoft TraceEvent Library.

Keywords

All events that belong to the Reliable Actors EventSource are associated with one or more keywords. This makes it possible to filter the events that are collected. The following keyword bits are defined.

Bit | Description
0x1 | Set of important events that summarize the operation of the Fabric Actors runtime.
0x2 | Set of events that describe actor method calls. For more information, see the introductory topic on actors.
0x4 | Set of events related to actor state. For more information, see the topic on actor state management.
0x8 | Set of events related to turn-based concurrency in the actor. For more information, see the topic on concurrency.
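Because the keywords are individual bits, several of them can be combined into one filter mask. The following is a minimal sketch of that idea in Python. The enum member names and the is_selected helper are hypothetical labels chosen for illustration; only the bit values come from the table above. The match-any rule used here (an event is selected when it shares at least one bit with the mask) is how ETW-style keyword filtering usually behaves, and is stated as an assumption rather than as the documented behavior of any particular collection tool.

```python
from enum import IntFlag


class ActorKeyword(IntFlag):
    """Keyword bits from the table above (member names are illustrative)."""
    RUNTIME_SUMMARY = 0x1   # important events summarizing runtime operation
    METHOD_CALLS = 0x2      # actor method calls
    STATE = 0x4             # actor state
    CONCURRENCY = 0x8       # turn-based concurrency


def is_selected(event_keywords: int, filter_mask: int) -> bool:
    """Return True when the event shares at least one keyword bit with the mask.

    Assumption: match-any semantics, as commonly used for ETW keywords.
    """
    return (event_keywords & filter_mask) != 0


# Build a mask that selects method-call and state events.
mask = ActorKeyword.METHOD_CALLS | ActorKeyword.STATE

if __name__ == "__main__":
    print(hex(mask))                 # combined mask value
    print(is_selected(0x3, mask))    # event tagged with bits 0x1 and 0x2
    print(is_selected(0x8, mask))    # concurrency-only event
```

An event tagged with more than one bit, such as 0x3, is selected by a mask that contains either 0x1 or 0x2.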

Performance counters

The Reliable Actors runtime defines the following performance counter categories.

Category | Description
Service Fabric Actor | Counters specific to Azure Service Fabric actors, for example, the time taken to save actor state.
Service Fabric Actor Method | Counters specific to methods implemented by Service Fabric actors, for example, how often an actor method is invoked.

Each of these categories has one or more counters.

The Windows Performance Monitor application, which is available by default in the Windows operating system, can be used to collect and view performance counter data. Azure Diagnostics is another option for collecting performance counter data and uploading it to Azure tables.

Performance counter instance names

A cluster that has a large number of actor services or actor service partitions will have a large number of actor performance counter instances. The performance counter instance name identifies the specific partition and, if applicable, the actor method that the instance is associated with.

Service Fabric Actor category

For the category Service Fabric Actor, the counter instance names are in the following format:

ServiceFabricPartitionID_ActorsRuntimeInternalID

ServiceFabricPartitionID is the string representation of the Service Fabric partition ID that the performance counter instance is associated with. The partition ID is a GUID, and its string representation is generated through the Guid.ToString method with format specifier "D".

ActorsRuntimeInternalID is the string representation of a 64-bit integer that is generated by the Fabric Actors runtime for its internal use. It is included in the performance counter instance name to ensure uniqueness and avoid conflicts with other performance counter instance names. Users should not try to interpret this portion of the performance counter instance name.

The following is an example of a counter instance name for a counter that belongs to the Service Fabric Actor category:

2740af29-78aa-44bc-a20b-7e60fb783264_635650083799324046

In the example above, 2740af29-78aa-44bc-a20b-7e60fb783264 is the string representation of the Service Fabric partition ID, and 635650083799324046 is the 64-bit ID that is generated for the runtime's internal use.
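When instance names are collected from many partitions, it can help to split them back into their two parts. The following Python sketch does that for the Service Fabric Actor category. The function and type names are hypothetical; the only facts taken from this article are the two-part format and the "D" GUID format (hyphenated 8-4-4-4-12 hex digits, which Python's uuid module also parses and prints). The runtime's internal ID is kept as an opaque string because it should not be interpreted.

```python
import uuid
from typing import NamedTuple


class ActorCounterInstance(NamedTuple):
    """Parts of a 'Service Fabric Actor' counter instance name (illustrative type)."""
    partition_id: uuid.UUID
    runtime_internal_id: str  # opaque; kept as text on purpose


def parse_actor_instance_name(name: str) -> ActorCounterInstance:
    """Split 'ServiceFabricPartitionID_ActorsRuntimeInternalID' at the last underscore."""
    partition_text, sep, internal_id = name.rpartition("_")
    if not sep or not internal_id:
        raise ValueError(f"unexpected instance name format: {name!r}")
    # uuid.UUID raises ValueError if the text is not a valid GUID string.
    return ActorCounterInstance(uuid.UUID(partition_text), internal_id)


if __name__ == "__main__":
    parsed = parse_actor_instance_name(
        "2740af29-78aa-44bc-a20b-7e60fb783264_635650083799324046"
    )
    print(parsed.partition_id)          # partition GUID in hyphenated form
    print(parsed.runtime_internal_id)   # opaque internal ID
```

Grouping counter samples by partition_id is then a dictionary lookup rather than string matching.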

Service Fabric Actor Method category

For the category Service Fabric Actor Method, the counter instance names are in the following format:

MethodName_ActorsRuntimeMethodId_ServiceFabricPartitionID_ActorsRuntimeInternalID

MethodName is the name of the actor method that the performance counter instance is associated with. The Fabric Actors runtime formats the method name to balance readability against the maximum length that Windows allows for performance counter instance names.

ActorsRuntimeMethodId is the string representation of a 32-bit integer that is generated by the Fabric Actors runtime for its internal use. It is included in the performance counter instance name to ensure uniqueness and avoid conflicts with other performance counter instance names. Users should not try to interpret this portion of the performance counter instance name.

ServiceFabricPartitionID is the string representation of the Service Fabric partition ID that the performance counter instance is associated with. The partition ID is a GUID, and its string representation is generated through the Guid.ToString method with format specifier "D".

ActorsRuntimeInternalID is the string representation of a 64-bit integer that is generated by the Fabric Actors runtime for its internal use. It is included in the performance counter instance name to ensure uniqueness and avoid conflicts with other performance counter instance names. Users should not try to interpret this portion of the performance counter instance name.

The following is an example of a counter instance name for a counter that belongs to the Service Fabric Actor Method category:

ivoicemailboxactor.leavemessageasync_2_89383d32-e57e-4a9b-a6ad-57c6792aa521_635650083804480486

In the example above, ivoicemailboxactor.leavemessageasync is the method name, 2 is the 32-bit ID generated for the runtime's internal use, 89383d32-e57e-4a9b-a6ad-57c6792aa521 is the string representation of the Service Fabric partition ID, and 635650083804480486 is the 64-bit ID generated for the runtime's internal use.
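The four-part method instance name can be split in a similar way. This Python sketch is an illustration only, with hypothetical function and type names. It splits from the right because the last three parts (two integers and a "D"-format GUID) cannot contain underscores, whereas the method name might; that the method name may contain underscores is an assumption made to keep the parser defensive, not something this article states.

```python
import uuid
from typing import NamedTuple


class MethodCounterInstance(NamedTuple):
    """Parts of a 'Service Fabric Actor Method' counter instance name (illustrative type)."""
    method_name: str
    runtime_method_id: str    # opaque 32-bit ID, kept as text
    partition_id: uuid.UUID
    runtime_internal_id: str  # opaque 64-bit ID, kept as text


def parse_method_instance_name(name: str) -> MethodCounterInstance:
    """Split 'MethodName_ActorsRuntimeMethodId_ServiceFabricPartitionID_ActorsRuntimeInternalID'."""
    # Split at most three times from the right so that any underscores
    # inside the method name stay in the first part.
    parts = name.rsplit("_", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected instance name format: {name!r}")
    method_name, method_id, partition_text, internal_id = parts
    return MethodCounterInstance(
        method_name, method_id, uuid.UUID(partition_text), internal_id
    )


if __name__ == "__main__":
    parsed = parse_method_instance_name(
        "ivoicemailboxactor.leavemessageasync_2_"
        "89383d32-e57e-4a9b-a6ad-57c6792aa521_635650083804480486"
    )
    print(parsed.method_name)
    print(parsed.partition_id)
```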

List of events and performance counters

Actor method events and performance counters

The Reliable Actors runtime emits the following events related to actor methods.

Event name | Event ID | Level | Keyword | Description
ActorMethodStart | 7 | Verbose | 0x2 | The Actors runtime is about to invoke an actor method.
ActorMethodStop | 8 | Verbose | 0x2 | An actor method has finished executing. That is, the runtime's asynchronous call to the actor method has returned, and the task returned by the actor method has finished.
ActorMethodThrewException | 9 | Warning | 0x3 | An exception was thrown during the execution of an actor method, either during the runtime's asynchronous call to the actor method or during the execution of the task returned by the actor method. This event indicates a failure in the actor code that needs investigation.

The Reliable Actors runtime publishes the following performance counters related to the execution of actor methods.

Category name | Counter name | Description
Service Fabric Actor Method | Invocations/Sec | Number of times that the actor service method is invoked per second
Service Fabric Actor Method | Average milliseconds per invocation | Time taken (in milliseconds) to execute the actor service method
Service Fabric Actor Method | Exceptions thrown/Sec | Number of times that the actor service method threw an exception per second

Concurrency events and performance counters

The Reliable Actors runtime emits the following events related to concurrency.

Event name | Event ID | Level | Keyword | Description
ActorMethodCallsWaitingForLock | 12 | Verbose | 0x8 | This event is written at the start of each new turn in an actor. It contains the number of pending actor calls that are waiting to acquire the per-actor lock that enforces turn-based concurrency.

The Reliable Actors runtime publishes the following performance counters related to concurrency.

Category name | Counter name | Description
Service Fabric Actor | # of actor calls waiting for actor lock | Number of pending actor calls waiting to acquire the per-actor lock that enforces turn-based concurrency
Service Fabric Actor | Average milliseconds per lock wait | Time taken (in milliseconds) to acquire the per-actor lock that enforces turn-based concurrency
Service Fabric Actor | Average milliseconds actor lock held | Time (in milliseconds) for which the per-actor lock is held

Actor state management events and performance counters

The Reliable Actors runtime emits the following events related to actor state management.

Event name | Event ID | Level | Keyword | Description
ActorSaveStateStart | 10 | Verbose | 0x4 | The Actors runtime is about to save the actor state.
ActorSaveStateStop | 11 | Verbose | 0x4 | The Actors runtime has finished saving the actor state.

The Reliable Actors runtime publishes the following performance counters related to actor state management.

Category name | Counter name | Description
Service Fabric Actor | Average milliseconds per save state operation | Time taken (in milliseconds) to save actor state
Service Fabric Actor | Average milliseconds per load state operation | Time taken (in milliseconds) to load actor state

The Reliable Actors runtime emits the following events related to actor replicas.

Event name | Event ID | Level | Keyword | Description
ReplicaChangeRoleToPrimary | 1 | Informational | 0x1 | The actor replica changed role to Primary. This implies that the actors for this partition will be created inside this replica.
ReplicaChangeRoleFromPrimary | 2 | Informational | 0x1 | The actor replica changed role to non-Primary. This implies that the actors for this partition will no longer be created inside this replica. No new requests will be delivered to actors already created within this replica. The actors will be destroyed after any in-progress requests are completed.

Actor activation and deactivation events and performance counters

The Reliable Actors runtime emits the following events related to actor activation and deactivation.

Event name | Event ID | Level | Keyword | Description
ActorActivated | 5 | Informational | 0x1 | An actor has been activated.
ActorDeactivated | 6 | Informational | 0x1 | An actor has been deactivated.
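The event tables in this article can be collected into a small lookup to predict which events a given collection setting would capture. The Python sketch below is an illustration under stated assumptions: the event names, IDs, levels, and keyword values are copied from the tables in this article, while the type and function names are hypothetical. The filtering rule (an event passes when its level is at or below the requested verbosity and it shares a keyword bit with the mask) and the numeric level values follow the usual ETW convention; they are assumptions about the collector, not behavior documented here.

```python
from enum import IntEnum
from typing import List, NamedTuple


class Level(IntEnum):
    """Conventional ETW level numbers (assumed; higher means more verbose)."""
    WARNING = 3
    INFORMATIONAL = 4
    VERBOSE = 5


class ActorEvent(NamedTuple):
    name: str
    event_id: int
    level: Level
    keywords: int


# Values transcribed from the event tables in this article.
EVENTS = [
    ActorEvent("ReplicaChangeRoleToPrimary", 1, Level.INFORMATIONAL, 0x1),
    ActorEvent("ReplicaChangeRoleFromPrimary", 2, Level.INFORMATIONAL, 0x1),
    ActorEvent("ActorActivated", 5, Level.INFORMATIONAL, 0x1),
    ActorEvent("ActorDeactivated", 6, Level.INFORMATIONAL, 0x1),
    ActorEvent("ActorMethodStart", 7, Level.VERBOSE, 0x2),
    ActorEvent("ActorMethodStop", 8, Level.VERBOSE, 0x2),
    ActorEvent("ActorMethodThrewException", 9, Level.WARNING, 0x3),
    ActorEvent("ActorSaveStateStart", 10, Level.VERBOSE, 0x4),
    ActorEvent("ActorSaveStateStop", 11, Level.VERBOSE, 0x4),
    ActorEvent("ActorMethodCallsWaitingForLock", 12, Level.VERBOSE, 0x8),
]


def enabled_events(keyword_mask: int, max_level: Level) -> List[str]:
    """Names of events that pass both the level and the keyword filter."""
    return [
        e.name
        for e in EVENTS
        if e.level <= max_level and (e.keywords & keyword_mask) != 0
    ]


if __name__ == "__main__":
    # All keywords, but nothing more verbose than Informational.
    for name in enabled_events(0xF, Level.INFORMATIONAL):
        print(name)
```

Under these assumptions, collecting at Informational level captures the replica, activation, and exception events while leaving out the high-volume Verbose method, state, and lock events.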

The Reliable Actors runtime publishes the following performance counters related to actor activation and deactivation.

Category name | Counter name | Description
Service Fabric Actor | Average OnActivateAsync milliseconds | Time taken (in milliseconds) to execute the OnActivateAsync method

Actor request processing performance counters

When a client invokes a method via an actor proxy object, a request message is sent over the network to the actor service. The service processes the request message and sends a response back to the client. The Reliable Actors runtime publishes the following performance counters related to actor request processing.

Category name | Counter name | Description
Service Fabric Actor | # of outstanding requests | Number of requests being processed in the service
Service Fabric Actor | Average milliseconds per request | Time taken (in milliseconds) by the service to process a request
Service Fabric Actor | Average milliseconds for request deserialization | Time taken (in milliseconds) to deserialize the actor request message when it is received at the service
Service Fabric Actor | Average milliseconds for response serialization | Time taken (in milliseconds) to serialize the actor response message at the service before the response is sent to the client

Next steps