Reliable Services overview

Azure Service Fabric simplifies writing and managing stateless and stateful services. This topic covers:

  • The Reliable Services programming model for stateless and stateful services.
  • The choices you have to make when writing a Reliable Service.
  • Some scenarios and examples of when to use Reliable Services and how they are written.

Reliable Services is one of the programming models available on Service Fabric. Another is the Reliable Actor programming model, which provides a Virtual Actor application framework on top of the Reliable Services model. For more information on Reliable Actors, see Introduction to Service Fabric Reliable Actors.

Service Fabric manages the lifetime of services, from provisioning and deployment through upgrade and deletion, via Service Fabric application management.

What are Reliable Services

Reliable Services gives you a simple, powerful, top-level programming model to help you express what is important to your application. With the Reliable Services programming model, you get:

  • Access to Service Fabric APIs. Unlike Service Fabric services modeled as Guest Executables, Reliable Services can use Service Fabric APIs directly. This allows services to:
    • Query the system
    • Report health about entities in the cluster
    • Receive notifications about configuration and code changes
    • Find and communicate with other services
    • Use the Reliable Collections
    • Access many other capabilities, all from a first-class programming model in several programming languages.
  • A simple model for running your own code that feels like other familiar programming models. Your code has a well-defined entry point and an easily managed lifecycle.
  • A pluggable communication model. Use the transport of your choice, such as HTTP with Web API, WebSockets, custom TCP protocols, or anything else. Reliable Services provides some great out-of-the-box options you can use, or you can provide your own.
  • For stateful services, the Reliable Services programming model allows you to consistently and reliably store your state right inside your service by using Reliable Collections. Reliable Collections are a simple set of highly available and reliable collection classes that will be familiar to anyone who has used C# collections. Traditionally, services needed external systems for reliable state management. With Reliable Collections, you can store your state next to your compute with the same high availability and reliability you've come to expect from highly available external stores. This model also improves latency because the compute is co-located with the state it needs to function.

What makes Reliable Services different

Reliable Services are different from services you may have written before, because Service Fabric provides:

  • Reliability - Your service stays up even in unreliable environments where your machines fail or hit network issues, or in cases where the services themselves encounter errors and crash or fail. For stateful services, your state is preserved even in the presence of network or other failures.
  • Availability - Your service is reachable and responsive. Service Fabric maintains your desired number of running copies.
  • Scalability - Services are decoupled from specific hardware, and they can grow or shrink as necessary through the addition or removal of hardware or other resources. Services are easily partitioned (especially in the stateful case) to ensure that the service can scale and handle partial failures. Services can be created and deleted dynamically via code, enabling more instances to be spun up as necessary, for example in response to customer requests. Finally, Service Fabric encourages services to be lightweight. Service Fabric allows thousands of services to be provisioned within a single process, rather than requiring or dedicating entire OS instances or processes to a single instance of a service.
  • Consistency - Any information stored in a Reliable Service can be guaranteed to be consistent. This is true even across multiple Reliable Collections within a service. Changes across collections within a service can be made in a transactionally atomic manner.

Service lifecycle

Whether your service is stateful or stateless, Reliable Services provide a simple lifecycle that lets you quickly plug in your code and get started. Getting a new service up and running requires you to implement two methods:

  • CreateServiceReplicaListeners/CreateServiceInstanceListeners - This method is where the service defines the communication stack(s) that it wants to use. The communication stack, such as Web API, is what defines the listening endpoint or endpoints for the service (how clients reach the service). It also defines how incoming messages interact with the rest of the service code.
  • RunAsync - This method is where your service runs its business logic, and where it would kick off any background tasks that should run for the lifetime of the service. The cancellation token that is provided is a signal for when that work should stop. For example, if the service needs to pull messages out of a Reliable Queue and process them, this is where that work happens.
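The shape of these two methods can be pictured with a small, self-contained Java sketch. It does not use the Service Fabric SDK: the `SketchService` class, its `createListeners` and `runAsync` methods, and the `AtomicBoolean` standing in for the cancellation token are all hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a service class; this is not the Service Fabric SDK.
class SketchService {
    final Queue<String> inbox = new ArrayDeque<>();   // work waiting to be processed
    final List<String> processed = new ArrayList<>(); // output of the background work

    // Analogue of CreateServiceInstanceListeners: declare the endpoints the service exposes.
    List<String> createListeners() {
        return List.of("http-endpoint");
    }

    // Analogue of RunAsync: keep working until cancellation is signalled.
    void runAsync(AtomicBoolean cancelled) {
        while (!cancelled.get()) {
            String message = inbox.poll();
            if (message == null) {
                return; // a real service would wait for more work rather than return
            }
            processed.add(message.toUpperCase());
        }
    }

    public static void main(String[] args) {
        SketchService service = new SketchService();
        service.inbox.add("hello");
        service.runAsync(new AtomicBoolean(false));
        System.out.println(service.processed); // prints [HELLO]
    }
}
```

The loop checks the cancellation flag before each unit of work, which mirrors the idea that the token tells the service when to stop.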

If you're learning about Reliable Services for the first time, read on! If you're looking for a detailed walkthrough of the lifecycle of Reliable Services, check out Reliable Services lifecycle overview.

Example services

Let's take a closer look at how the Reliable Services model works with both stateless and stateful services.

Stateless Reliable Services

A stateless service is one where there is no state maintained within the service across calls. Any state that is present is entirely disposable and doesn't require synchronization, replication, persistence, or high availability.

For example, consider a calculator that has no memory and receives all terms and operations to perform at once.

In this case, the RunAsync() (C#) or runAsync() (Java) of the service can be empty, since there is no background task-processing that the service needs to do. When the calculator service is created, it returns an ICommunicationListener (C#) or CommunicationListener (Java) (for example Web API) that opens up a listening endpoint on some port. This listening endpoint hooks up to the different calculation methods (example: "Add(n1, n2)") that define the calculator's public API.

When a call is made from a client, the appropriate method is invoked, and the calculator service performs the operations on the data provided and returns the result. It doesn't store any state.
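As a rough illustration of the calculator's request handling, the sketch below dispatches an operation name to a pure function. It leaves out the communication listener entirely, and the `StatelessCalculator` class, the `handle` method, and the operation names other than "Add" are assumptions made for this example.

```java
// Hypothetical stateless calculator; only the "Add" name comes from the text above.
class StatelessCalculator {
    // Each request carries its operation and all terms; no fields are read or written.
    static int handle(String operation, int n1, int n2) {
        switch (operation) {
            case "Add":
                return n1 + n2;
            case "Subtract":
                return n1 - n2;
            case "Multiply":
                return n1 * n2;
            default:
                throw new IllegalArgumentException("Unknown operation: " + operation);
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("Add", 2, 3)); // prints 5
    }
}
```

Because the class has no fields, any instance of the service can answer any request, and repeating a call always gives the same result.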

Not storing any internal state makes this example calculator simple. But most services aren't truly stateless. Instead, they externalize their state to some other store. (For example, any web app that relies on keeping session state in a backing store or cache is not stateless.)

A common example of how stateless services are used in Service Fabric is as a front-end that exposes the public-facing API for a web application. The front-end service then talks to stateful services to complete a user request. In this case, calls from clients are directed to a known port, such as 80, where the stateless service is listening. This stateless service receives the call and determines whether the call is from a trusted party and which service it's destined for. Then, the stateless service forwards the call to the correct partition of the stateful service and waits for a response. When the stateless service receives a response, it replies to the original client. An example of such a service is the Service Fabric Getting Started sample (C# / Java), among other Service Fabric samples in that repo.
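The front-end flow described here (check the caller, pick a partition, forward, reply) might be sketched as follows. This is an in-memory model only: the `FrontEndRouter` class is hypothetical, partitions are plain lists, and the hash-modulo mapping is an illustrative assumption rather than the way Service Fabric itself resolves partition keys.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical front end; each "partition" is an in-memory list of forwarded requests.
class FrontEndRouter {
    private final List<List<String>> partitions = new ArrayList<>();

    FrontEndRouter(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    // Map a key to a partition index; floorMod keeps the index non-negative.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Reject untrusted callers, otherwise forward to the owning partition and relay a reply.
    String handle(String caller, boolean trusted, String key) {
        if (!trusted) {
            return "rejected";
        }
        int index = partitionFor(key);
        partitions.get(index).add(caller + ":" + key);
        return "handled by partition " + index;
    }

    public static void main(String[] args) {
        FrontEndRouter router = new FrontEndRouter(4);
        System.out.println(router.handle("client-a", true, "user-42"));
    }
}
```

The property that matters is that the same key always lands on the same partition, so the stateful service behind the front end sees all requests for a given key.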

Stateful Reliable Services

A stateful service is one that must have some portion of state kept consistent and present in order for the service to function. Consider a service that constantly computes a rolling average of some value based on updates it receives. To do this, it must have the current set of incoming requests it needs to process and the current average. Any service that retrieves, processes, and stores information in an external store (such as an Azure blob or table store today) is stateful. It just keeps its state in the external state store.
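To make the rolling-average example concrete, here is a minimal sketch that keeps a fixed-size window of recent values. The `RollingAverage` class and the window size are assumptions for illustration; the state lives in ordinary in-memory fields, so unlike a Reliable Service it would be lost if the process stopped.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical rolling average over the most recent values; state is in-memory only.
class RollingAverage {
    private final int windowSize;
    private final Deque<Double> window = new ArrayDeque<>(); // values currently in the window
    private double sum;                                      // running sum of the window

    RollingAverage(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    // Add a value, evict the oldest one if the window is full, and return the new average.
    double update(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            sum -= window.removeFirst();
        }
        return sum / window.size();
    }

    public static void main(String[] args) {
        RollingAverage average = new RollingAverage(3);
        average.update(1);
        average.update(2);
        average.update(3);
        System.out.println(average.update(4)); // prints 3.0
    }
}
```

Every answer depends on `window` and `sum`, which is why this kind of service needs its state to be present and consistent to function.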

Most services today store their state externally, since the external store is what provides reliability, availability, scalability, and consistency for that state. In Service Fabric, services aren't required to store their state externally. Service Fabric takes care of these requirements for both the service code and the service state.

Let's say we want to write a service that processes images. To do this, the service takes in an image and the series of conversions to perform on that image. This service returns a communication listener (let's suppose it's a Web API) that exposes an API like ConvertImage(Image i, IList<Conversion> conversions). When it receives a request, the service stores it in an IReliableQueue, and returns some ID to the client so it can track the request.

In this service, RunAsync() could be more complex. The service has a loop inside its RunAsync() that pulls requests out of the IReliableQueue and performs the conversions requested. The results get stored in an IReliableDictionary so that when the client comes back they can get their converted images. To ensure that the image isn't lost even if something fails, this Reliable Service would pull out of the queue, perform the conversions, and store the result all in a single transaction. In this case, the message is removed from the queue and the results are stored in the result dictionary only when the conversions are complete. Alternatively, the service could pull the image out of the queue and immediately store it in a remote store. This reduces the amount of state the service has to manage, but increases complexity since the service has to keep the necessary metadata to manage the remote store. With either approach, if something fails midway, the request remains in the queue waiting to be processed.
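The all-or-nothing behavior of the first approach can be modeled with a simplified, single-threaded sketch. It uses ordinary Java collections in place of IReliableQueue and IReliableDictionary, and the `ImageWorker` class, the `processNext` method, and the string-based "images" are hypothetical. Real Reliable Collections use replicated transactions, which this model does not attempt to reproduce.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical in-memory model; this is not the Reliable Collections API.
class ImageWorker {
    final Deque<String[]> requests = new ArrayDeque<>(); // each entry is {id, image}
    final Map<String, String> results = new HashMap<>(); // id -> converted image

    // Process the head request; the queue and the dictionary change together or not at all.
    boolean processNext(UnaryOperator<String> conversion) {
        String[] request = requests.peekFirst(); // look at the head without removing it
        if (request == null) {
            return false;
        }
        String converted;
        try {
            converted = conversion.apply(request[1]);
        } catch (RuntimeException failure) {
            return false; // "abort": nothing changed, so the request stays queued
        }
        // "commit": both changes are applied only after the conversion succeeded
        requests.removeFirst();
        results.put(request[0], converted);
        return true;
    }

    public static void main(String[] args) {
        ImageWorker worker = new ImageWorker();
        worker.requests.add(new String[] {"id-1", "cat.png"});
        worker.processNext(image -> image + ":gray");
        System.out.println(worker.results); // prints {id-1=cat.png:gray}
    }
}
```

If the conversion throws, the request is still at the head of the queue, so a later attempt can retry it without losing the image.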

Although this service sounds like a typical .NET service, the difference is that the data structures being used (IReliableQueue and IReliableDictionary) are provided by Service Fabric, and are highly reliable, available, and consistent.

When to use Reliable Services APIs

Consider Reliable Services APIs if:

  • You want your service's code (and optionally state) to be highly available and reliable.
  • You need transactional guarantees across multiple units of state (for example, orders and order line items).
  • Your application's state can be naturally modeled as Reliable Dictionaries and Queues.
  • Your application's code or state needs to be highly available with low-latency reads and writes.
  • Your application needs to control the concurrency or granularity of transacted operations across one or more Reliable Collections.
  • You want to manage the communications or control the partitioning scheme for your service.
  • Your code needs a free-threaded runtime environment.
  • Your application needs to dynamically create or destroy Reliable Dictionaries or Queues or whole Services at runtime.
  • You need to programmatically control Service Fabric-provided backup and restore features for your service's state.
  • Your application needs to maintain change history for its units of state.
  • You want to develop or consume third-party-developed, custom state providers.

Next steps