Periodic backup and restore in a standalone Service Fabric

Service Fabric is a distributed systems platform that makes it easy to develop and manage reliable, distributed, microservices-based cloud applications. It supports both stateless and stateful microservices. Stateful services can maintain mutable, authoritative state beyond the request and response or a complete transaction. If a stateful service goes down for a long time or loses information due to a disaster, it may need to be restored from a recent backup of its state in order to continue providing service after it comes back up.

Service Fabric replicates the state across multiple nodes to ensure that the service is highly available. Even if one node in the cluster fails, the service continues to be available. In certain cases, however, it is still desirable for the service data to be reliable against broader failures.

For example, a service may want to back up its data to protect against the following scenarios:

  • Permanent loss of an entire Service Fabric cluster.
  • Permanent loss of a majority of the replicas of a service partition.
  • Administrative errors whereby the state accidentally gets deleted or corrupted. For example, an administrator with sufficient privilege erroneously deletes the service.
  • Bugs in the service that cause data corruption. For example, this may happen when a service code upgrade starts writing faulty data to a Reliable Collection. In such a case, both the code and the data may have to be reverted to an earlier state.
  • Offline data processing. It might be convenient to have offline processing of data for business intelligence that happens separately from the service that generates the data.

Service Fabric provides a built-in API for point-in-time backup and restore. Application developers may use these APIs to back up the state of the service periodically. Additionally, if service administrators want to trigger a backup from outside the service at a specific time, such as before upgrading the application, developers need to expose backup (and restore) as an API from the service. Maintaining the backups is an additional cost on top of this. For example, you may want to take five incremental backups every half hour, followed by a full backup. After the full backup, you can delete the prior incremental backups. This approach requires additional code, which adds cost during application development.

Backing up application data periodically is a basic need for managing a distributed application and guarding against loss of data or prolonged loss of service availability. Service Fabric provides an optional backup and restore service, which allows you to configure periodic backup of stateful Reliable Services (including Actor Services) without having to write any additional code. It also facilitates restoring previously taken backups.

Service Fabric provides a set of APIs for the following functionality related to the periodic backup and restore feature:

  • Schedule periodic backup of Reliable Stateful services and Reliable Actors, with support for uploading backups to (external) storage locations. Supported storage locations:
    • Azure Storage
    • File share (on-premises)
  • Enumerate backups
  • Trigger an ad hoc backup of a partition
  • Restore a partition using a previous backup
  • Temporarily suspend backups
  • Retention management of backups (upcoming)

Prerequisites

  • Service Fabric cluster with Fabric version 6.4 or above. Refer to this article for the steps to download the required package.
  • X.509 certificate for encrypting the secrets needed to connect to the storage that holds the backups. Refer to the article on how to acquire or create a self-signed X.509 certificate.
  • Service Fabric Reliable Stateful application built using Service Fabric SDK version 3.0 or above. Applications targeting .NET Core 2.0 should be built using Service Fabric SDK version 3.1 or above.
  • Install the Microsoft.ServiceFabric.Powershell.Http module [in preview] for making configuration calls.
Install-Module -Name Microsoft.ServiceFabric.Powershell.Http -AllowPrerelease
  • Make sure that the cluster is connected using the Connect-SFCluster command before making any configuration request with the Microsoft.ServiceFabric.Powershell.Http module.

Connect-SFCluster -ConnectionEndpoint 'https://mysfcluster.chinaeast.cloudapp.chinacloudapi.cn:19080'   -X509Credential -FindType FindByThumbprint -FindValue '1b7ebe2174649c45474a4819dafae956712c31d3' -StoreLocation 'CurrentUser' -StoreName 'My' -ServerCertThumbprint '1b7ebe2174649c45474a4819dafae956712c31d3'  

Enabling backup and restore service

First, you need to enable the backup and restore service in your cluster. Get the template for the cluster that you want to deploy. You can use the sample templates. Enable the backup and restore service with the following steps:

  1. Check that apiVersion is set to 10-2017 in the cluster configuration file, and if not, update it as shown in the following snippet:

    {
        "apiVersion": "10-2017",
        "name": "SampleCluster",
        "clusterConfigurationVersion": "1.0.0",
        ...
    }
    
  2. Now enable the backup and restore service by adding the following addonFeatures section under the properties section, as shown in the following snippet:

        "properties": {
            ...
            "addonFeatures": ["BackupRestoreService"],
            "fabricSettings": [ ... ]
            ...
        }
    
    
  3. Configure an X.509 certificate for encrypting credentials. This ensures that any credentials provided for connecting to storage are encrypted before they are persisted. Configure the encryption certificate by adding the following BackupRestoreService section under the fabricSettings section, as shown in the following snippet:

    "properties": {
        ...
        "addonFeatures": ["BackupRestoreService"],
        "fabricSettings": [{
            "name": "BackupRestoreService",
            "parameters":  [{
                "name": "SecretEncryptionCertThumbprint",
                "value": "[Thumbprint]"
            }]
        },
        ...
        ]
    }
    
  4. Once you have updated your cluster configuration file with the preceding changes, apply them and let the deployment/upgrade complete. Once complete, the backup and restore service starts running in your cluster. The URI of this service is fabric:/System/BackupRestoreService, and the service can be found under the system services section in Service Fabric Explorer.
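
Before applying the configuration, it can help to confirm that the file actually contains the three settings from steps 1 to 3. The following is a minimal sketch, not an official tool: Test-BackupRestoreConfig is a hypothetical helper name, the file path in the usage comment is an assumption, and the check only looks at the property names shown in the snippets above.

```powershell
# Hypothetical helper (not part of any Service Fabric module): reports whether
# the cluster configuration JSON contains the settings from steps 1-3.
function Test-BackupRestoreConfig {
    param([Parameter(Mandatory)][string]$Json)

    $config = ConvertFrom-Json $Json
    $props  = $config.properties

    # Locate the BackupRestoreService section and its thumbprint parameter.
    $section = @($props.fabricSettings | Where-Object { $_.name -eq 'BackupRestoreService' })
    $thumb   = @($section.parameters | Where-Object { $_.name -eq 'SecretEncryptionCertThumbprint' })

    [pscustomobject]@{
        ApiVersionOk  = ($config.apiVersion -eq '10-2017')
        AddonEnabled  = ($props.addonFeatures -contains 'BackupRestoreService')
        ThumbprintSet = (($thumb.Count -gt 0) -and (-not [string]::IsNullOrWhiteSpace($thumb[0].value)))
    }
}

# Usage (the path is an assumption; point it at your own configuration file):
# Test-BackupRestoreConfig -Json (Get-Content -Raw -Path '.\ClusterConfig.json')
```

Each property of the returned object is a Boolean, so a failed check shows which of the three steps still needs attention. The helper does not validate that the thumbprint refers to an installed certificate.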

Enabling periodic backup for Reliable Stateful service and Reliable Actors

Let's walk through the steps to enable periodic backup for a Reliable Stateful service and Reliable Actors. These steps assume that:

  • The cluster is set up with the backup and restore service.
  • A Reliable Stateful service is deployed on the cluster. For the purpose of this quickstart guide, the application URI is fabric:/SampleApp and the URI of the Reliable Stateful service belonging to this application is fabric:/SampleApp/MyStatefulService. This service is deployed with a single partition, and the partition ID is 23aebc1e-e9ea-4e16-9d5c-e91a614fefa7.

Create backup policy

The first step is to create a backup policy that describes the backup schedule, the target storage for backup data, the policy name, the maximum number of incremental backups allowed before a full backup is triggered, and the retention policy for backup storage.

For backup storage, create a file share and give all Service Fabric node machines read/write access to it. This example assumes that a share named BackupStore is present on StorageServer.

PowerShell using Microsoft.ServiceFabric.Powershell.Http module


New-SFBackupPolicy -Name 'BackupPolicy1' -AutoRestoreOnDataLoss $true -MaxIncrementalBackups 20 -FrequencyBased -Interval 00:15:00 -FileShare -Path '\\StorageServer\BackupStore' -Basic -RetentionDuration '10.00:00:00'

REST call using PowerShell

Execute the following PowerShell script to invoke the required REST API to create a new policy.

$ScheduleInfo = @{
    Interval = 'PT15M'
    ScheduleKind = 'FrequencyBased'
}   

$StorageInfo = @{
    Path = '\\StorageServer\BackupStore'
    StorageKind = 'FileShare'
}

$RetentionPolicy = @{ 
    RetentionPolicyType = 'Basic'
    RetentionDuration =  'P10D'
}

$BackupPolicy = @{
    Name = 'BackupPolicy1'
    MaxIncrementalBackups = 20
    Schedule = $ScheduleInfo
    Storage = $StorageInfo
    RetentionPolicy = $RetentionPolicy
}

$body = (ConvertTo-Json $BackupPolicy)
$url = "http://localhost:19080/BackupRestore/BackupPolicies/$/Create?api-version=6.4"

Invoke-WebRequest -Uri $url -Method Post -Body $body -ContentType 'application/json'
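
The cmdlet example writes the interval and retention duration as .NET time spans ('00:15:00', '10.00:00:00'), while the REST payload writes the same values as ISO 8601 durations ('PT15M', 'P10D'). As a small sketch of how one form can be derived from the other, the helper below relies on the .NET method System.Xml.XmlConvert.ToString(TimeSpan), which produces the XSD duration format. ConvertTo-Iso8601Duration is a hypothetical name introduced here for illustration.

```powershell
# Hypothetical helper: converts a TimeSpan (or a string PowerShell can cast to
# one, such as '00:15:00') to the duration text used in the REST payload.
function ConvertTo-Iso8601Duration {
    param([Parameter(Mandatory)][TimeSpan]$Duration)

    # XmlConvert emits the XSD duration form, for example PT15M for 15 minutes.
    [System.Xml.XmlConvert]::ToString($Duration)
}

ConvertTo-Iso8601Duration -Duration '00:15:00'      # schedule interval
ConvertTo-Iso8601Duration -Duration '10.00:00:00'   # retention duration
```

The results can be assigned to the Interval and RetentionDuration entries of the hashtables above instead of hand-written literals.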

Using Service Fabric Explorer

  1. In Service Fabric Explorer, navigate to the Backups tab and select Actions > Create Backup Policy.

    (Screenshot: Create Backup Policy)

  2. Fill out the information. For standalone clusters, FileShare should be selected.

    (Screenshot: Create Backup Policy with FileShare selected)

Enable periodic backup

After you define a policy that fulfills the data protection requirements of the application, associate the backup policy with the application. Depending on the requirement, the backup policy can be associated with an application, a service, or a partition.

PowerShell using Microsoft.ServiceFabric.Powershell.Http module

Enable-SFApplicationBackup -ApplicationId 'SampleApp' -BackupPolicyName 'BackupPolicy1'

REST call using PowerShell

Execute the following PowerShell script to invoke the required REST API to associate the backup policy named BackupPolicy1, created in the step above, with the application SampleApp.

$BackupPolicyReference = @{
    BackupPolicyName = 'BackupPolicy1'
}

$body = (ConvertTo-Json $BackupPolicyReference)
$url = "http://localhost:19080/Applications/SampleApp/$/EnableBackup?api-version=6.4"

Invoke-WebRequest -Uri $url -Method Post -Body $body -ContentType 'application/json'
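
The script above enables backup at the application level. Since a policy can also be associated with a service or a partition, the sketch below builds the corresponding URL for each scope. It assumes the Service Fabric REST convention of /Services/{serviceId}/$/EnableBackup and /Partitions/{partitionId}/$/EnableBackup, and that a service ID is the service name without the fabric:/ scheme and with "~" as the delimiter. Get-EnableBackupUrl is a hypothetical helper, so verify the paths against the REST API reference for your cluster version.

```powershell
# Hypothetical helper: builds the EnableBackup URL for the chosen scope.
function Get-EnableBackupUrl {
    param(
        [Parameter(Mandatory)][ValidateSet('Application', 'Service', 'Partition')][string]$Scope,
        [Parameter(Mandatory)][string]$Id,
        [string]$Endpoint = 'http://localhost:19080'
    )

    $collection = switch ($Scope) {
        'Application' { 'Applications' }
        'Service'     { 'Services' }
        'Partition'   { 'Partitions' }
    }

    # The single-quoted format string keeps the '$' in '/$/' literal.
    '{0}/{1}/{2}/$/EnableBackup?api-version=6.4' -f $Endpoint, $collection, $Id
}

Get-EnableBackupUrl -Scope Application -Id 'SampleApp'
Get-EnableBackupUrl -Scope Service -Id 'SampleApp~MyStatefulService'
Get-EnableBackupUrl -Scope Partition -Id '23aebc1e-e9ea-4e16-9d5c-e91a614fefa7'
```

Any of these URLs can be passed to Invoke-WebRequest with the same $body as in the application-level call.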

Using Service Fabric Explorer

  1. Select an application and go to Actions. Click Enable/Update Application Backup.

    (Screenshot: Enable Application Backup)

  2. Finally, select the desired policy and click Enable Backup.

    (Screenshot: Select Policy)

Verify that periodic backups are working

After backup is enabled for the application, all partitions belonging to Reliable Stateful services and Reliable Actors under the application start getting backed up periodically according to the associated backup policy.

(Screenshot: partition backup health events)

List backups

Backups associated with all partitions belonging to Reliable Stateful services and Reliable Actors of the application can be enumerated using the GetBackups API. Depending on the requirement, backups can be enumerated for an application, a service, or a partition.

PowerShell using Microsoft.ServiceFabric.Powershell.Http module

Get-SFApplicationBackupList -ApplicationId 'SampleApp'

REST call using PowerShell

Execute the following PowerShell script to invoke the HTTP API to enumerate the backups created for all partitions inside the SampleApp application.

$url = "http://localhost:19080/Applications/SampleApp/$/GetBackups?api-version=6.4"

$response = Invoke-WebRequest -Uri $url -Method Get

$BackupPoints = (ConvertFrom-Json $response.Content)
$BackupPoints.Items

Sample output for the above run:

BackupId                : d7e4038e-2c46-47c6-9549-10698766e714
BackupChainId           : d7e4038e-2c46-47c6-9549-10698766e714
ApplicationName         : fabric:/SampleApp
ServiceName             : fabric:/SampleApp/MyStatefulService
PartitionInformation    : @{LowKey=-9223372036854775808; HighKey=9223372036854775807; ServicePartitionKind=Int64Range; Id=23aebc1e-e9ea-4e16-9d5c-e91a614fefa7}
BackupLocation          : SampleApp\MyStatefulService\23aebc1e-e9ea-4e16-9d5c-e91a614fefa7\2018-04-01 19.39.40.zip
BackupType              : Full
EpochOfLastBackupRecord : @{DataLossNumber=131670844862460432; ConfigurationNumber=8589934592}
LsnOfLastBackupRecord   : 2058
CreationTimeUtc         : 2018-04-01T19:39:40Z
FailureError            : 

BackupId                : 8c21398a-2141-4133-b4d7-e1a35f0d7aac
BackupChainId           : d7e4038e-2c46-47c6-9549-10698766e714
ApplicationName         : fabric:/SampleApp
ServiceName             : fabric:/SampleApp/MyStatefulService
PartitionInformation    : @{LowKey=-9223372036854775808; HighKey=9223372036854775807; ServicePartitionKind=Int64Range; Id=23aebc1e-e9ea-4e16-9d5c-e91a614fefa7}
BackupLocation          : SampleApp\MyStatefulService\23aebc1e-e9ea-4e16-9d5c-e91a614fefa7\2018-04-01 19.54.38.zip
BackupType              : Incremental
EpochOfLastBackupRecord : @{DataLossNumber=131670844862460432; ConfigurationNumber=8589934592}
LsnOfLastBackupRecord   : 2237
CreationTimeUtc         : 2018-04-01T19:54:38Z
FailureError            : 

BackupId                : fc75bd4c-798c-4c9a-beee-e725321f73b2
BackupChainId           : d7e4038e-2c46-47c6-9549-10698766e714
ApplicationName         : fabric:/SampleApp
ServiceName             : fabric:/SampleApp/MyStatefulService
PartitionInformation    : @{LowKey=-9223372036854775808; HighKey=9223372036854775807; ServicePartitionKind=Int64Range; Id=23aebc1e-e9ea-4e16-9d5c-e91a614fefa7}
BackupLocation          : SampleApp\MyStatefulService\23aebc1e-e9ea-4e16-9d5c-e91a614fefa7\2018-04-01 20.09.44.zip
BackupType              : Incremental
EpochOfLastBackupRecord : @{DataLossNumber=131670844862460432; ConfigurationNumber=8589934592}
LsnOfLastBackupRecord   : 2437
CreationTimeUtc         : 2018-04-01T20:09:44Z
FailureError            : 
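
In the sample output, all three backups share one BackupChainId: the full backup starts the chain and the incremental backups extend it. As an illustrative sketch (Get-LatestBackupPerChain is a hypothetical helper, and ordering by CreationTimeUtc is a choice made here, not something the API prescribes), the items can be grouped by chain to find the most recent backup in each:

```powershell
# Hypothetical helper: summarizes backup items (for example $BackupPoints.Items)
# by chain and reports the most recently created backup in each chain.
function Get-LatestBackupPerChain {
    param([Parameter(Mandatory)][object[]]$Backups)

    $Backups |
        Group-Object -Property BackupChainId |
        ForEach-Object {
            # The cast handles CreationTimeUtc whether it arrives as a string or a DateTime.
            $latest = $_.Group |
                Sort-Object { [datetime]$_.CreationTimeUtc } |
                Select-Object -Last 1

            [pscustomobject]@{
                BackupChainId    = $_.Name
                BackupCount      = $_.Count
                LatestBackupId   = $latest.BackupId
                LatestBackupType = $latest.BackupType
                LatestLocation   = $latest.BackupLocation
            }
        }
}

# Usage, assuming $BackupPoints was populated by the script above:
# Get-LatestBackupPerChain -Backups $BackupPoints.Items
```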

Using Service Fabric Explorer

To view backups in Service Fabric Explorer, navigate to a partition and select the Backups tab.

(Screenshot: Enumerate Backups)

Limitations/caveats

  • Service Fabric PowerShell cmdlets are in preview mode.
  • Service Fabric clusters on Linux are not supported.

Next steps