使用分片映射管理器扩大数据库Scale out databases with the shard map manager

适用于:是Azure SQL 数据库 APPLIES TO: yesAzure SQL Database

要轻松横向扩展 Azure SQL 数据库上的数据库,请使用分片映射管理器。To easily scale out databases on Azure SQL Database, use a shard map manager. 分片映射管理器是一个特殊的数据库,它维护一个分片集中有关所有分片(数据库)的全局映射信息。The shard map manager is a special database that maintains global mapping information about all shards (databases) in a shard set. 元数据允许应用程序基于 分片键值连接到正确的数据库。The metadata allows an application to connect to the correct database based upon the value of the sharding key. 此外,在集中的每个分片都包含跟踪本地分片数据的映射(称为 shardlet)。In addition, every shard in the set contains maps that track the local shard data (known as shardlets).

分片映射管理

了解如何构建这些映射对于分片映射管理至关重要。Understanding how these maps are constructed is essential to shard map management. 使用弹性数据库客户端库中的 ShardMapManager 类(Java.NET)来完成此操作。This is done using the ShardMapManager class (Java, .NET, found in the Elastic Database client library to manage shard maps.

分片映射Shard maps and shard mappings

对于每个分片,必须选择要创建的分片映射类型。For each shard, you must select the type of shard map to create. 选择取决于数据库架构:The choice depends on the database architecture:

  1. 每个数据库一个租户Single tenant per database
  2. 每个数据库多个租户(两种类型):Multiple tenants per database (two types):
    1. 列表映射List mapping
    2. 范围映射Range mapping

对于单租户模型,创建列表映射分片映射。For a single-tenant model, create a list-mapping shard map. 单租户模型将每个租户分配给一个数据库。The single-tenant model assigns one database per tenant. 这是适用于 SaaS 开发人员的有效模型,因为它可以简化管理。This is an effective model for SaaS developers as it simplifies management.

列表映射

多租户模型将数个租户分配给单个数据库(可跨多个数据库分布租户组)。The multi-tenant model assigns several tenants to an individual database (and you can distribute groups of tenants across multiple databases). 如果希望每个租户具有较小的数据需求,请使用此模型。Use this model when you expect each tenant to have small data needs. 在此模型中,使用范围映射将一系列用户分配到数据库。In this model, assign a range of tenants to a database using range mapping.

范围映射

或可以使用列表映射来实现多租户数据库模型,以将多个租户分配给单个数据库。Or you can implement a multi-tenant database model using a list mapping to assign multiple tenants to an individual database. 例如,DB1 用于存储租户 ID 1 和 5 的相关信息,而 DB2 用于存储租户 7 和租户 10 的数据。For example, DB1 is used to store information about tenant ID 1 and 5, and DB2 stores data for tenant 7 and tenant 10.

单一数据库上的多个租户

支持的分片键的类型Supported types for sharding keys

灵活扩展支持将以下类型用作分片键:Elastic Scale support the following types as sharding keys:

.NET.NET JavaJava
integerinteger integerinteger
longlong longlong
GUIDguid uuiduuid
byte[]byte[] byte[]byte[]
datetimedatetime timestamptimestamp
timespantimespan durationduration
datetimeoffsetdatetimeoffset offsetdatetimeoffsetdatetime

列表和范围分片映射List and range shard maps

使用各个分片键值的列表或分片键值的范围可构造分片映射 。Shard maps can be constructed using lists of individual sharding key values, or they can be constructed using ranges of sharding key values.

列表分片映射List shard maps

分片包含 shardlet,shardlet 到分片的映射由分片映射维护 。Shards contain shardlets and the mapping of shardlets to shards is maintained by a shard map. 列表分片映射 是可标识 shardlet 的单独键值和可用作分片的数据库之间的关联项。A list shard map is an association between the individual key values that identify the shardlets and the databases that serve as shards. 是可以映射到同一个数据库的显式且不同的键值。List mappings are explicit and different key values can be mapped to the same database. 例如,键值 1 映射到数据库 A,键值 3 和 6 都映射到数据库 B。For example, key value 1 maps to Database A, and key values 3 and 6 both maps to Database B.

Key 分片位置Shard Location
11 Database_ADatabase_A
33 Database_BDatabase_B
44 Database_CDatabase_C
66 Database_BDatabase_B
...... ......

范围分片映射Range shard maps

在范围分片映射中,键范围由 [Low Value, High Value) 对描述,其中 Low Value 是范围中的最小键,而 High Value 是第一个大于范围的值 。In a range shard map, the key range is described by a pair [Low Value, High Value) where the Low Value is the minimum key in the range, and the High Value is the first value higher than the range.

例如,[0, 100) 包括所有大于或等于 0 且小于 100 的整数。For example, [0, 100) includes all integers greater than or equal 0 and less than 100. 请注意,多个范围可指向同一数据库,并且支持多个不连续的范围(例如,在下面的示例中,[100,200) 和 [400,600) 可同时指向数据库 C。)Note that multiple ranges can point to the same database, and disjoint ranges are supported (for example, [100,200) and [400,600) both point to Database C in the following example.)

Key 分片位置Shard Location
[1,50)[1,50) Database_ADatabase_A
[50,100)[50,100) Database_BDatabase_B
[100,200)[100,200) Database_CDatabase_C
[400,600)[400,600) Database_CDatabase_C
...... ......

上面所示的每个表都是 ShardMap 对象的概念性示例。Each of the tables shown above is a conceptual example of a ShardMap object. 每一行都是单个 PointMapping(适用于列表分片映射)对象或 RangeMapping(适用于范围分片映射)对象的简化示例 。Each row is a simplified example of an individual PointMapping (for the list shard map) or RangeMapping (for the range shard map) object.

分片映射管理器Shard map manager

在客户端库中,分片映射管理器是分片映射的集合。In the client library, the shard map manager is a collection of shard maps. ShardMapManager 实例管理的数据保存在以下三个位置中:The data managed by a ShardMapManager instance is kept in three places:

  1. 全局分片映射 (GSM) :指定一个数据库以用作它的所有分片映射和映射的存储库。Global Shard Map (GSM): You specify a database to serve as the repository for all of its shard maps and mappings. 自动创建特殊的表和存储过程以管理信息。Special tables and stored procedures are automatically created to manage the information. 这通常是小型数据库且可轻松进行访问,但不应用于满足应用程序的其他需求。This is typically a small database and lightly accessed, and it should not be used for other needs of the application. 这些表位于名为 __ShardManagement 的特殊架构中。The tables are in a special schema named __ShardManagement.
  2. 局部分片映射 (LSM) :修改指定为分片的每个数据库,以包含多个小表和特殊存储过程,其中包括特定于该分片的分片映射信息并对其进行管理。Local Shard Map (LSM): Every database that you specify to be a shard is modified to contain several small tables and special stored procedures that contain and manage shard map information specific to that shard. 对于 GSM 中的信息而言,该信息是冗余的,但应用程序通过该信息可验证缓存的分片映射信息,而无需将所有负载置于 GSM 上;应用程序可使用 LSM 确定缓存的映射是否仍然有效。This information is redundant with the information in the GSM, and it allows the application to validate cached shard map information without placing any load on the GSM; the application uses the LSM to determine if a cached mapping is still valid. 与每个分片上的 LSM 对应的表也位于架构 __ShardManagement 中。The tables corresponding to the LSM on each shard are also in the schema __ShardManagement.
  3. 应用程序缓存:每个用于访问 ShardMapManager 对象的应用程序实例都可维护其映射的本地内存中缓存。Application cache: Each application instance accessing a ShardMapManager object maintains a local in-memory cache of its mappings. 它存储最近检索到的路由信息。It stores routing information that has recently been retrieved.

构造 ShardMapManagerConstructing a ShardMapManager

ShardMapManager 对象是使用工厂(Java.NET)模式构造的。A ShardMapManager object is constructed using a factory (Java, .NET) pattern. 通过 ShardMapManagerFactory.GetSqlShardMapManager(Java.NET)方法可获取具有 ConnectionString 形式的凭据(包括用于保存 GSM 的服务器名称和数据库名称),并返回 ShardMapManager 的实例。The ShardMapManagerFactory.GetSqlShardMapManager (Java, .NET) method takes credentials (including the server name and database name holding the GSM) in the form of a ConnectionString and returns an instance of a ShardMapManager.

请注意: 在应用程序的初始化代码内,每个应用域只应实例化 ShardMapManager 一次。Please Note: The ShardMapManager should be instantiated only once per app domain, within the initialization code for an application. 在同一个应用域中创建 ShardMapManager 的其他实例将导致应用程序的内存增加且 CPU 使用率增加。Creation of additional instances of ShardMapManager in the same app domain results in increased memory and CPU utilization of the application. ShardMapManager 可包含任意数量的分片映射。A ShardMapManager can contain any number of shard maps. 尽管对于许多应用程序而言,单个分片映射可能是足够的,但有时针对不同的架构或出于特定目的,需使用不同的数据库集,在这些情况下多个分片映射可能更合适。While a single shard map may be sufficient for many applications, there are times when different sets of databases are used for different schema or for unique purposes; in those cases multiple shard maps may be preferable.

在此代码中,应用程序尝试使用 TryGetSqlShardMapManager(Java.NET)方法打开现有的 ShardMapManager。In this code, an application tries to open an existing ShardMapManager with the TryGetSqlShardMapManager (Java, .NET method. 如果数据库中尚不存在表示全局 ShardMapManager (GSM) 的对象,客户端库会使用 CreateSqlShardMapManager(Java.NET)方法创建这些对象。If objects representing a Global ShardMapManager (GSM) do not yet exist inside the database, the client library creates them using the CreateSqlShardMapManager (Java, .NET) method.

// Try to get a reference to the Shard Map Manager in the shardMapManager database.
// If it doesn't already exist, then create it.
ShardMapManager shardMapManager = null;
boolean shardMapManagerExists = ShardMapManagerFactory.tryGetSqlShardMapManager(shardMapManagerConnectionString,ShardMapManagerLoadPolicy.Lazy, refShardMapManager);
shardMapManager = refShardMapManager.argValue;

if (shardMapManagerExists) {
    ConsoleUtils.writeInfo("Shard Map %s already exists", shardMapManager);
}
else {
    // The Shard Map Manager does not exist, so create it
    shardMapManager = ShardMapManagerFactory.createSqlShardMapManager(shardMapManagerConnectionString);
    ConsoleUtils.writeInfo("Created Shard Map %s", shardMapManager);
}
// Try to get a reference to the Shard Map Manager via the Shard Map Manager database.  
// If it doesn't already exist, then create it.
ShardMapManager shardMapManager;
bool shardMapManagerExists = ShardMapManagerFactory.TryGetSqlShardMapManager(
                                        connectionString,
                                        ShardMapManagerLoadPolicy.Lazy,
                                        out shardMapManager);

if (shardMapManagerExists)
{
    Console.WriteLine("Shard Map Manager already exists");
}
else
{
    // Create the Shard Map Manager.
    ShardMapManagerFactory.CreateSqlShardMapManager(connectionString);
    Console.WriteLine("Created SqlShardMapManager");

    shardMapManager = ShardMapManagerFactory.GetSqlShardMapManager(
            connectionString,
            ShardMapManagerLoadPolicy.Lazy);

// The connectionString contains server name, database name, and admin credentials for privileges on both the GSM and the shards themselves.
}

对于 .NET 版本,可以使用 PowerShell 来创建新的分片映射管理器。For the .NET version, you can use PowerShell to create a new Shard Map Manager. 此处提供了一个示例。An example is available here.

获取 RangeShardMap 或 ListShardMapGet a RangeShardMap or ListShardMap

创建分片映射管理器以后,可以使用 TryGetRangeShardMap(Java.NET)、TryGetListShardMap(Java.NET)或 GetShardMap(Java.NET)方法获取 RangeShardMap(Java.NET)或 ListShardMap(Java.NET)。After creating a shard map manager, you can get the RangeShardMap (Java, .NET) or ListShardMap (Java, .NET) using the TryGetRangeShardMap (Java, .NET), the TryGetListShardMap (Java, .NET), or the GetShardMap (Java, .NET) method.

// Creates a new Range Shard Map with the specified name, or gets the Range Shard Map if it already exists.
static <T> RangeShardMap<T> createOrGetRangeShardMap(ShardMapManager shardMapManager,
            String shardMapName,
            ShardKeyType keyType) {
    // Try to get a reference to the Shard Map.
    ReferenceObjectHelper<RangeShardMap<T>> refRangeShardMap = new ReferenceObjectHelper<>(null);
    boolean isGetSuccess = shardMapManager.tryGetRangeShardMap(shardMapName, keyType, refRangeShardMap);
    RangeShardMap<T> shardMap = refRangeShardMap.argValue;

    if (isGetSuccess && shardMap != null) {
        ConsoleUtils.writeInfo("Shard Map %1$s already exists", shardMap.getName());
    }
    else {
        // The Shard Map does not exist, so create it
        try {
            shardMap = shardMapManager.createRangeShardMap(shardMapName, keyType);
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        ConsoleUtils.writeInfo("Created Shard Map %1$s", shardMap.getName());
    }

    return shardMap;
}
// Creates a new Range Shard Map with the specified name, or gets the Range Shard Map if it already exists.
public static RangeShardMap<T> CreateOrGetRangeShardMap<T>(ShardMapManager shardMapManager, string shardMapName)
{
    // Try to get a reference to the Shard Map.
    RangeShardMap<T> shardMap;
    bool shardMapExists = shardMapManager.TryGetRangeShardMap(shardMapName, out shardMap);

    if (shardMapExists)
    {
        ConsoleUtils.WriteInfo("Shard Map {0} already exists", shardMap.Name);
    }
    else
    {
        // The Shard Map does not exist, so create it
        shardMap = shardMapManager.CreateRangeShardMap<T>(shardMapName);
        ConsoleUtils.WriteInfo("Created Shard Map {0}", shardMap.Name);
    }

    return shardMap;
}

分片映射管理凭据Shard map administration credentials

用于管理和操作分片映射的应用程序不同于那些使用分片映射路由连接的应用程序。Applications that administer and manipulate shard maps are different from those that use the shard maps to route connections.

若要管理分片映射(添加或更改分片、分片映射等),必须使用在 GSM 数据库和用作分片的每个数据库上都具有读/写权限的凭据实例化 ShardMapManager 。To administer shard maps (add or change shards, shard maps, shard mappings, etc.) you must instantiate the ShardMapManager using credentials that have read/write privileges on both the GSM database and on each database that serves as a shard. 在输入或更改分片映射信息时,这些凭据必须允许编写 GSM 和 LSM 中的表,以及在新分片上创建 LSM 表。The credentials must allow for writes against the tables in both the GSM and LSM as shard map information is entered or changed, as well as for creating LSM tables on new shards.

请参阅用于访问弹性数据库客户端库的凭据See Credentials used to access the Elastic Database client library.

仅元数据受影响Only metadata affected

用于填充或更改 ShardMapManager 数据的方法不会更改存储在分片本身中的用户数据。Methods used for populating or changing the ShardMapManager data do not alter the user data stored in the shards themselves. 例如,类似于 CreateShard、DeleteShard、UpdateMapping 等的方法仅影响分片映射元数据 。For example, methods such as CreateShard, DeleteShard, UpdateMapping, etc. affect the shard map metadata only. 不会删除、添加或更改分片中所包含的用户数据。They do not remove, add, or alter user data contained in the shards. 但是,这些方法旨在与你执行的单独操作结合使用,以创建或删除实际数据库,或者将行从一个分片移动到另一个分片,以使分片环境恢复均衡。Instead, these methods are designed to be used in conjunction with separate operations you perform to create or remove actual databases, or that move rows from one shard to another to rebalance a sharded environment. (弹性数据库工具附带的拆分-合并工具将使用这些 API 并安排在分片之间移动实际数据。)请参阅使用弹性数据库拆分/合并工具进行缩放(The split-merge tool included with elastic database tools makes use of these APIs along with orchestrating actual data movement between shards.) See Scaling using the Elastic Database split-merge tool.

依赖于数据的路由Data dependent routing

分片映射管理器主要由需要数据库连接的应用程序用来执行特定于应用的数据操作。The shard map manager is used in applications that require database connections to perform the app-specific data operations. 这些连接必须与正确的数据库关联。Those connections must be associated with the correct database. 这称为 依赖于数据的路由This is known as Data Dependent Routing. 对于这些应用程序,通过使用在 GSM 数据库上具有只读访问权限的凭据,实例化来自工厂的分片映射管理器对象。For these applications, instantiate a shard map manager object from the factory using credentials that have read-only access on the GSM database. 以后,单独的连接请求将提供连接相应分片数据库时所需的凭据。Individual requests for later connections supply credentials necessary for connecting to the appropriate shard database.

请注意,这些应用程序(使用具有只读权限的凭据打开的 ShardMapManager)无法对映射进行更改。Note that these applications (using ShardMapManager opened with read-only credentials) cannot make changes to the maps or mappings. 为了满足这些需求,请创建特定于管理的应用程序或 PowerShell 脚本,以提供如前所述的更高级别权限的凭据。For those needs, create administrative-specific applications or PowerShell scripts that supply higher-privileged credentials as discussed earlier. 请参阅用于访问弹性数据库客户端库的凭据See Credentials used to access the Elastic Database client library.

有关详细信息,请参阅数据依赖型路由For more information, see Data dependent routing.

修改分片映射Modifying a shard map

可采用不同方式更改分片映射。A shard map can be changed in different ways. 以下所有方法都可修改用于描述分片及其映射的元数据,但这些方法不以物理方式修改分片内的数据,也不创建或删除实际数据库。All of the following methods modify the metadata describing the shards and their mappings, but they do not physically modify data within the shards, nor do they create or delete the actual databases. 下面所述的分片映射上的某些操作可能需要与以物理方式移动数据或添加和删除用作分片的数据库的管理操作进行协调。Some of the operations on the shard map described below may need to be coordinated with administrative actions that physically move data or that add and remove databases serving as shards.

这些方法作为构建基块一同工作,以便在分片的数据库环境中修改数据的总体分发情况。These methods work together as the building blocks available for modifying the overall distribution of data in your sharded database environment.

  • 若要添加或删除分片:请使用 shardmap(Java.NET)类的 CreateShard(Java.NET)和 DeleteShard(Java.NET)。To add or remove shards: use CreateShard (Java, .NET) and DeleteShard (Java, .NET) of the shardmap (Java, .NET) class.

    若要执行这些操作,表示目标分片的服务器和数据库必须已经存在。The server and database representing the target shard must already exist for these operations to execute. 这些方法不会对数据库本身产生任何影响,仅对分片映射上的元数据产生影响。These methods do not have any impact on the databases themselves, only on metadata in the shard map.

  • 若要创建或删除映射到分片的点或范围:请使用 RangeShardMapping(Java.NET)类的 CreateRangeMapping(Java.NET)和 DeleteMapping(Java.NET),以及 ListShardMap(Java.NET)类的 CreatePointMapping(Java.NET)。To create or remove points or ranges that are mapped to the shards: use CreateRangeMapping (Java, .NET), DeleteMapping (Java, .NET) of the RangeShardMapping (Java, .NET) class, and CreatePointMapping (Java, .NET) of the ListShardMap (Java, .NET) class.

    许多不同的点或范围可映射到相同的分片。Many different points or ranges can be mapped to the same shard. 这些方法仅影响元数据,而不会影响已显示在分片中的任何数据。These methods only affect metadata - they do not affect any data that may already be present in shards. 如果为了与 DeleteMapping 操作保持一致而需要将数据从数据库中删除,需要单独执行这些操作,但需要结合使用这些方法。If data needs to be removed from the database in order to be consistent with DeleteMapping operations, you perform those operations separately but in conjunction with using these methods.

  • 若要将现有的范围拆分为两个,或将相邻的范围合并为一个:请使用 SplitMapping(Java.NET)和 MergeMappings(Java.NET)。To split existing ranges into two, or merge adjacent ranges into one: use SplitMapping (Java, .NET) and MergeMappings (Java, .NET).

    请注意,拆分和合并操作 不更改键值要映射到的分片Note that split and merge operations do not change the shard to which key values are mapped. 拆分操作可将现有范围拆分为两个部分,但在映射到相同分片时同时保留这两个部分。A split breaks an existing range into two parts, but leaves both as mapped to the same shard. 对在已映射到相同分片的两个相邻范围进行合并操作,从而可将其合并到单个范围中。A merge operates on two adjacent ranges that are already mapped to the same shard, coalescing them into a single range. 要在分片之间移动点或范围本身,需要将 UpdateMapping 与移动的实际数据结合使用,才能进行协调。The movement of points or ranges themselves between shards needs to be coordinated by using UpdateMapping in conjunction with actual data movement. 当需要移动数据时,可以使用弹性数据库工具中随附的拆分/合并服务,以将分片映射更改与数据移动相协调。You can use the Split/Merge service that is part of elastic database tools to coordinate shard map changes with data movement, when movement is needed.

  • 若要将单独的点或范围重新映射(或移动)到不同的分片:请使用 UpdateMapping(Java.NET)。To re-map (or move) individual points or ranges to different shards: use UpdateMapping (Java, .NET).

    由于可能需要将数据从一个分片移动到另一个分片,以便与 UpdateMapping 操作保持一致,因此需要单独执行此移动,但需要结合使用这些方法。Since data may need to be moved from one shard to another in order to be consistent with UpdateMapping operations, you need to perform that movement separately but in conjunction with using these methods.

  • 若要在联机和脱机状态下执行映射:请使用 MarkMappingOffline(Java.NET)和 MarkMappingOnline(Java.NET)来控制映射的联机状态。To take mappings online and offline: use MarkMappingOffline (Java, .NET) and MarkMappingOnline (Java, .NET) to control the online state of a mapping.

    仅当映射处于“脱机”状态时才允许在分片映射上进行某些操作,其中包括 UpdateMapping 和 DeleteMapping 。Certain operations on shard mappings are only allowed when a mapping is in an "offline" state, including UpdateMapping and DeleteMapping. 当映射处于脱机状态时,基于该映射中所包含的键的数据依赖请求将返回一个错误。When a mapping is offline, a data-dependent request based on a key included in that mapping returns an error. 此外,当范围首次处于脱机状态时,所有到受影响分片的连接都会自动终止,以防止因范围的更改而导致查询出现不一致或不完整的结果。In addition, when a range is first taken offline, all connections to the affected shard are automatically killed in order to prevent inconsistent or incomplete results for queries directed against ranges being changed.

映射是 .NET 中的不可变对象。Mappings are immutable objects in .NET. 以上会更改映射的所有方法也会使代码中任何对映射的引用失效。All of the methods above that change mappings also invalidate any references to them in your code. 为了更轻松地执行操作序列来更改映射的状态,所有会更改映射的方法都将返回新的映射引用,以便能够链接操作。To make it easier to perform sequences of operations that change a mapping's state, all of the methods that change a mapping return a new mapping reference, so operations can be chained. 例如,若要在 shardmap sm 中删除包含键 25 的现有映射,可以执行以下命令:For example, to delete an existing mapping in shardmap sm that contains the key 25, you can execute the following:

    sm.DeleteMapping(sm.MarkMappingOffline(sm.GetMappingForKey(25)));

添加分片Adding a shard

对于已经存在的分片映射,应用程序通常仅需要添加新分片,以处理预期的新键或键范围数据。Applications often need to add new shards to handle data that is expected from new keys or key ranges, for a shard map that already exists. 例如,由租户 ID 分片的应用程序可能需要为新的租户预配新分片,或者在每个新的月份开始之前,每月分片的数据可能需要预配新分片。For example, an application sharded by Tenant ID may need to provision a new shard for a new tenant, or data sharded monthly may need a new shard provisioned before the start of each new month.

如果新的键值范围还不是现有映射的组成部分且无需移动数据,则添加新分片以及将新的键或范围关联到该分片非常简单。If the new range of key values is not already part of an existing mapping and no data movement is necessary, it is simple to add the new shard and associate the new key or range to that shard. 有关添加新分片的详细信息,请参阅添加新分片For details on adding new shards, see Adding a new shard.

但是,在需要移动数据的情况下,需要拆分/合并工具并结合使用必要的分片映射更新,才能安排在分片之间移动数据。For scenarios that require data movement, however, the split-merge tool is needed to orchestrate the data movement between shards in combination with the necessary shard map updates. 有关使用拆分合并工具的详细信息,请参阅拆分/合并概述For details on using the split-merge tool, see Overview of split-merge

其他资源Additional resources

尚未使用弹性数据库工具?Not using elastic database tools yet? 请查看入门指南Check out our Getting Started Guide.