Partitioning and horizontal scaling in Azure Cosmos DB

Applies to: SQL API, Cassandra API, Gremlin API, Table API, Azure Cosmos DB API for MongoDB

Azure Cosmos DB uses partitioning to scale individual containers in a database to meet the performance needs of your application. In partitioning, the items in a container are divided into distinct subsets called logical partitions. Logical partitions are formed based on the value of a partition key that is associated with each item in a container. All the items in a logical partition have the same partition key value.

For example, suppose a container holds items, and each item has a unique value for the UserID property. If UserID serves as the partition key for the items in the container and there are 1,000 unique UserID values, 1,000 logical partitions are created for the container.
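The grouping described above can be sketched in a few lines of Python. This is only an in-memory illustration, not how the service stores data; the function name group_into_logical_partitions and the sample items are made up for this example.

```python
from collections import defaultdict


def group_into_logical_partitions(items, partition_key):
    """Group items by the value of their partition key property.

    Illustrative only: each distinct value becomes one "logical partition".
    """
    partitions = defaultdict(list)
    for item in items:
        partitions[item[partition_key]].append(item)
    return dict(partitions)


# Hypothetical data: 1,000 items, each with a unique UserID value.
items = [{"id": str(n), "UserID": f"user-{n}"} for n in range(1000)]
logical_partitions = group_into_logical_partitions(items, "UserID")
print(len(logical_partitions))  # 1000
```

Because every UserID value is unique here, each logical partition holds exactly one item. A property with repeated values, such as a food group, would produce fewer and larger logical partitions.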

In addition to a partition key that determines the item's logical partition, each item in a container has an item ID (unique within a logical partition). Combining the partition key and the item ID creates the item's index, which uniquely identifies the item. Choosing a partition key is an important decision that will affect your application's performance.

This article explains the relationship between logical and physical partitions. It also discusses best practices for partitioning and gives an in-depth view of how horizontal scaling works in Azure Cosmos DB. You don't need to understand these internal details to select your partition key, but they are covered here so that you have a clear picture of how Azure Cosmos DB works.

Logical partitions

A logical partition consists of a set of items that have the same partition key. For example, in a container that contains data about food nutrition, all items contain a foodGroup property. You can use foodGroup as the partition key for the container. Groups of items that have specific values for foodGroup, such as Beef Products, Baked Products, and Sausages and Luncheon Meats, form distinct logical partitions. You don't have to worry about deleting a logical partition when the underlying data is deleted.

A logical partition also defines the scope of database transactions. You can update items within a logical partition by using a transaction with snapshot isolation. When new items are added to a container, the system transparently creates new logical partitions.

There is no limit to the number of logical partitions in your container. Each logical partition can store up to 20 GB of data. Good partition key choices have a wide range of possible values. For example, in a container where all items contain a foodGroup property, the data within the Beef Products logical partition can grow up to 20 GB. Selecting a partition key with a wide range of possible values ensures that the container is able to scale.

Physical partitions

A container is scaled by distributing data and throughput across physical partitions. Internally, one or more logical partitions are mapped to a single physical partition. Typically, smaller containers have many logical partitions but require only a single physical partition. Unlike logical partitions, physical partitions are an internal implementation of the system and are entirely managed by Azure Cosmos DB.

The number of physical partitions in your container depends on the following configuration:

  • The amount of throughput provisioned (each individual physical partition can provide a throughput of up to 10,000 request units per second).
  • The total data storage (each individual physical partition can store up to 50 GB of data).
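As a rough sketch, the two limits above imply a lower bound on the number of physical partitions a container needs. The helper below is hypothetical and only encodes the arithmetic; the actual number of physical partitions is decided by Azure Cosmos DB and can be higher than this bound.

```python
import math

# Per-physical-partition limits stated in the list above.
MAX_RU_PER_PHYSICAL_PARTITION = 10_000  # request units per second
MAX_GB_PER_PHYSICAL_PARTITION = 50      # gigabytes of storage


def min_physical_partitions(provisioned_ru_per_sec, storage_gb):
    """Lower bound on physical partitions implied by the two limits.

    Hypothetical helper for illustration; the service manages the real count.
    """
    by_throughput = math.ceil(provisioned_ru_per_sec / MAX_RU_PER_PHYSICAL_PARTITION)
    by_storage = math.ceil(storage_gb / MAX_GB_PER_PHYSICAL_PARTITION)
    return max(1, by_throughput, by_storage)


print(min_physical_partitions(18_000, 10))   # 2 (throughput is the constraint)
print(min_physical_partitions(10_000, 120))  # 3 (storage is the constraint)
```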

There is no limit to the total number of physical partitions in your container. As your provisioned throughput or data size grows, Azure Cosmos DB automatically creates new physical partitions by splitting existing ones. Physical partition splits do not impact your application's availability. After a physical partition split, all data within a single logical partition is still stored on the same physical partition. A split simply creates a new mapping of logical partitions to physical partitions.

Throughput provisioned for a container is divided evenly among physical partitions. A partition key design that doesn't distribute requests evenly might direct too many requests to a small subset of partitions, which then become "hot." Hot partitions lead to inefficient use of provisioned throughput, which might result in rate-limiting and higher costs.

You can see your container's physical partitions in the Storage section of the Metrics blade of the Azure portal:


In the above screenshot, a container has /foodGroup as the partition key. Each of the three bars in the graph represents a physical partition. In the image, a partition key range is the same as a physical partition. The selected physical partition contains three logical partitions: Beef Products, Vegetable and Vegetable Products, and Soups, Sauces, and Gravies.

If you provision a throughput of 18,000 request units per second (RU/s), then each of the three physical partitions can utilize 1/3 of the total provisioned throughput. Within the selected physical partition, the logical partition keys Beef Products, Vegetable and Vegetable Products, and Soups, Sauces, and Gravies can, collectively, utilize the physical partition's 6,000 provisioned RU/s. Because provisioned throughput is divided evenly across your container's physical partitions, it's important to choose a partition key that distributes throughput consumption evenly across logical partitions.
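The arithmetic in this example can be written out as a short sketch. The per-key demand figures below are invented for illustration and are not taken from the screenshot; only the 18,000 RU/s total and the three physical partitions come from the text.

```python
def ru_per_physical_partition(provisioned_ru_per_sec, physical_partition_count):
    """Provisioned throughput is divided evenly among physical partitions."""
    return provisioned_ru_per_sec / physical_partition_count


share = ru_per_physical_partition(18_000, 3)
print(share)  # 6000.0

# Hypothetical RU/s demand for the logical partitions that share one
# physical partition. These numbers are assumptions for this sketch.
demand = {
    "Beef Products": 2_500,
    "Vegetable and Vegetable Products": 2_000,
    "Soups, Sauces, and Gravies": 1_000,
}
total_demand = sum(demand.values())

# The logical partitions share the physical partition's budget collectively,
# so requests risk rate-limiting only once their combined demand exceeds it.
exceeds_budget = total_demand > share
print(total_demand, exceeds_budget)  # 5500 False
```

If a single key such as Beef Products alone demanded more than 6,000 RU/s, that physical partition would become hot even while the other two sat mostly idle.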


If you choose a partition key that evenly distributes throughput consumption across logical partitions, you will ensure that throughput consumption across physical partitions is balanced.

Managing logical partitions

Azure Cosmos DB transparently and automatically manages the placement of logical partitions on physical partitions to efficiently satisfy the scalability and performance needs of the container. As the throughput and storage requirements of an application increase, Azure Cosmos DB moves logical partitions to automatically spread the load across a greater number of physical partitions. You can learn more about physical partitions.

Azure Cosmos DB uses hash-based partitioning to spread logical partitions across physical partitions. It hashes the partition key value of an item, and the hashed result determines the physical partition. Azure Cosmos DB allocates the key space of partition key hashes evenly across the physical partitions.
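The general idea of hash-based partitioning can be illustrated with a minimal sketch. The hash function used by Azure Cosmos DB is not described in this article, so MD5 and the 32-bit hash space below are stand-in assumptions, and both function names are hypothetical.

```python
import hashlib

HASH_SPACE = 2**32  # assumed size of the hash key space, for illustration only


def hash_partition_key(value):
    """Map a partition key value to an integer in [0, HASH_SPACE).

    MD5 is a stand-in here, not the service's actual hash function.
    """
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def physical_partition_for(value, physical_partition_count):
    """Pick the physical partition whose equal-sized hash range holds the key."""
    return hash_partition_key(value) * physical_partition_count // HASH_SPACE


for key in ["Beef Products", "Baked Products", "Sausages and Luncheon Meats"]:
    print(key, "->", physical_partition_for(key, 3))
```

Because the mapping depends only on the partition key value, every item with the same key lands on the same physical partition, which is what keeps a logical partition together.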

Transactions (in stored procedures or triggers) are allowed only against items in a single logical partition.

You can learn more about how Azure Cosmos DB manages partitions. (You don't need to understand the internal details to build or run your applications; they are included here for curious readers.)

Replica sets

Each physical partition consists of a set of replicas, also referred to as a replica set. Each replica set hosts an instance of the database engine. A replica set makes the data stored within the physical partition durable, highly available, and consistent. Each replica that makes up the physical partition inherits the partition's storage quota. All replicas of a physical partition collectively support the throughput that's allocated to the physical partition. Azure Cosmos DB automatically manages replica sets.

Typically, smaller containers require only a single physical partition, but they still have at least 4 replicas.

The following image shows how logical partitions are mapped to physical partitions that are distributed across multiple regions:

[Image: An illustration of Azure Cosmos DB partitioning]

Choosing a partition key

A partition key has two components: the partition key path and the partition key value. For example, consider the item { "userId" : "Andrew", "worksFor": "Microsoft" }. If you choose "userId" as the partition key, the two partition key components are as follows:

  • The partition key path (for example, "/userId"). The partition key path accepts alphanumeric and underscore (_) characters. You can also use nested objects by using the standard path notation (/).

  • The partition key value (for example, "Andrew"). The partition key value can be of string or numeric types.
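To make the relationship between the two components concrete, here is a small sketch that resolves a partition key path against an item to obtain the partition key value. The helper name partition_key_value and the nested "address" item are hypothetical; this is not an SDK function.

```python
def partition_key_value(item, path):
    """Resolve a partition key path such as "/userId" against an item.

    Hypothetical helper for illustration. Nested objects are reached by
    walking each "/"-separated segment of the path in turn.
    """
    if not path.startswith("/"):
        raise ValueError("a partition key path must start with '/'")
    value = item
    for segment in path[1:].split("/"):
        value = value[segment]
    return value


item = {"userId": "Andrew", "worksFor": "Microsoft"}
print(partition_key_value(item, "/userId"))  # Andrew

# A made-up item with a nested object, addressed with path notation.
nested_item = {"id": "1", "address": {"city": "Seattle"}}
print(partition_key_value(nested_item, "/address/city"))  # Seattle
```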

To learn about the limits on throughput, storage, and length of the partition key, see the Azure Cosmos DB service quotas article.

Selecting your partition key is a simple but important design choice in Azure Cosmos DB. Once you select your partition key, it is not possible to change it in place. If you need to change your partition key, you should move your data to a new container with the desired partition key.

For all containers, your partition key should:

  • Be a property whose value does not change. If a property is your partition key, you can't update that property's value.

  • Have a high cardinality. In other words, the property should have a wide range of possible values.

  • Spread request unit (RU) consumption and data storage evenly across all logical partitions. This ensures even RU consumption and storage distribution across your physical partitions.

If you need multi-item ACID transactions in Azure Cosmos DB, you will need to use stored procedures or triggers. All JavaScript-based stored procedures and triggers are scoped to a single logical partition.

Partition keys for read-heavy containers

For most containers, the above criteria are all you need to consider when picking a partition key. For large read-heavy containers, however, you might want to choose a partition key that appears frequently as a filter in your queries. When a query includes the partition key in its filter predicate, it can be routed efficiently to only the relevant physical partitions.

If most of your workload's requests are queries and most of your queries have an equality filter on the same property, this property can be a good partition key choice. For example, if you frequently run a query that filters on UserID, then selecting UserID as the partition key would reduce the number of cross-partition queries.
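The effect on routing can be pictured with a deliberately simplified model: a query with an equality filter on the partition key visits one physical partition, and any other query fans out to all of them. The function below is a hypothetical sketch of that model and ignores everything else the query engine does.

```python
def physical_partitions_visited(filter_properties, partition_key_property,
                                physical_partition_count):
    """Simplified routing model, for illustration only.

    An equality filter on the partition key targets a single physical
    partition; otherwise the query is treated as cross-partition.
    """
    if partition_key_property in filter_properties:
        return 1
    return physical_partition_count


# Container partitioned on UserID with a hypothetical 10 physical partitions.
print(physical_partitions_visited({"UserID"}, "UserID", 10))  # 1
print(physical_partitions_visited({"city"}, "UserID", 10))    # 10
```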

However, if your container is small, you probably don't have enough physical partitions to need to worry about the performance impact of cross-partition queries. Most small containers in Azure Cosmos DB require only one or two physical partitions.

If your container could grow to more than a few physical partitions, then you should make sure you pick a partition key that minimizes cross-partition queries. Your container will require more than a few physical partitions when either of the following is true:

  • Your container will have more than 30,000 RU/s provisioned

  • Your container will store more than 100 GB of data
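The two conditions above amount to a simple check. The helper below is a hypothetical sketch that encodes only the thresholds from this list, as a planning aid rather than a statement about what the service will do.

```python
RU_THRESHOLD = 30_000      # RU/s, from the first condition above
STORAGE_THRESHOLD_GB = 100  # GB, from the second condition above


def expects_many_physical_partitions(provisioned_ru_per_sec, storage_gb):
    """True when either threshold from the list above is exceeded."""
    return (provisioned_ru_per_sec > RU_THRESHOLD
            or storage_gb > STORAGE_THRESHOLD_GB)


print(expects_many_physical_partitions(40_000, 20))  # True
print(expects_many_physical_partitions(5_000, 250))  # True
print(expects_many_physical_partitions(5_000, 20))   # False
```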

Using item ID as the partition key

If your container has a property that has a wide range of possible values, it is likely a great partition key choice. One possible example of such a property is the item ID. For small read-heavy containers or write-heavy containers of any size, the item ID is naturally a great choice for the partition key.

The system property item ID exists in every item in your container. You may have other properties that represent a logical ID of your item. In many cases, these are also great partition key choices for the same reasons as the item ID.

The item ID is a great partition key choice for the following reasons:

  • There is a wide range of possible values (one unique item ID per item).
  • Because there is a unique item ID per item, the item ID does a great job of evenly balancing RU consumption and data storage.
  • You can easily do efficient point reads, because you always know an item's partition key if you know its item ID.
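The point-read advantage can be illustrated with an in-memory stand-in for a container, where an item is looked up by the pair of partition key value and item ID. InMemoryContainer is a made-up class for this sketch and does not reflect any SDK type.

```python
class InMemoryContainer:
    """Toy container that indexes items by (partition key value, item ID)."""

    def __init__(self, partition_key_property):
        self.partition_key_property = partition_key_property
        self._items = {}

    def upsert(self, item):
        key = (item[self.partition_key_property], item["id"])
        self._items[key] = item

    def point_read(self, partition_key_value, item_id):
        """A point read needs both the partition key value and the item ID."""
        return self._items[(partition_key_value, item_id)]


# When the item ID is the partition key, knowing the ID is enough to build
# the full lookup key.
container = InMemoryContainer(partition_key_property="id")
container.upsert({"id": "42", "name": "example"})
found = container.point_read("42", "42")
print(found["name"])  # example
```

With a different partition key, such as a user ID, the caller would also have to know that value before it could do the same lookup.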

Some things to consider when selecting the item ID as the partition key include:

  • If the item ID is the partition key, it becomes a unique identifier throughout your entire container. You won't be able to have two items with the same item ID.
  • If you have a read-heavy container that has a lot of physical partitions, queries will be more efficient if they have an equality filter on the item ID.
  • You can't run stored procedures or triggers across multiple logical partitions.

Next steps