Partitioning and horizontal scaling in Azure Cosmos DB

This article explains physical and logical partitions in Azure Cosmos DB. It also discusses best practices for scaling and partitioning.

Logical partitions

A logical partition consists of a set of items that have the same partition key. For example, in a container where all items contain a City property, you can use City as the partition key for the container. Groups of items that have specific values for City, such as London, Paris, and NYC, form distinct logical partitions. You don't have to worry about deleting a partition when the underlying data is deleted.
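The grouping described above can be sketched in a few lines of plain Python. This is only an illustration of the concept, not the service's implementation: the `group_by_partition_key` helper and the sample items are hypothetical, and no Azure SDK is involved.

```python
from collections import defaultdict


def group_by_partition_key(items, partition_key="City"):
    """Group items into logical partitions by their partition key value.

    Hypothetical helper for illustration only; Azure Cosmos DB performs
    this grouping internally.
    """
    partitions = defaultdict(list)
    for item in items:
        partitions[item[partition_key]].append(item)
    return dict(partitions)


# Sample items that all carry a City property (made-up data).
items = [
    {"id": "1", "City": "London"},
    {"id": "2", "City": "Paris"},
    {"id": "3", "City": "London"},
    {"id": "4", "City": "NYC"},
]

logical_partitions = group_by_partition_key(items)
for city, members in logical_partitions.items():
    print(city, [m["id"] for m in members])
```

Each distinct City value yields its own group, which mirrors how items sharing a partition key value belong to the same logical partition.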

In Azure Cosmos DB, a container is the fundamental unit of scalability. Data that's added to the container and the throughput that you provision on the container are automatically (horizontally) partitioned across a set of logical partitions. Data and throughput are partitioned based on the partition key that you specify for the Azure Cosmos container. For more information, see Create an Azure Cosmos container.

A logical partition also defines the scope of database transactions. You can update items within a logical partition by using a transaction with snapshot isolation. When new items are added to a container, the system transparently creates new logical partitions.
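Because a transaction is scoped to a single logical partition, a client can check up front whether a group of items shares one partition key value. The following is a minimal sketch under that assumption; `can_run_in_one_transaction` is a hypothetical helper, not part of any Azure Cosmos DB SDK.

```python
def can_run_in_one_transaction(items, partition_key="City"):
    """Return True when every item has the same partition key value.

    Hypothetical pre-check: items that span several logical partitions
    cannot be updated together in one transaction. An empty list
    returns False because there is nothing to update.
    """
    distinct_values = {item[partition_key] for item in items}
    return len(distinct_values) == 1


same_partition = [
    {"id": "1", "City": "London"},
    {"id": "3", "City": "London"},
]
mixed_partitions = [
    {"id": "1", "City": "London"},
    {"id": "2", "City": "Paris"},
]

print(can_run_in_one_transaction(same_partition))
print(can_run_in_one_transaction(mixed_partitions))
```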

Physical partitions

An Azure Cosmos container is scaled by distributing data and throughput across a large number of logical partitions. Internally, one or more logical partitions are mapped to a physical partition that consists of a set of replicas, also referred to as a replica set. Each replica set hosts an instance of the Azure Cosmos DB database engine. A replica set makes the data stored within the physical partition durable, highly available, and consistent. A physical partition supports a maximum amount of storage and request units (RUs). Each replica that makes up the physical partition inherits the partition's storage quota. All replicas of a physical partition collectively support the throughput that's allocated to the physical partition.
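To make the many-to-one relationship concrete, the sketch below assigns logical partitions to a small number of physical partitions by hashing the partition key value. The actual mapping is internal to the service and managed by it; the SHA-256 hash, the modulo step, and the `assign_physical_partition` name are assumptions chosen purely for illustration.

```python
import hashlib


def assign_physical_partition(partition_key_value, physical_partition_count):
    """Map a partition key value to a physical partition index.

    Illustrative assumption: hash the key and take it modulo the number
    of physical partitions. This is not the service's real algorithm.
    """
    digest = hashlib.sha256(partition_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % physical_partition_count


cities = ["London", "Paris", "NYC", "Tokyo", "Berlin"]
mapping = {city: assign_physical_partition(city, 2) for city in cities}

# Several logical partitions (cities) can land on the same physical partition.
for city, physical_index in mapping.items():
    print(f"{city} -> physical partition {physical_index}")
```

The same key value always maps to the same index, so all items of one logical partition stay together, while several logical partitions can share a physical partition.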

The following image shows how logical partitions are mapped to physical partitions that are distributed across multiple regions:

[Image: an illustration of Azure Cosmos DB partitioning]

Throughput provisioned for a container is divided evenly among physical partitions. A partition key design that doesn't distribute the throughput requests evenly might create "hot" partitions. Hot partitions might result in rate-limiting, inefficient use of the provisioned throughput, and higher costs.
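A small arithmetic sketch shows why an uneven key design hurts even when total demand fits within the provisioned throughput. All numbers, partition labels, and function names below are hypothetical; the only rule taken from the text is that provisioned throughput is split evenly among physical partitions.

```python
def per_partition_throughput(provisioned_rus, physical_partition_count):
    """Even share of the provisioned RU/s that each physical partition gets."""
    return provisioned_rus / physical_partition_count


def find_hot_partitions(demand_by_partition, provisioned_rus):
    """Return the partitions whose demand exceeds their even share."""
    budget = per_partition_throughput(provisioned_rus, len(demand_by_partition))
    return {
        partition: demand
        for partition, demand in demand_by_partition.items()
        if demand > budget
    }


# Hypothetical demand in RU/s for three physical partitions.
demand = {"P1": 2_000, "P2": 2_500, "P3": 13_000}
provisioned = 18_000

budget = per_partition_throughput(provisioned, len(demand))
hot = find_hot_partitions(demand, provisioned)

print(f"share per partition: {budget:.0f} RU/s")
print(f"total demand: {sum(demand.values())} RU/s of {provisioned} provisioned")
print(f"hot partitions: {hot}")
```

In this made-up scenario the total demand stays below the provisioned throughput, yet P3 exceeds its even share and would be rate-limited while the share allotted to P1 and P2 goes partly unused.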

Unlike logical partitions, physical partitions are an internal implementation of the system. You can't control the size, placement, or count of physical partitions, and you can't control the mapping between logical partitions and physical partitions. However, you can control the number of logical partitions and the distribution of data, workload, and throughput by choosing the right logical partition key.

Next steps