Provision throughput on containers and databases

An Azure Cosmos database is a unit of management for a set of containers. A database consists of a set of schema-agnostic containers. An Azure Cosmos container is the unit of scalability for both throughput and storage. A container is horizontally partitioned across a set of machines within an Azure region and is distributed across all Azure China regions associated with your Azure Cosmos account.

With Azure Cosmos DB, you can provision throughput at two granularities:

  • Azure Cosmos containers
  • Azure Cosmos databases

Set throughput on a container

The throughput provisioned on an Azure Cosmos container is exclusively reserved for that container. The container receives the provisioned throughput all the time. The provisioned throughput on a container is financially backed by SLAs. To learn how to configure throughput on a container, see Provision throughput on an Azure Cosmos container.

Setting provisioned throughput on a container is the most frequently used option. You can elastically scale throughput for a container by provisioning any amount of throughput, expressed in Request Units (RUs).

Throughput provisioned on an Azure Cosmos container is uniformly distributed across all the logical partitions of the container. You cannot selectively specify the throughput for individual logical partitions. Because one or more logical partitions of a container are hosted by a physical partition, the physical partitions belong exclusively to the container and support the throughput provisioned on the container.

If the workload running on a logical partition consumes more than the throughput that was allocated to that logical partition, your operations are rate-limited. When rate-limiting occurs, you can either increase the provisioned throughput for the entire container or retry the operations. For more information on partitioning, see Logical partitions.
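
The "retry the operations" option above can be sketched as a small helper. This is a minimal illustration, not an SDK API: `RateLimitedError` and `retry_when_rate_limited` are hypothetical names, and exponential backoff with jitter is an assumed retry policy that the text does not prescribe.

```python
import random
import time


class RateLimitedError(Exception):
    """Hypothetical stand-in for a rate-limited (throttled) response."""


def retry_when_rate_limited(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on RateLimitedError, wait and try again.

    The wait doubles after each failed attempt (exponential backoff) and a
    random jitter is added so that concurrent callers do not retry in lockstep.
    The last failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except RateLimitedError:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
```

If retries keep failing, the other option named above applies: increase the provisioned throughput rather than retrying further.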

We recommend that you configure throughput at the container granularity when you want guaranteed performance for the container.

The following image shows how a physical partition hosts one or more logical partitions of a container:

Figure: Physical partition

Set throughput on a database

When you provision throughput on an Azure Cosmos database, the throughput is shared across all the containers (called shared database containers) in the database. The exception is any container in the database for which you specified its own provisioned throughput. Sharing the database-level provisioned throughput among its containers is analogous to hosting a database on a cluster of machines. Because all containers within a database share the resources available on a machine, you naturally do not get predictable performance on any specific container. To learn how to configure provisioned throughput on a database, see Configure provisioned throughput on an Azure Cosmos database.

Setting throughput on an Azure Cosmos database guarantees that you receive the provisioned throughput for that database all the time. Because all containers within the database share the provisioned throughput, Azure Cosmos DB doesn't provide any predictable throughput guarantees for a particular container in that database. The portion of the throughput that a specific container can receive depends on:

  • The number of containers.
  • The choice of partition keys for the various containers.
  • The distribution of the workload across the logical partitions of the containers.

We recommend that you configure throughput on a database when you want to share the throughput across multiple containers, but don't want to dedicate the throughput to any particular container.

The following examples show where provisioning throughput at the database level is preferred:

  • Sharing a database's provisioned throughput across a set of containers is useful for a multitenant application. Each user can be represented by a distinct Azure Cosmos container.

  • Sharing a database's provisioned throughput across a set of containers is useful when you migrate a NoSQL database, such as MongoDB or Cassandra, hosted on a cluster of VMs or on on-premises physical servers to Azure Cosmos DB. Think of the provisioned throughput configured on your Azure Cosmos database as a logical equivalent, but more cost-effective and elastic, of the compute capacity of your MongoDB or Cassandra cluster.

All containers created inside a database with provisioned throughput must be created with a partition key. At any given point in time, the throughput allocated to a container within a database is distributed across all the logical partitions of that container. When you have containers that share provisioned throughput configured on a database, you can't selectively apply the throughput to a specific container or a logical partition.

If the workload on a logical partition consumes more than the throughput that's allocated to that logical partition, your operations are rate-limited. When rate-limiting occurs, you can either increase the throughput for the entire database or retry the operations. For more information on partitioning, see Logical partitions.

Throughput provisioned on a database can be shared by the containers within that database. Each new container in database-level shared throughput requires 100 RU/s. When you provision containers with the shared database offering:

  • Every 25 containers are grouped into a partition set, and the database throughput (D) is shared between the containers in the partition set. If there are up to 25 containers in the database and, at any point in time, you are using only one of them, that container can use a maximum of D throughput.

  • For every new container created after the first 25, a new partition set is created and the database throughput is split between the partition sets (that is, D/2 for 2 partition sets, D/3 for 3 partition sets…). At any point in time, if you are using only one container from the database, it can use a maximum of D/2, D/3, D/4… throughput, respectively. Given the reduced throughput, it is recommended that you create no more than 25 containers in one database.

Example

  • Suppose you create a database named "MyDB" with a provisioned throughput of 10K RU/s (D).

  • If you provision 25 containers under "MyDB", all the containers are grouped into one partition set. At any point in time, if you are using only one container from the database, it can use a maximum of 10K RU/s (D).

  • When you provision the 26th container, a new partition set is created and the throughput is split equally between the two partition sets. So at any point in time, if you are using only one container from the database, it can use a maximum of 5K RU/s (D/2). Because there are two partition sets, the throughput shareability factor is split into D/2.

    The following image demonstrates the previous example graphically:

    Figure: Shareability factor of database-level throughput
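
The partition-set arithmetic from the example can be written out as a short sketch. The function names are hypothetical, and the model captures only what the text states: containers are grouped 25 to a partition set, and the database throughput D is split equally between the sets.

```python
import math

CONTAINERS_PER_PARTITION_SET = 25  # grouping size stated in the text


def partition_set_count(container_count):
    """Number of partition sets needed for the given number of containers."""
    if container_count < 1:
        raise ValueError("container_count must be at least 1")
    return math.ceil(container_count / CONTAINERS_PER_PARTITION_SET)


def max_throughput_for_single_container(database_rus, container_count):
    """Upper bound (RU/s) for one container when it is the only one in use."""
    return database_rus / partition_set_count(container_count)


print(max_throughput_for_single_container(10_000, 25))  # 10000.0 (D)
print(max_throughput_for_single_container(10_000, 26))  # 5000.0 (D/2)
```

With 51 containers the same function yields D/3, which is why the text recommends staying at 25 containers or fewer per database.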

If your workloads involve deleting and recreating all the collections in a database, we recommend that you drop the empty database and create a new database before you create the collections.

The following image shows how a physical partition can host one or more logical partitions that belong to different containers within a database:

Figure: Physical partition

Set throughput on a database and a container

You can combine the two models. Provisioning throughput on both the database and the container is allowed. The following example shows how to provision throughput on an Azure Cosmos database and a container:

  • You can create an Azure Cosmos database named Z with provisioned throughput of "K" RUs.

  • Next, create five containers named A, B, C, D, and E within the database. When creating container B, make sure to enable the "Provision dedicated throughput for this container" option and explicitly configure "P" RUs of provisioned throughput on this container. Note that you can configure shared and dedicated throughput only when creating the database and container.

    Figure: Setting the throughput at the container level

  • The "K" RUs throughput is shared across the four containers A, C, D, and E. The exact amount of throughput available to A, C, D, or E varies. There are no SLAs for each individual container's throughput.

  • The container named B is guaranteed to get the "P" RUs throughput all the time. It's backed by SLAs.
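
The combined model can be illustrated with a small data sketch. It only classifies containers as shared or dedicated; `Container`, `summarize`, and the values 1000 and 400 standing in for "K" and "P" are all hypothetical and are not SDK types or recommended settings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Container:
    name: str
    # None means the container shares the database-level throughput.
    dedicated_rus: Optional[int] = None


def summarize(database_rus, containers):
    """Split containers into those sharing the database RUs and those with their own."""
    shared = [c.name for c in containers if c.dedicated_rus is None]
    dedicated = {c.name: c.dedicated_rus for c in containers if c.dedicated_rus is not None}
    return {
        "shared_pool_rus": database_rus,
        "shared_containers": shared,
        "dedicated": dedicated,
    }


K, P = 1000, 400  # hypothetical values for "K" and "P"
containers = [Container("A"), Container("B", dedicated_rus=P),
              Container("C"), Container("D"), Container("E")]
summary = summarize(K, containers)
print(summary["shared_containers"])  # ['A', 'C', 'D', 'E']
print(summary["dedicated"])          # {'B': 400}
```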

Note

A container with provisioned throughput cannot be converted to a shared database container. Conversely, a shared database container cannot be converted to have dedicated throughput.

Update throughput on a database or a container

After you create an Azure Cosmos container or a database, you can update the provisioned throughput. There is no limit on the maximum provisioned throughput that you can configure on the database or the container. The minimum provisioned throughput depends on the following factors:

  • The maximum data size that you ever store in the container.
  • The maximum throughput that you ever provision on the container.
  • The maximum number of Azure Cosmos containers that you ever create in a database with shared throughput.

You can retrieve the minimum throughput of a container or a database programmatically by using the SDKs, or view the value in the Azure portal. When using the .NET SDK, the DocumentClient.ReplaceOfferAsync method allows you to scale the provisioned throughput value. When using the Java SDK, the RequestOptions.setOfferThroughput method allows you to scale the provisioned throughput value.

When using the .NET SDK, the DocumentClient.ReadOfferAsync method allows you to retrieve the minimum throughput of a container or a database.

You can scale the provisioned throughput of a container or a database at any time. When a scale operation increases the throughput, it can take longer because of the system tasks that provision the required resources. You can check the status of the scale operation in the Azure portal or programmatically by using the SDKs. When using the .NET SDK, you can get the status of the scale operation by using the DocumentClient.ReadOfferAsync method.

Comparison of models

Parameter | Throughput provisioned on a database | Throughput provisioned on a container
Minimum RUs | 400 (After the first four containers, each additional container requires a minimum of 100 RUs per second.) | 400
Minimum RUs per container | 100 | 400
Maximum RUs | Unlimited, on the database. | Unlimited, on the container.
RUs assigned or available to a specific container | No guarantees. RUs assigned to a given container depend on properties such as the choice of partition keys of the containers that share the throughput, the distribution of the workload, and the number of containers. | All the RUs configured on the container are exclusively reserved for the container.
Maximum storage for a container | Unlimited. | Unlimited.
Maximum throughput per logical partition of a container | 10K RUs | 10K RUs
Maximum storage (data + index) per logical partition of a container | 10 GB | 10 GB
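
The "Minimum RUs" row for a database can be read as a simple formula. The sketch below models only the container-count factor from that row; the function name is hypothetical, and the other factors that affect the minimum (the largest data size ever stored and the highest throughput ever provisioned) are deliberately left out.

```python
def min_shared_database_rus(container_count):
    """Minimum RU/s for a shared-throughput database, by container count only.

    400 RU/s covers the first four containers; each additional container
    adds 100 RU/s, as stated in the comparison table.
    """
    if container_count < 0:
        raise ValueError("container_count must not be negative")
    return 400 + 100 * max(0, container_count - 4)


print(min_shared_database_rus(4))   # 400
print(min_shared_database_rus(25))  # 2500
```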

Next steps