Optimize the cost of reads and writes in Azure Cosmos DB

This article describes how the cost of reading and writing data in Azure Cosmos DB is calculated. Read operations include point reads and queries. Write operations include inserting, replacing, deleting, and upserting items.

Cost of reads and writes

Azure Cosmos DB guarantees predictable performance in terms of throughput and latency by using a provisioned throughput model. Provisioned throughput is expressed in Request Units per second, or RU/s. A Request Unit (RU) is a logical abstraction over the compute resources, such as CPU, memory, and IO, that are required to perform a request. The provisioned throughput (RUs) is set aside and dedicated to your container or database to provide predictable throughput and latency. Provisioned throughput enables Azure Cosmos DB to provide predictable and consistent performance, guaranteed low latency, and high availability at any scale. Request units act as a normalized currency that simplifies reasoning about how many resources an application needs.

You don't have to differentiate request units between reads and writes. Because request units are a unified currency, the same throughput capacity can be used interchangeably for both reads and writes. The following table shows the cost, in RUs, of one point read and one write for items that are 1 KB and 100 KB in size.

Item size | Cost of one point read | Cost of one write
1 KB      | 1 RU                   | 5 RUs
100 KB    | 10 RUs                 | 50 RUs

A point read of a 1-KB item costs one RU. Writing a 1-KB item costs five RUs. These read and write costs apply when using the default session consistency level. Factors that affect RU consumption include item size, property count, data consistency, indexed properties, indexing, and query patterns.
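As a rough illustration of how these per-operation costs add up to a throughput requirement, the following Python sketch multiplies the 1-KB figures from the table by an operation rate. The function name and the example workload are hypothetical, and the result is only a first approximation: it assumes 1-KB items, session consistency, and no queries.

```python
# Per-operation costs for 1 KB items, taken from the table above
# (default session consistency assumed).
POINT_READ_RU_1KB = 1
WRITE_RU_1KB = 5


def estimate_ru_per_second(reads_per_sec: int, writes_per_sec: int) -> int:
    """Rough RU/s needed for a workload of 1 KB point reads and writes."""
    return reads_per_sec * POINT_READ_RU_1KB + writes_per_sec * WRITE_RU_1KB


# Hypothetical workload: 500 point reads and 100 writes per second.
required = estimate_ru_per_second(reads_per_sec=500, writes_per_sec=100)
print(required)  # 500 * 1 + 100 * 5 = 1000
```

Because reads and writes draw on the same RU budget, a single number covers both kinds of operations.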

Point reads cost significantly fewer RUs than queries. Unlike queries, point reads don't need the query engine to access data, which saves RUs. The RU charge for a query depends on the complexity of the query and the number of items that the query engine has to load.
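One way to see the difference is to fetch the same item both ways and compare the reported charge. The sketch below assumes the `azure-cosmos` Python SDK, where `container` is a `ContainerProxy` and the charge of the most recent request is exposed through the `x-ms-request-charge` response header. The helper names are hypothetical, and for multi-page queries the header reflects only the last page fetched.

```python
def point_read_charge(container, item_id, partition_key):
    """Read one item by id and partition key; return (item, RU charge)."""
    item = container.read_item(item=item_id, partition_key=partition_key)
    headers = container.client_connection.last_response_headers
    return item, float(headers["x-ms-request-charge"])


def query_by_id_charge(container, item_id, partition_key):
    """Fetch the same item through the query engine; return (items, RU charge)."""
    items = list(
        container.query_items(
            query="SELECT * FROM c WHERE c.id = @id",
            parameters=[{"name": "@id", "value": item_id}],
            partition_key=partition_key,
        )
    )
    # Charge of the last page fetched; a single-item result fits in one page.
    headers = container.client_connection.last_response_headers
    return items, float(headers["x-ms-request-charge"])
```

When both the item id and the partition key are known, preferring `read_item` over an equivalent query is usually the cheaper choice.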

Optimize the cost of writes and reads

When you perform write operations, provision enough capacity to support the number of writes needed per second. You can increase the provisioned throughput by using the SDK, the portal, or the CLI before performing the writes, and then reduce it after the writes are completed. Assuming no other workloads are running, your throughput for the write period is the minimum throughput needed for the given data plus the throughput required for the insert workload.
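The scale-up, write, scale-down pattern can be sketched as a small context manager. This assumes the `azure-cosmos` Python SDK, a container with manually provisioned (non-autoscale) throughput, and the `get_throughput` and `replace_throughput` methods of `ContainerProxy`. The helper names and the numbers in the usage note are hypothetical.

```python
from contextlib import contextmanager


def write_period_throughput(minimum_ru_s: int, insert_ru_s: int) -> int:
    """Minimum throughput for the stored data plus the insert workload."""
    return minimum_ru_s + insert_ru_s


@contextmanager
def temporarily_scaled(container, target_ru_s: int):
    """Raise the container's throughput for the duration of a write burst."""
    original = container.get_throughput().offer_throughput
    container.replace_throughput(target_ru_s)
    try:
        yield
    finally:
        # Scale back down even if the writes fail part-way through.
        container.replace_throughput(original)
```

A bulk load might then run inside `with temporarily_scaled(container, write_period_throughput(400, 5000)):`, so that the extra capacity is paid for only while the writes are in progress.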

If you are running other workloads concurrently, for example, queries, reads, updates, or deletes, add the request units required for those operations too. If the write operations are rate-limited, you can customize the retry/backoff policy by using the Azure Cosmos DB SDKs. For instance, you can increase the load until a small fraction of requests gets rate-limited. If rate limiting occurs, the client application should back off for the specified retry interval before retrying the rate-limited requests. Leave at least a minimal time gap between retries. Retry policy support is included in the SQL .NET, Java, Node.js, and Python SDKs, and in all supported versions of the .NET Core SDKs.
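The back-off behavior described above can be sketched in a few lines of Python. This is a generic illustration, not the SDKs' built-in retry policy: `RateLimitedError` is a hypothetical stand-in for a rate-limited (HTTP 429) response that carries the server-suggested retry interval, and `write_with_backoff` is a hypothetical helper.

```python
import time


class RateLimitedError(Exception):
    """Hypothetical stand-in for a rate-limited (HTTP 429) response."""

    def __init__(self, retry_after_ms: int):
        super().__init__(f"rate limited, retry after {retry_after_ms} ms")
        self.retry_after_ms = retry_after_ms


def write_with_backoff(write, max_retries: int = 5, sleep=time.sleep):
    """Call write(); on rate limiting, wait the indicated interval and retry."""
    for attempt in range(max_retries + 1):
        try:
            return write()
        except RateLimitedError as err:
            if attempt == max_retries:
                raise  # Give up and surface the error to the caller.
            # Back off for the interval the service asked for.
            sleep(err.retry_after_ms / 1000.0)
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. In practice, the retry options built into the SDKs are usually the first thing to tune before writing custom logic like this.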

You can also use Azure Data Factory to bulk insert data into Azure Cosmos DB or to copy data from any supported source data store to Azure Cosmos DB. Azure Data Factory natively integrates with the Azure Cosmos DB Bulk API to provide the best performance when you write data.

Next steps

Next, you can learn more about cost optimization in Azure Cosmos DB with the following articles: