Optimize the cost of reads and writes in Azure Cosmos DB

This article describes how the cost of reading and writing data in Azure Cosmos DB is calculated. Read operations include get operations on items. Write operations include insert, replace, delete, and upsert of items.

Cost of reads and writes

Azure Cosmos DB guarantees predictable performance in terms of throughput and latency by using a provisioned throughput model. Provisioned throughput is expressed in Request Units per second (RU/s). A Request Unit (RU) is a logical abstraction over the compute resources, such as CPU, memory, and IO, that are required to perform a request. The provisioned throughput (RUs) is set aside and dedicated to your container or database to provide predictable throughput and latency. Provisioned throughput enables Azure Cosmos DB to provide predictable and consistent performance, guaranteed low latency, and high availability at any scale. Request units act as a normalized currency that simplifies reasoning about how many resources an application needs.

You don't have to differentiate request units between reads and writes. Because request units are a unified currency, the same throughput capacity can be used interchangeably for both reads and writes. The following table shows the cost of one read and one write, in RUs, for items that are 1 KB and 100 KB in size.

Item size    Cost of one read    Cost of one write
1 KB         1 RU                5 RUs
100 KB       10 RUs              50 RUs

Reading an item that is 1 KB in size costs one RU. Writing an item that is 1 KB in size costs five RUs. These read and write costs apply when using the default session consistency level. Factors that affect RU consumption include item size, property count, data consistency, indexed properties, indexing, and query patterns.
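As a rough illustration of how reads and writes draw on the same RU budget, the following sketch adds up the RU/s consumed by a mixed workload. It uses only the per-operation charges from the table above; the function name `ru_per_second` and the lookup structure are illustrative assumptions, and real charges vary with the factors listed above.

```python
# RU charges per operation, copied from the table above
# (default session consistency). Sizes not in the table are not estimated.
RU_CHARGES = {
    1: {"read": 1, "write": 5},      # 1 KB item
    100: {"read": 10, "write": 50},  # 100 KB item
}


def ru_per_second(item_size_kb: int, reads_per_sec: int, writes_per_sec: int) -> int:
    """Estimate the RU/s consumed by a mix of reads and writes.

    Hypothetical helper for illustration only. Raises KeyError for item
    sizes that the table does not list.
    """
    charges = RU_CHARGES[item_size_kb]
    return reads_per_sec * charges["read"] + writes_per_sec * charges["write"]


# 500 reads/s and 100 writes/s of 1 KB items: 500*1 + 100*5
print(ru_per_second(1, reads_per_sec=500, writes_per_sec=100))  # 1000
```

Under these assumptions, a container provisioned with 1,000 RU/s could serve this mix, or any other mix of reads and writes whose total stays within the same budget.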

Normalized cost for 1 million reads and writes

Provisioning 1,000 RU/s translates to 3.6 million RU/hour and costs CNY 0.82 for the hour (in Azure China). For a 1 KB item, this provisioned throughput allows 3.6 million reads or 0.72 million writes per hour (3.6 million RU / 5 RUs per write). Normalized to a million operations, the cost is CNY 0.228 per 1 million reads (CNY 0.82 / 3.6) and CNY 1.14 per 1 million writes (CNY 0.82 / 0.72).
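The arithmetic above can be reproduced with a short script. This is a minimal sketch that takes the hourly price and the 1 KB charges from this article as given; the variable and function names are illustrative, and the price is the figure quoted here, not a value fetched from any pricing API.

```python
PROVISIONED_RU_PER_SEC = 1_000
PRICE_PER_HOUR_CNY = 0.82    # quoted above for 1,000 RU/s in Azure China
READ_RU, WRITE_RU = 1, 5     # charges for a 1 KB item

ru_per_hour = PROVISIONED_RU_PER_SEC * 3_600   # 3,600,000 RU/hour
reads_per_hour = ru_per_hour // READ_RU        # 3,600,000 reads
writes_per_hour = ru_per_hour // WRITE_RU      # 720,000 writes


def cost_per_million(ops_per_hour: int) -> float:
    """Hourly price divided by the number of millions of operations per hour."""
    return PRICE_PER_HOUR_CNY / (ops_per_hour / 1_000_000)


print(round(cost_per_million(reads_per_hour), 3))   # 0.228
print(round(cost_per_million(writes_per_hour), 2))  # 1.14
```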

Number of regions and the request units cost

The cost of writes is constant irrespective of the number of regions associated with the Azure Cosmos account. In other words, a 1 KB write costs five RUs regardless of how many regions are associated with the account. Even so, a non-trivial amount of resources is spent replicating, accepting, and processing the replication traffic in every region. For details about multi-region cost optimization, see the article Optimizing the cost of multi-region Cosmos accounts.

Optimize the cost of writes and reads

When you perform write operations, you should provision enough capacity to support the number of writes needed per second. You can increase the provisioned throughput by using the SDK, the portal, or the CLI before performing the writes, and then reduce the throughput after the writes are completed. Your throughput for the write period is the minimum throughput needed for the given data, plus the throughput required for the insert workload, assuming no other workloads are running.
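To make the sizing rule concrete, here is a minimal sketch that estimates the throughput to provision for a write period. It assumes every write costs a fixed number of RUs (five for a 1 KB item, per the table earlier in this article) and that no other workloads are running. The function name, its parameters, and the example figures are hypothetical.

```python
import math


def write_period_throughput(
    minimum_ru_per_sec: int,
    items_to_write: int,
    window_seconds: int,
    ru_per_write: int = 5,  # charge for a 1 KB item; adjust for other sizes
) -> int:
    """Minimum throughput for the data plus the RU/s needed to finish
    the insert workload within the given time window."""
    insert_ru_per_sec = math.ceil(items_to_write * ru_per_write / window_seconds)
    return minimum_ru_per_sec + insert_ru_per_sec


# Hypothetical example: 3.6 million 1 KB items in one hour on top of a
# 400 RU/s minimum is 1,000 writes/s * 5 RUs + 400.
print(write_period_throughput(400, 3_600_000, 3_600))  # 5400
```

Once the writes complete, the provisioned throughput can be lowered again toward the minimum, which is where the cost saving comes from.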

If you are running other workloads concurrently, for example query, read, update, or delete operations, you should also add the request units required for those operations. If the write operations are rate-limited, you can customize the retry/backoff policy by using the Azure Cosmos DB SDKs. For instance, you can increase the load until a small fraction of requests gets rate-limited. If rate-limiting occurs, the client application should back off on the rate-limited requests for the specified retry interval. Before retrying writes, allow at least a minimal time gap between retries. Retry policy support is included in the SQL .NET, Java, Node.js, and Python SDKs and in all supported versions of the .NET Core SDKs.
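The back-off behavior described above can be sketched generically. The example below does not use the Azure Cosmos DB SDKs, which ship their own retry policies; `RateLimitedError`, `write_with_backoff`, and the `retry_after_seconds` attribute are hypothetical stand-ins that show the idea of waiting for the specified retry interval before trying a rate-limited write again.

```python
import time


class RateLimitedError(Exception):
    """Hypothetical stand-in for a rate-limited response that carries
    the retry interval suggested by the service."""

    def __init__(self, retry_after_seconds: float):
        super().__init__("request was rate-limited")
        self.retry_after_seconds = retry_after_seconds


def write_with_backoff(write, max_retries: int = 5, sleep=time.sleep):
    """Call write(); on rate limiting, wait for the suggested interval
    and retry, giving up after max_retries retries."""
    retries = 0
    while True:
        try:
            return write()
        except RateLimitedError as err:
            retries += 1
            if retries > max_retries:
                raise  # retries exhausted; surface the error to the caller
            sleep(err.retry_after_seconds)


# Demo: a write that is rate-limited twice, then succeeds.
calls = {"n": 0}


def flaky_write():
    calls["n"] += 1
    if calls["n"] <= 2:
        raise RateLimitedError(retry_after_seconds=0.01)
    return "written"


waits = []  # record the waits instead of actually sleeping
print(write_with_backoff(flaky_write, sleep=waits.append))  # written
print(waits)  # [0.01, 0.01]
```

Injecting `sleep` as a parameter is a design choice that keeps the sketch easy to test without real delays. In an application, the SDK's built-in retry options would normally be configured instead of hand-rolling this loop.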

You can also bulk insert data into Azure Cosmos DB, or copy data from any supported source data store to Azure Cosmos DB, by using Azure Data Factory. Azure Data Factory natively integrates with the Azure Cosmos DB Bulk API to provide the best performance when you write data.

Next steps

Next, you can learn more about cost optimization in Azure Cosmos DB with the following articles: