Optimize storage cost in Azure Cosmos DB

APPLIES TO: SQL API, Cassandra API, Gremlin API, Table API, Azure Cosmos DB API for MongoDB

Azure Cosmos DB offers unlimited storage and throughput. Unlike throughput, which you have to provision or configure on your Azure Cosmos containers or databases, storage is billed on a consumption basis. You are billed only for the logical storage you consume, and you don't have to reserve any storage in advance. Storage automatically scales up and down based on the data that you add to or remove from an Azure Cosmos container.

Storage cost

Storage is billed in GB. Your data and indexes use local SSD-backed storage. The total storage used is equal to the storage required by the data and indexes across all the regions where you use Azure Cosmos DB. If you replicate an Azure Cosmos account across three regions, you pay the storage cost in each of those three regions. To estimate your storage requirement, see the capacity planner tool. The cost for storage in Azure Cosmos DB is CNY 2.576 per GB per month; see the Pricing page for the latest updates. You can set up alerts to track the storage used by your Azure Cosmos container. To monitor your storage, see the Monitor Azure Cosmos DB article.
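The billing rule above can be turned into a quick estimate. The following is a minimal sketch, assuming the CNY 2.576 per GB per month rate quoted above and that every region holds a full copy of the data and indexes; the function name and the sample sizes are illustrative, not part of any Azure tool.

```python
def monthly_storage_cost(data_gb, index_gb, regions, price_per_gb=2.576):
    """Estimate the monthly storage bill in CNY.

    Assumes each region stores a full copy of data plus indexes,
    and that price_per_gb is the per-GB monthly rate (check the
    Pricing page for the current value).
    """
    if regions < 1:
        raise ValueError("an account has at least one region")
    per_region_gb = data_gb + index_gb
    return per_region_gb * regions * price_per_gb


# Hypothetical workload: 100 GB of data, 15 GB of index, 3 regions.
cost = monthly_storage_cost(data_gb=100, index_gb=15, regions=3)
print(f"{cost:.2f} CNY/month")  # 888.72 CNY/month
```

Because the region count multiplies the whole bill, removing an unused region saves as much as trimming the data itself by the same fraction.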

Optimize cost with item size

Azure Cosmos DB expects the item size to be 2 MB or less for optimal performance and cost benefits. If you need an item to store more than 2 MB of data, consider redesigning the item schema. In the rare event that you cannot redesign the schema, you can split the item into subitems and link them logically with a common identifier (ID). All the Azure Cosmos DB features work consistently by anchoring to that logical identifier.
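One way to picture the subitem approach is sketched below. This is an illustration only: the field names `logicalId`, `sequence`, `total`, and `data`, the 1,000,000-byte chunk budget, and the base64 encoding are all assumptions chosen for the example, not a schema that Azure Cosmos DB prescribes.

```python
import base64
import uuid

# Assumed budget per chunk. Base64 inflates data by about 4/3, so a
# 1,000,000-byte chunk stays comfortably below the 2 MB item limit.
MAX_CHUNK_BYTES = 1_000_000


def split_into_subitems(payload, logical_id=None, chunk_bytes=MAX_CHUNK_BYTES):
    """Split a large payload into subitems that share one logical ID."""
    logical_id = logical_id or str(uuid.uuid4())
    chunks = [payload[i:i + chunk_bytes]
              for i in range(0, len(payload), chunk_bytes)] or [b""]
    return [
        {
            "id": f"{logical_id}-{seq}",   # unique per subitem
            "logicalId": logical_id,       # common identifier linking the parts
            "sequence": seq,               # position used for reassembly
            "total": len(chunks),
            "data": base64.b64encode(chunk).decode("ascii"),
        }
        for seq, chunk in enumerate(chunks)
    ]


def reassemble(subitems):
    """Rebuild the original payload from subitems in any order."""
    ordered = sorted(subitems, key=lambda s: s["sequence"])
    return b"".join(base64.b64decode(s["data"]) for s in ordered)
```

Using the logical ID as the partition key would keep all parts of one document in the same logical partition, so a single-partition query can fetch them together; that is a design choice for this sketch, not a requirement.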

Optimize cost with indexing

By default, the data is automatically indexed, which can increase the total storage consumed. However, you can apply custom indexing policies to reduce this overhead. Automatic indexing that has not been tuned through policy takes about 10-20% of the item size. By removing or customizing indexing policies, you avoid paying extra for writes and don't require additional throughput capacity. See Indexing in Azure Cosmos DB to configure custom indexing policies. If you have worked with relational databases before, you may think that "index everything" means doubling the storage or more. However, in Azure Cosmos DB, in the median case, it's much lower. The storage overhead of the index is typically low (10-20%) even with automatic indexing, because it is designed for a low storage footprint. By managing the indexing policy, you can control the tradeoff between index footprint and query performance in a more fine-grained manner.
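As a rough illustration of that tradeoff, the sketch below builds an indexing policy document that indexes only a few paths and excludes everything else, and estimates the index size from the 10-20% figure above. The helper names and the sample paths `/customerId/?` and `/orderDate/?` are hypothetical; the policy shape follows the JSON indexing policy format, but check the indexing documentation before applying it to a real container.

```python
import json


def selective_indexing_policy(indexed_paths):
    """Build a policy that indexes only the given paths."""
    return {
        "indexingMode": "consistent",
        "automatic": True,
        "includedPaths": [{"path": p} for p in indexed_paths],
        "excludedPaths": [{"path": "/*"}],  # everything not listed above
    }


def estimated_index_gb(data_gb, overhead=(0.10, 0.20)):
    """Return a (low, high) index size estimate for untuned automatic indexing."""
    low, high = overhead
    return data_gb * low, data_gb * high


# Hypothetical query pattern: filter on customerId and orderDate only.
policy = selective_indexing_policy(["/customerId/?", "/orderDate/?"])
print(json.dumps(policy, indent=2))
```

Queries that filter on a path outside the included list would no longer be served by the index, so the paths should mirror the actual query patterns.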

Optimize cost with time to live and change feed

Once you no longer need the data, you can gracefully delete it from your Azure Cosmos account by using time to live, or you can use change feed to migrate the old data to another data store such as Azure Blob storage or Azure data warehouse. With time to live (TTL), Azure Cosmos DB can delete items automatically from a container after a certain time period. You can set time to live at the container level and override the value on a per-item basis. After you set the TTL at the container or item level, Azure Cosmos DB automatically removes these items once the time period has elapsed since they were last modified. By using change feed, you can migrate data either to another container in Azure Cosmos DB or to an external data store. The migration requires zero downtime, and when you are done migrating, you can either delete the source Azure Cosmos container or configure time to live to remove its items.
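The interaction between the container-level default and the per-item override can be sketched as a small function. This is a model of the rules, not SDK code: it assumes items carry their last-modified time in the `_ts` property (epoch seconds), that the per-item override is a property named `ttl`, that a value of -1 means "never expire", and that item-level values are ignored when the container has no default TTL configured.

```python
def expiry_timestamp(item, container_default_ttl):
    """Return the epoch second at which an item expires, or None if never.

    container_default_ttl: None (TTL not enabled on the container),
    -1 (enabled, items do not expire by default), or a number of seconds.
    """
    if container_default_ttl is None:
        return None  # TTL disabled: any per-item ttl is ignored
    ttl = item.get("ttl", container_default_ttl)
    if ttl == -1:
        return None  # explicitly marked as never expiring
    # The clock restarts whenever the item is modified, because _ts changes.
    return item["_ts"] + ttl


# Hypothetical item last modified at t=1_000 with a one-hour override.
item = {"id": "session-1", "_ts": 1_000, "ttl": 3_600}
print(expiry_timestamp(item, container_default_ttl=86_400))  # 4600
```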

Optimize cost with rich media data types

If you want to store rich media types, for example videos and images, you have a number of options in Azure Cosmos DB. One option is to store these rich media types as Azure Cosmos items. There is a 2-MB limit per item, and you can work around this limit by chaining the data item into multiple subitems. Alternatively, you can store them in Azure Blob storage and use metadata to reference them from your Azure Cosmos items. Each approach has pros and cons. The first approach gets you the best performance in terms of latency and throughput SLAs, plus turnkey multi-region distribution for the rich media data types in addition to your regular Azure Cosmos items. However, this support comes at a higher price. By storing media in Azure Blob storage, you could lower your overall costs. If latency is critical, you could use premium storage for the rich media files that are referenced from Azure Cosmos items. This integrates natively with CDN to serve images from the edge server at lower cost and to circumvent geo-restriction. The downside of this scenario is that you have to deal with two services, Azure Cosmos DB and Azure Blob storage, which may increase operational costs.
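For the second approach, the Azure Cosmos item holds only metadata and a pointer to the blob. A minimal sketch of such an item follows; every field name here (`type`, `contentType`, `sizeBytes`, `blobUrl`) and the placeholder URL are hypothetical, chosen only to show the shape of the reference.

```python
def media_reference_item(media_id, content_type, size_bytes, blob_url):
    """Build a small metadata item that points at media stored elsewhere."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return {
        "id": media_id,
        "type": "mediaReference",   # hypothetical discriminator field
        "contentType": content_type,
        "sizeBytes": size_bytes,    # size of the blob, not of this item
        "blobUrl": blob_url,        # where the actual bytes live
    }


# Placeholder URL; a real one would come from your storage account.
item = media_reference_item(
    "video-42", "video/mp4", 50_000_000,
    "https://example.invalid/media/video-42.mp4",
)
```

The item itself stays a few hundred bytes regardless of the media size, so the Azure Cosmos DB storage bill grows with the number of media files rather than with their total size.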

Check storage consumed

To check the storage consumption of an Azure Cosmos container, you can run a HEAD or GET request on the container and inspect the x-ms-request-quota and x-ms-request-usage headers. Alternatively, when working with the .NET SDK, you can use the DocumentSizeQuota and DocumentSizeUsage properties to get the storage consumed.

Using SDK

// Measure the item size usage (which includes the index size).
// Assumes `client` is an initialized DocumentClient (.NET SDK v2).
ResourceResponse<DocumentCollection> collectionInfo = await client.ReadDocumentCollectionAsync(UriFactory.CreateDocumentCollectionUri("db", "coll"));

Console.WriteLine("Item size quota: {0}, usage: {1}", collectionInfo.DocumentQuota, collectionInfo.DocumentUsage);
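If you read the headers from a raw HEAD or GET response instead of using the SDK, the usage value has to be parsed. The sketch below assumes the header value is a semicolon-separated list of `key=value` pairs with integer values; the sample string and its key names are made up for illustration, so inspect a real response before relying on specific keys or units.

```python
def parse_usage_header(value):
    """Parse 'k1=1;k2=2;' style header text into a dict of integers."""
    result = {}
    for part in value.split(";"):
        if "=" not in part:
            continue  # skip empty trailing segments
        key, number = part.split("=", 1)
        result[key.strip()] = int(number)
    return result


# Made-up sample value, shaped like the assumed format.
sample = "documentsSize=1218;documentsCount=5;collectionSize=1335;"
usage = parse_usage_header(sample)
print(usage["collectionSize"])  # 1335
```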

Next steps

Next, you can proceed to learn more about cost optimization in Azure Cosmos DB with the following articles: