Design for data modification

This article focuses on the design considerations for optimizing inserts, updates, and deletes. In some cases, you need to weigh designs that optimize for querying against designs that optimize for data modification, just as you do when designing a relational database (although the techniques for managing those trade-offs are different in a relational database). The section Table design patterns describes some detailed design patterns for the Table service and highlights some of these trade-offs. In practice, you will find that many designs optimized for querying entities also work well for modifying entities.

Optimize the performance of insert, update, and delete operations

To update or delete an entity, you must be able to identify it by using its PartitionKey and RowKey values. In this respect, your choice of PartitionKey and RowKey for modifying entities should follow criteria similar to those you use to support point queries, because you want to identify entities as efficiently as possible. You do not want to run an inefficient partition or table scan just to discover the PartitionKey and RowKey values you need in order to update or delete an entity.
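To make the difference concrete, the following is a minimal sketch that models a table as an in-memory dictionary keyed by (PartitionKey, RowKey). It is not the Azure SDK: the `Table` type, the two helper functions, and the sample entities are all hypothetical and exist only to contrast a keyed update with a scan that has to discover the keys first.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory stand-in for a table: entities are indexed by
# (PartitionKey, RowKey). This is an illustration, not the Azure SDK.
Entity = Dict[str, object]
Table = Dict[Tuple[str, str], Entity]


def update_by_keys(table: Table, partition_key: str, row_key: str,
                   changes: Entity) -> bool:
    """Point update: a single keyed lookup, no scan required."""
    entity = table.get((partition_key, row_key))
    if entity is None:
        return False
    entity.update(changes)
    return True


def find_keys_by_scan(table: Table, prop: str,
                      value: object) -> Optional[Tuple[str, str]]:
    """Scan: examines entities one by one until a property matches."""
    for keys, entity in table.items():
        if entity.get(prop) == value:
            return keys
    return None


table: Table = {
    ("Sales", "000123"): {"Email": "jonesj@contoso.com", "Age": 25},
    ("Sales", "000124"): {"Email": "hallm@contoso.com", "Age": 31},
}

# Keys known up front: the update touches exactly one entity.
updated = update_by_keys(table, "Sales", "000123", {"Age": 26})

# Keys unknown: every entity may have to be inspected to find them.
scanned_keys = find_keys_by_scan(table, "Email", "hallm@contoso.com")
```

In this sketch the scan cost grows with the number of entities, while the keyed update does not, which is the reason to choose keys that the modifying code already knows.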

The following patterns in the section Table design patterns address optimizing the performance of your insert, update, and delete operations:

  • High volume delete pattern - Enable the deletion of a high volume of entities by storing all the entities for simultaneous deletion in their own separate table; you delete the entities by deleting the table.
  • Data series pattern - Store complete data series in a single entity to minimize the number of requests you make.
  • Wide entities pattern - Use multiple physical entities to store logical entities with more than 252 properties.
  • Large entities pattern - Use blob storage to store large property values.
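As an illustration of the wide entities pattern, the sketch below splits one logical entity into several physical entities that share a PartitionKey and then merges them back. The function names, the `_01`, `_02` RowKey suffix scheme, and the sample data are assumptions made for this example; only the 252-property limit comes from the text above.

```python
from typing import Dict, List

MAX_CUSTOM_PROPERTIES = 252  # limit quoted in the pattern description


def split_wide_entity(partition_key: str, base_row_key: str,
                      properties: Dict[str, object],
                      limit: int = MAX_CUSTOM_PROPERTIES
                      ) -> List[Dict[str, object]]:
    """Split a logical entity into physical entities of at most `limit`
    custom properties each. The RowKey suffix scheme is hypothetical."""
    names = sorted(properties)
    chunks = [names[i:i + limit] for i in range(0, len(names), limit)] or [[]]
    entities = []
    for index, chunk in enumerate(chunks, start=1):
        entity: Dict[str, object] = {
            "PartitionKey": partition_key,
            "RowKey": f"{base_row_key}_{index:02d}",
        }
        entity.update({name: properties[name] for name in chunk})
        entities.append(entity)
    return entities


def merge_wide_entity(entities: List[Dict[str, object]]) -> Dict[str, object]:
    """Recombine the physical entities into one logical property bag."""
    merged: Dict[str, object] = {}
    for entity in entities:
        for name, value in entity.items():
            if name not in ("PartitionKey", "RowKey"):
                merged[name] = value
    return merged


# A logical entity with 300 properties becomes two physical entities.
logical = {f"Prop{i:03d}": i for i in range(300)}
physical = split_wide_entity("Sales", "000123", logical)
```

Keeping every piece in the same partition is a deliberate choice in this sketch: it leaves open the option of writing all the pieces in one atomic transaction.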

Ensure consistency in your stored entities

The other key factor that influences your choice of keys for optimizing data modifications is how to ensure consistency by using atomic transactions. You can use an entity group transaction (EGT) only to operate on entities stored in the same partition.
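One practical consequence of the same-partition rule is that a set of pending changes has to be grouped by PartitionKey before it can be submitted atomically. The sketch below shows one way to plan that grouping. The `plan_transactions` helper is hypothetical, and the limit of 100 entities per transaction is an assumption that this section does not state; check the current service limits before relying on it.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumed per-transaction entity limit; not stated in this section.
MAX_OPERATIONS_PER_EGT = 100

Entity = Dict[str, object]


def plan_transactions(entities: Iterable[Entity],
                      batch_size: int = MAX_OPERATIONS_PER_EGT
                      ) -> List[List[Entity]]:
    """Group entities by PartitionKey, then cut each group into batches
    small enough to submit as one transaction each."""
    by_partition: Dict[object, List[Entity]] = defaultdict(list)
    for entity in entities:
        by_partition[entity["PartitionKey"]].append(entity)

    batches: List[List[Entity]] = []
    for partition_entities in by_partition.values():
        for start in range(0, len(partition_entities), batch_size):
            batches.append(partition_entities[start:start + batch_size])
    return batches


pending = (
    [{"PartitionKey": "Sales", "RowKey": f"{i:06d}"} for i in range(150)]
    + [{"PartitionKey": "Marketing", "RowKey": f"{i:06d}"} for i in range(3)]
)
batches = plan_transactions(pending)
```

Each resulting batch is atomic only within itself. Changes that end up in different batches, whether because they span partitions or exceed the batch size, are not covered by a single transaction, which is why entities that must change together are best given the same PartitionKey.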

The following patterns in the article Table design patterns address managing consistency:

  • Intra-partition secondary index pattern - Store multiple copies of each entity under different RowKey values in the same partition to enable fast, efficient lookups and alternate sort orders.
  • Inter-partition secondary index pattern - Store multiple copies of each entity under different RowKey values in separate partitions or in separate tables to enable fast, efficient lookups and alternate sort orders.
  • Eventually consistent transactions pattern - Enable eventually consistent behavior across partition boundaries or storage system boundaries by using Azure queues.
  • Index entities pattern - Maintain index entities to enable efficient searches that return lists of entities.
  • Denormalization pattern - Combine related data together in a single entity to enable you to retrieve all the data you need with a single point query.
  • Data series pattern - Store complete data series in a single entity to minimize the number of requests you make.
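The intra-partition secondary index pattern can be sketched as follows. Each logical entity is written twice in the same partition, once per lookup key, so that either key supports a point lookup and both copies can be kept consistent in one atomic transaction. The `index_copies` helper, the `empid_` and `email_` RowKey prefixes, and the sample employee are hypothetical choices for this example.

```python
from typing import Dict, List

Entity = Dict[str, object]


def index_copies(employee: Entity) -> List[Entity]:
    """Build two copies of one logical entity in the same partition,
    each with a RowKey that serves a different lookup."""
    partition_key = employee["Department"]
    by_id = dict(employee,
                 PartitionKey=partition_key,
                 RowKey=f"empid_{employee['EmployeeId']}")
    by_email = dict(employee,
                    PartitionKey=partition_key,
                    RowKey=f"email_{employee['Email']}")
    return [by_id, by_email]


copies = index_copies({
    "Department": "Sales",
    "EmployeeId": "000123",
    "Email": "jonesj@contoso.com",
    "Age": 25,
})
```

The trade-off in this sketch is that every modification has to touch both copies. Because they share a PartitionKey, the two writes are eligible for the same transaction, which is what separates this variant from the inter-partition one.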

For information about entity group transactions, see the section Entity group transactions.

Ensure your design for efficient modifications facilitates efficient queries

In many cases, a design for efficient querying results in efficient modifications, but you should always evaluate whether this is the case for your specific scenario. Some of the patterns in the article Table design patterns explicitly evaluate trade-offs between querying and modifying entities, and you should always take into account how many operations of each type you expect.

The following patterns in the article Table design patterns address trade-offs between designing for efficient queries and designing for efficient data modification:

  • Compound key pattern - Use compound RowKey values to enable a client to look up related data with a single point query.
  • Log tail pattern - Retrieve the n entities most recently added to a partition by using a RowKey value that sorts in reverse date and time order.
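For the log tail pattern, the RowKey has to sort so that newer entities come first. A minimal sketch, assuming RowKey values are compared as strings: subtract the timestamp from a fixed ceiling and zero-pad the result so that string order matches numeric order. The `CEILING` constant, the microsecond resolution, and the `log_tail_row_key` function are illustrative assumptions, not a prescribed key format.

```python
from datetime import datetime, timezone

# Hypothetical ceiling: any fixed value larger than every timestamp
# (in microseconds since the Unix epoch) that will ever be stored.
CEILING = 10**18 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def log_tail_row_key(timestamp: datetime) -> str:
    """Return a fixed-width key that sorts newest first."""
    delta = timestamp - EPOCH
    # Integer arithmetic avoids floating-point rounding of large values.
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    # Zero-padding keeps lexicographic order equal to numeric order.
    return f"{CEILING - micros:018d}"


older = log_tail_row_key(datetime(2024, 1, 1, tzinfo=timezone.utc))
newer = log_tail_row_key(datetime(2024, 1, 2, tzinfo=timezone.utc))

# Sorting ascending by key puts the most recent entry first.
ordered = sorted([older, newer])
```

With keys like these, the n most recent entities in a partition are the first n in key order. The cost sits on the modification side: two entities written within the same microsecond would collide in this sketch, so a real key would likely need a tiebreaker suffix.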