Merge policy

The merge policy defines whether and how extents (data shards) in the Kusto cluster are merged.

There are two types of merge operations: Merge, which rebuilds indexes, and Rebuild, which completely reingests the data.

Both operation types produce a single extent that replaces the source extents.

By default, Rebuild operations are preferred. If some extents don't meet the criteria for being rebuilt, an attempt is made to merge them.

Note

  • Tagging extents with different drop-by tags prevents those extents from being merged, even if a merge policy has been set. For more information, see Extent Tagging.
  • Extents whose union of tags exceeds 1M characters in length will not be merged.
  • The database's or table's Sharding policy also has some effect on how extents get merged.

Merge policy properties

The merge policy contains the following properties:

  • RowCountUpperBoundForMerge:
    • Defaults:
      • 0 (unlimited) for policies that were set before June 2020.
      • 16,000,000 for policies that were set starting June 2020.
    • Maximum allowed row count of the merged extent.
    • Applies to Merge operations, not Rebuild.
  • OriginalSizeMBUpperBoundForMerge:
    • Defaults to 0 (unlimited).
    • Maximum allowed original size (in MB) of the merged extent.
    • Applies to Merge operations, not Rebuild.
  • MaxExtentsToMerge:
    • Defaults to 100.
    • Maximum allowed number of extents to be merged in a single operation.
    • Applies to Merge operations.
  • LoopPeriod:
    • Defaults to 01:00:00 (1 hour).
    • The maximum time the Data Management service waits between starting two consecutive iterations of merge or rebuild operations.
    • Applies to both Merge and Rebuild operations.
  • AllowRebuild:
    • Defaults to 'true'.
    • Defines whether Rebuild operations are enabled (in which case, they're preferred over Merge operations).
  • AllowMerge:
    • Defaults to 'true'.
    • Defines whether Merge operations are enabled (in which case, they're less preferred than Rebuild operations).
  • MaxRangeInHours:
    • Defaults to 8.
    • Maximum allowed difference, in hours, between the creation times of any two extents for them to still be merged.
    • Timestamps are those of extent creation, and don't relate to the actual data contained in the extents.
    • Applies to both Merge and Rebuild operations.
    • This value should be set according to the effective retention policy SoftDeletePeriod or cache policy DataHotSpan values. Take the lower of SoftDeletePeriod and DataHotSpan, and set MaxRangeInHours to 2-3% of that value. See the MaxRangeInHours examples.

MaxRangeInHours examples

min(SoftDeletePeriod (Retention Policy), DataHotSpan (Cache Policy))   Max Range in hours (Merge Policy)
7 days (168 hours)                                                     4
14 days (336 hours)                                                    8
30 days (720 hours)                                                    18
60 days (1,440 hours)                                                  36
90 days (2,160 hours)                                                  60
180 days (4,320 hours)                                                 120
365 days (8,760 hours)                                                 250
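
The 2-3% guideline behind these values can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of any Kusto tooling: the function name max_range_in_hours_bounds and the dictionary documented_examples are hypothetical names introduced here for illustration, and the values are copied from the examples above.

```python
def max_range_in_hours_bounds(soft_delete_hours, data_hot_span_hours):
    """Return the (low, high) bounds, in hours, suggested for MaxRangeInHours.

    Illustrative helper only: takes the lower of the two periods and
    applies the 2-3% guideline described in the text.
    """
    base = min(soft_delete_hours, data_hot_span_hours)
    return 0.02 * base, 0.03 * base


# Values from the examples: min(SoftDeletePeriod, DataHotSpan) in hours
# mapped to the suggested MaxRangeInHours.
documented_examples = {
    168: 4,
    336: 8,
    720: 18,
    1440: 36,
    2160: 60,
    4320: 120,
    8760: 250,
}

for period_hours, suggested in documented_examples.items():
    low, high = max_range_in_hours_bounds(period_hours, period_hours)
    # Every documented value falls inside the 2-3% window.
    assert low <= suggested <= high
    print(f"{period_hours:>5} h -> {low:.1f} to {high:.1f} h (documented: {suggested})")
```

For instance, with a SoftDeletePeriod of 60 days (1,440 hours) and a DataHotSpan of 30 days (720 hours), the lower value is 720 hours, so the sketch yields a window of roughly 14.4 to 21.6 hours, which contains the documented value of 18.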

Warning

Consult with the Azure Data Explorer team before altering an extents merge policy.

When a database is created, it's set with the default merge policy values mentioned above. By default, all tables created in the database inherit that policy, unless their policies are explicitly overridden at the table level.
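
The defaults and the inheritance rule can be pictured with a small model. The following Python sketch is only an illustration: the property names and default values come from the list above (using the default for policies set starting June 2020), while DEFAULT_MERGE_POLICY and effective_merge_policy are hypothetical names, and treating a table-level override as replacing the whole policy object is an assumption rather than documented behavior.

```python
import json

# Default values as listed in the property descriptions
# (RowCountUpperBoundForMerge uses the default for policies set from June 2020).
DEFAULT_MERGE_POLICY = {
    "RowCountUpperBoundForMerge": 16000000,
    "OriginalSizeMBUpperBoundForMerge": 0,  # 0 means unlimited
    "MaxExtentsToMerge": 100,
    "LoopPeriod": "01:00:00",
    "AllowRebuild": True,
    "AllowMerge": True,
    "MaxRangeInHours": 8,
}


def effective_merge_policy(database_policy, table_policy=None):
    """Return the policy that applies to a table.

    Assumption for illustration: an explicit table-level policy replaces
    the database-level policy as a whole; otherwise the table inherits it.
    """
    return table_policy if table_policy is not None else database_policy


# A table with no explicit policy inherits the database defaults.
inherited = effective_merge_policy(DEFAULT_MERGE_POLICY)

# A table with an explicit override (here, a larger MaxRangeInHours).
override = dict(DEFAULT_MERGE_POLICY, MaxRangeInHours=18)
overridden = effective_merge_policy(DEFAULT_MERGE_POLICY, override)

print(json.dumps(overridden, indent=2))
```

The sketch only models which policy object is in effect; it does not issue any command against a cluster.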

For more information, see control commands that allow you to manage merge policies for databases or tables.