数据分区策略Data partitioning policy

分区策略定义了对于特定表,是否以及如何对盘区(数据分片)进行分区。The partitioning policy defines if and how Extents (data shards) should be partitioned, for a specific table.

该策略的主要目的是提高已知查询的性能,以缩小分区列中的值数据集,或对高基数字符串列进行聚合/联接。The main purpose of the policy is to improve performance of queries that are known to narrow the data set of values in the partitioned columns, or aggregate/join on a high cardinality string column. 该策略还可能导致更好的数据压缩。The policy may also result in better compression of the data.

注意

没有对可以定义策略的表数量设置硬编码限制。There are no hard-coded limits set on the number of tables that can have the policy defined on them. 但是,每个附加表都会增加在群集节点上运行的后台数据分区过程的开销。However, every additional table adds overhead to the background data partitioning process that runs on the cluster's nodes. 这可能会导致使用更多的群集资源。It may result in more clusters resources being used. 有关详细信息,请参阅监视容量For more information, see Monitoring and Capacity.

分区键Partition keys

支持以下类型的分区键。The following kinds of partition keys are supported.

种类Kind 列类型Column Type 分区属性Partition properties 分区值Partition value
哈希Hash string Function, MaxPartitionCount, SeedFunction, MaxPartitionCount, Seed Function(ColumnName, MaxPartitionCount, Seed)Function(ColumnName, MaxPartitionCount, Seed)
统一范围Uniform range datetime RangeSize, ReferenceRangeSize, Reference bin_at(ColumnName, RangeSize, Reference)bin_at(ColumnName, RangeSize, Reference)

哈希分区键Hash partition key

当大多数查询使用相等性筛选器(==in())时,或者当它们在大维度(基数为 10M 或更高)的特定 string 类型列(例如 application_IDtenant_IDuser_ID)上聚合/联接时,可以对表中的 string 类型列应用哈希分区键。Applying a hash partition key on a string-type column in a table is appropriate when the majority of queries use equality filters (==, in()) or when they aggregate/join on a specific string-typed column of large-dimension (cardinality of 10M or higher), such as an application_ID, a tenant_ID, or a user_ID.

  • 哈希取模函数用于对数据进行分区。A hash-modulo function is used to partition the data.
  • 属于同一分区的所有同类(已分区)盘区都分配给同一个数据节点。All homogeneous (partitioned) extents that belong to the same partition are assigned to the same data node.
  • 同类(已分区)盘区中的数据按哈希分区键排序。Data in homogeneous (partitioned) extents is ordered by the hash partition key.
    • 如果表上定义了哈希分区键,则不需要在行顺序策略中包含哈希分区键。You don't need to include the hash partition key in the row order policy, if one is defined on the table.
  • 如果查询使用无序策略,并且 joinsummarizemake-series 中使用的 shuffle key 是表的哈希分区键,那么查询的性能将会更好,因为需要跨群集节点移动的数据量显著减少。Queries that use the shuffle strategy, and in which the shuffle key used in join, summarize or make-series is the table's hash partition key, are expected to perform better, because the amount of data required to move across cluster nodes is significantly reduced.

分区属性Partition properties

  • Function 是要使用的哈希取模函数的名称。Function is the name of a hash-modulo function to use.
    • 支持的值:XxHash64Supported value: XxHash64.
  • MaxPartitionCount 是每个时间段要创建的最大分区数(哈希取模函数的取模参数)。MaxPartitionCount is the maximum number of partitions to create (the modulo argument to the hash-modulo function) per time period.
    • 支持的值范围为 (1,1024]Supported values are in the range (1,1024].
      • 该值预计为:The value is expected to be:
        • 大于群集中的节点数Larger than the number of nodes in the cluster
        • 小于列的基数。Smaller than the cardinality of the column.
      • 值越大,群集节点上数据分区过程的开销就越大,每个时间段的盘区数量也就越大。The higher the value is, the greater the overhead of the data partitioning process on the cluster's nodes, and the higher the number of extents for each time period.
      • 建议从值 256 开始。We recommend that you start with a value of 256.
        • 基于以上考虑因素,或者基于查询性能的优势与数据引入后分区的开销,根据需要调整该值。Adjust the value as needed, based on the above considerations, or based on the benefit in query performance vs. the overhead of partitioning the data post-ingestion.
  • Seed 是用于随机化哈希值的值。Seed is the value to use for randomizing the hash value.
    • 该值应为正整数。The value should be a positive integer.
    • 建议的值是 1,如果未指定,这是默认值。The recommended value is 1, which is the default, if unspecified.

示例Example

名为 tenant_idstring 类型列的哈希分区键。A hash partition key over a string-typed column named tenant_id. 它使用 XxHash64 哈希函数,其中 MaxPartitionCount256,默认的 Seed1It uses the XxHash64 hash function, with a MaxPartitionCount of 256, and the default Seed of 1.

{
  "ColumnName": "tenant_id",
  "Kind": "Hash",
  "Properties": {
    "Function": "XxHash64",
    "MaxPartitionCount": 256,
    "Seed": 1
  }
}

统一范围日期/时间分区键Uniform range datetime partition key

当引入到表中的数据不太可能根据该列排序时,可以对表中的 datetime 类型列应用统一范围日期/时间分区键。Applying a uniform range datetime partition key on a datetime-typed column in a table is appropriate when data ingested into the table is unlikely to be ordered according to this column. 在盘区之间重新组织数据,以便每个盘区最终都包括有限时间范围的记录,可能很有帮助。It can be helpful to reshuffle the data between extents so that each extent ends up including records from a limited time range. 重新组织会使 datetime 列上的筛选器在查询时更高效。Reshuffling results in the filters on the datetime column being more efficient at query time.

  • 使用的分区函数是 bin_at(),该函数不可自定义。The partition function used is bin_at() and isn't customizable.

分区属性Partition properties

  • RangeSize 是一个 timespan 标量常量,指示每个日期/时间分区的大小。RangeSize is a timespan scalar constant that indicates the size of each datetime partition. 建议:We recommended that you:
    • 从值 1.00:00:00(一天)开始。Start with the value 1.00:00:00 (one day).
    • 不要设置比这个值更短的值,因为这可能会导致表中有大量无法合并的小盘区。Don't set a value shorter than that, because it may result in the table having a large number of small extents that can't be merged.
  • Reference 是一个 datetime 标量常量,指示一个固定的时间点,根据该时间点调整日期/时间分区。Reference is a datetime scalar constant that indicates a fixed point in time, according to which datetime partitions are aligned.
    • 建议从 1970-01-01 00:00:00 开始。We recommend you start with 1970-01-01 00:00:00.
    • 如果存在日期/时间分区键具有 null 值的记录,则将其分区值设置为 ReferenceIf there are records in which the datetime partition key has null values, their partition value is set to the value of Reference.

示例Example

代码片段在名为 timestampdatetime 类型列上显示了统一日期/时间范围分区键。The code snippet shows a uniform datetime range partition key over a datetime typed column named timestamp. 它使用 datetime(1970-01-01) 作为参考点,每个分区的大小为 1dIt uses datetime(1970-01-01) as its reference point, with a size of 1d for each partition.

{
  "ColumnName": "timestamp",
  "Kind": "UniformRange",
  "Properties": {
    "Reference": "1970-01-01T00:00:00",
    "RangeSize": "1.00:00:00"
  }
}

策略对象The policy object

表的数据分区策略默认为 null,在这种情况下,将不会对表中的数据进行分区。By default, a table's data partitioning policy is null, in which case data in the table won't be partitioned.

数据分区策略具有以下主要属性:The data partitioning policy has the following main properties:

  • PartitionKeysPartitionKeys:

  • EffectiveDateTimeEffectiveDateTime:

    • 策略生效的 UTC 日期/时间。The UTC datetime from which the policy is effective.
    • 此属性是可选的。This property is optional. 如果未指定,则策略将对应用策略后引入的数据生效。If it isn't specified, then the policy will take effect on data ingested after the policy was applied.
    • 分区过程将忽略任何可能因为保留而被删除的任何非同类(非分区)盘区,因为它们的创建时间早于表的有效软删除时间的 90%。Any non-homogeneous (non-partitioned) extents that may be dropped because of retention, are ignored by the partitioning process, because their creation time precedes 90% of the table's effective soft-delete period.

示例Example

具有两个分区键的数据分区策略对象。Data partitioning policy object with two partition keys.

  1. 名为 tenant_idstring 类型列的哈希分区键。A hash partition key over a string-typed column named tenant_id.
    • 它使用 XxHash64 哈希函数,其中 MaxPartitionCount 为 256,默认的 Seed1It uses the XxHash64 hash function, with a MaxPartitionCount of 256, and the default Seed of 1.
  2. 名为 timestampdatetime 类型列的统一日期/时间范围分区键。A uniform datetime range partition key over a datetime type column named timestamp.
    • 它使用 datetime(1970-01-01) 作为参考点,每个分区的大小为 1dIt uses datetime(1970-01-01) as its reference point, with a size of 1d for each partition.
{
  "PartitionKeys": [
    {
      "ColumnName": "tenant_id",
      "Kind": "Hash",
      "Properties": {
        "Function": "XxHash64",
        "MaxPartitionCount": 256,
        "Seed": 1
      }
    },
    {
      "ColumnName": "timestamp",
      "Kind": "UniformRange",
      "Properties": {
        "Reference": "1970-01-01T00:00:00",
        "RangeSize": "1.00:00:00"
      }
    }
  ]
}

其他属性Additional properties

以下属性可以定义为策略的一部分,但是可选的,建议你不要更改它们。The following properties can be defined as part of the policy, but are optional and we recommend that you don't change them.

  • MinRowCountPerOperationMinRowCountPerOperation:

    • 单个数据分区操作的源盘区行数总和的最小目标值。Minimum target for the sum of row count of the source extents of a single data partitioning operation.
    • 此属性是可选的。This property is optional. 默认值为 0Its default value is 0.
  • MaxRowCountPerOperationMaxRowCountPerOperation:

    • 单个数据分区操作的源盘区行数总和的最大目标值。Maximum target for the sum of the row count of the source extents of a single data partitioning operation.
    • 此属性是可选的。This property is optional. 它的默认值是 0,默认目标是 5,000,000 条记录。Its default value is 0, with a default target of 5,000,000 records.
      • 可以考虑设置一个小于 5M 的值,因为分区操作在每个操作中消耗了大量的内存/CPU(请参见 #monitoring)。You can consider setting a value lower than 5M you see that partitioning operations consume a very large amount of memory / CPU, per operation (See #monitoring).

注释Notes

数据分区过程The data partitioning process

  • 数据分区在群集中作为引入后的后台进程运行。Data partitioning runs as a post-ingestion background process in the cluster.
    • 持续引入的表始终具有尚未分区的数据的“尾部”(非同类盘区)。A table that is continuously ingested into is expected to always have a "tail" of data that is yet to be partitioned (non-homogeneous extents).
  • 无论策略中 EffectiveDateTime 属性的值如何,数据分区只在热盘区上运行。Data partitioning runs only on hot extents, regardless of the value of the EffectiveDateTime property in the policy.
    • 如果需要对冷盘区进行分区,则需要临时调整缓存策略If partitioning cold extents is required, you will need to temporarily adjust the caching policy.

监视Monitoring

使用 .show diagnostics 命令监视群集中分区的进度或状态。Use the .show diagnostics command to monitor the progress or state of partitioning in a cluster.

.show diagnostics
| project MinPartitioningPercentageInSingleTable, TableWithMinPartitioningPercentage

输出包括:The output includes:

  • MinPartitioningPercentageInSingleTable:群集中具有数据分区策略的所有表中分区数据的最小百分比。MinPartitioningPercentageInSingleTable: The minimal percentage of partitioned data across all tables that have a data partitioning policy in the cluster.
    • 如果此百分比持续保持低于 90%,则评估群集的分区容量(请参阅容量)。If this percentage remains constantly under 90%, then evaluate the cluster's partitioning capacity (see capacity).
  • TableWithMinPartitioningPercentage:上面显示了分区百分比的表的完全限定名。TableWithMinPartitioningPercentage: The fully qualified name of the table whose partitioning percentage is shown above.

使用 .show 命令监视分区命令及其资源利用率。Use .show commands to monitor the partitioning commands and their resource utilization. 例如:For example:

.show commands
| where StartedOn > ago(1d)
| where CommandType == "ExtentsPartition"
| parse Text with ".partition async table " TableName " extents" *
| summarize count(), sum(TotalCpu), avg(tolong(ResourcesUtilization.MemoryPeak)) by TableName, bin(StartedOn, 15m)
| render timechart with(ysplit = panels)

容量Capacity

  • 数据分区过程会导致创建更多的盘区。The data partitioning process results in the creation of more extents. 群集可以逐渐增加其盘区合并容量,以便合并盘区的过程可以保持。The cluster may gradually increase its Extents merge capacity, so that the process of merging extents can keep up.
  • 如果有较高的引入吞吐量,或者有足够多的表定义了分区策略,则群集可能会逐渐增加其盘区分区容量,以便分区盘区的过程可以保持。If there's a high ingestion throughput, or a large enough number of tables that have a partitioning policy defined, then the cluster may gradually increase its Extents partition capacity, so that the process of partitioning extents can keep up.
  • 为了避免消耗过多资源,对这些动态增加进行了限制。To avoid consuming too many resources, these dynamic increases are capped. 如果它们全部耗尽,则可能需要逐渐线性地增加,以超过上限。You may be required to gradually and linearly increase them beyond the cap, if they're used up entirely.
    • 如果增加容量会显著增加群集资源的使用,则可以手动或通过启用自动缩放来横向/纵向扩展群集If increasing the capacities causes a significant increase in the use of the cluster's resources, you can scale the cluster up/out, either manually, or by enabling autoscale.

已分区列中的离群值Outliers in partitioned columns

  • 如果哈希分区键包含比其他值更普遍的值,例如,空字符串或泛型值(如 nullN/A),或者它们表示在数据集中更普遍的实体(例如 tenant_id),这可能会导致数据在群集节点之间分布不均,并降低查询性能。If a hash partition key includes values that are much more prevalent than others, for example, an empty string, or a generic value (such as null or N/A), or they represent an entity (such as tenant_id) that is more prevalent in the data set, that could contribute to imbalanced distribution of data across the cluster's nodes, and degrade query performance.
  • 如果统一范围日期/时间分区键包含的大部分值与列中的大多数值(例如,过去很久或将来的日期/时间值)“相差甚远”,那么这可能会增加数据分区过程的开销,并导致群集需要跟踪多个小盘区。If a uniform range datetime partition key has a large enough percentage of values that are "far" from the majority of the values in the column, for example, datetime values from the distant past or future, then that could increase the overhead of the data partitioning process, and lead to many small extents that the cluster will need to keep track of.

在这两种情况下,你应该要么“修复”数据,要么在数据引入之前或引入时筛选数据中不相关的记录,以减少群集上数据分区的开销。In both of these cases, you should either "fix" the data, or filter out any irrelevant records in the data before or at ingestion time, to reduce the overhead of the data partitioning on the cluster. 例如,使用更新策略For example, use an update policy).

后续步骤Next steps

使用分区策略控制命令管理表的数据分区策略。Use the partitioning policy control commands to manage data partitioning policies for tables.