Increase throughput performance to Azure SQL Database from Azure Stream Analytics

This article discusses tips to achieve better write throughput performance when you're loading data into Azure SQL Database using Azure Stream Analytics.

SQL output in Azure Stream Analytics supports writing in parallel as an option. This option allows for fully parallel job topologies, where multiple output partitions write to the destination table in parallel. Enabling this option in Azure Stream Analytics, however, may not be sufficient to achieve higher throughput, because it depends significantly on your database configuration and table schema. The choice of indexes, clustering key, index fill factor, and compression all have an impact on the time it takes to load tables. For more information about how to optimize your database to improve query and load performance based on internal benchmarks, see the SQL Database performance guidance. Ordering of writes is not guaranteed when writing in parallel to SQL Database.

Here are some configurations within each service that can help improve the overall throughput of your solution.

Azure Stream Analytics

  • Inherit Partitioning - This SQL output configuration option enables inheriting the partitioning scheme of your previous query step or input. With this enabled, when writing to a disk-based table with a fully parallel topology for your job, expect to see better throughput. This partitioning already happens automatically for many other outputs. Table locking (TABLOCK) is also disabled for bulk inserts made with this option.


When there are more than eight input partitions, inheriting the input partitioning scheme might not be an appropriate choice. This upper limit was observed on a table with a single identity column and a clustered index. In this case, consider using INTO 8 in your query to explicitly specify the number of output writers. Based on your schema and choice of indexes, your observations may vary.
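As a sketch, a Stream Analytics query can repartition its input down to eight output writers with the PARTITION BY ... INTO clause. The input/output aliases and column names below are illustrative assumptions, not part of this article:

```sql
-- Hypothetical Stream Analytics query: repartition the input into 8
-- partitions before the SQL output, so at most 8 output writers run.
-- [EventHubInput] and [SqlOutput] are placeholder alias names.
SELECT
    PartitionId,
    SensorId,
    Reading,
    EventEnqueuedUtcTime
INTO [SqlOutput]
FROM [EventHubInput]
PARTITION BY PartitionId INTO 8
```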

  • Batch Size - The SQL output configuration allows you to specify the maximum batch size in an Azure Stream Analytics SQL output, based on the nature of your destination table and workload. Batch size is the maximum number of records sent with every bulk insert transaction. For clustered columnstore indexes, batch sizes around 100K allow for more parallelization, minimal logging, and locking optimizations. For disk-based tables, 10K (the default) or lower may be optimal for your solution, because higher batch sizes may trigger lock escalation during bulk inserts.

  • Input Message Tuning - If you've already optimized using inherit partitioning and batch size, increasing the number of input events per message per partition helps push your write throughput up further. Input message tuning allows batches within Azure Stream Analytics to grow up to the specified batch size, thereby improving throughput. This can be achieved by using compression or by increasing input message sizes in Event Hubs or Blob storage.

SQL Azure

  • Partitioned Table and Indexes - Using a partitioned SQL table and partitioned indexes on the table, with the same column as your partition key (for example, PartitionId), can significantly reduce contention among partitions during writes. For a partitioned table, you need to create a partition function and a partition scheme on the PRIMARY filegroup. This also increases availability of existing data while new data is being loaded. Depending on the number of partitions, the log IO limit may be hit; it can be raised by upgrading the SKU.
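A minimal T-SQL sketch of this setup might look as follows; the table, column, and object names are assumptions for illustration, with eight partitions matching eight input partitions:

```sql
-- Illustrative sketch: partition on a hypothetical PartitionId column
-- (values 0-7), with everything placed on the PRIMARY filegroup.
CREATE PARTITION FUNCTION pfPartitionId (int)
    AS RANGE LEFT FOR VALUES (0, 1, 2, 3, 4, 5, 6);

CREATE PARTITION SCHEME psPartitionId
    AS PARTITION pfPartitionId ALL TO ([PRIMARY]);

CREATE TABLE dbo.Events
(
    PartitionId int        NOT NULL,
    SensorId    int        NOT NULL,
    Reading     float      NOT NULL,
    EventTime   datetime2  NOT NULL
) ON psPartitionId (PartitionId);

-- Partition-aligned index, created on the same partition scheme.
CREATE CLUSTERED INDEX cix_Events
    ON dbo.Events (PartitionId, EventTime)
    ON psPartitionId (PartitionId);
```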

  • Avoid unique key violations - If you get multiple key violation warning messages in the Azure Stream Analytics activity log, ensure your job isn't affected by unique constraint violations, which are likely to happen during recovery cases. This can be avoided by setting the IGNORE_DUP_KEY option on your indexes.
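As a sketch, the option is set when creating the unique index; the table and column names here are illustrative assumptions:

```sql
-- Illustrative sketch: with IGNORE_DUP_KEY = ON, rows that would
-- violate the unique index are discarded with a warning instead of
-- failing the whole bulk insert (useful when events are replayed
-- during recovery).
CREATE UNIQUE INDEX uix_Events_EventId
    ON dbo.Events (EventId)
    WITH (IGNORE_DUP_KEY = ON);
```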

Azure Data Factory and In-Memory Tables

  • In-Memory Table as temp table - In-memory tables allow for very high speed data loads, but the data needs to fit in memory. Benchmarks show that bulk loading from an in-memory table to a disk-based table is about 10 times faster than directly bulk inserting with a single writer into a disk-based table with an identity column and a clustered index. To take advantage of this bulk insert performance, set up a copy job in Azure Data Factory that copies data from the in-memory table to the disk-based table.
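A minimal sketch of such a staging table, assuming hypothetical names and columns, is a memory-optimized table that the job writes into and that Azure Data Factory later drains into the disk-based table:

```sql
-- Illustrative sketch: a memory-optimized staging table used as the
-- Stream Analytics output target; an Azure Data Factory copy activity
-- then moves rows from here into the final disk-based table.
CREATE TABLE dbo.EventsStaging
(
    EventId   bigint    NOT NULL PRIMARY KEY NONCLUSTERED,
    SensorId  int       NOT NULL,
    Reading   float     NOT NULL,
    EventTime datetime2 NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```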

Avoiding Performance Pitfalls

Bulk inserting data is much faster than loading data with single inserts, because the repeated overhead of transferring the data, parsing the INSERT statement, running the statement, and issuing a transaction record is avoided. Instead, a more efficient path into the storage engine is used to stream the data. However, the setup cost of this path is much higher than that of a single INSERT statement into a disk-based table. The break-even point is typically around 100 rows, beyond which bulk loading is almost always more efficient.

If the rate of incoming events is low, it can easily create batch sizes lower than 100 rows, which makes bulk insert inefficient and uses too much disk space. To work around this limitation, you can take one of these actions:

  • Create an INSTEAD OF trigger to use a simple insert for every row.
  • Use an in-memory temp table as described in the previous section.
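The first option can be sketched as follows; the table and column names are illustrative assumptions (SQL Server does not re-fire an INSTEAD OF trigger for inserts the trigger itself makes into the same table):

```sql
-- Illustrative sketch: an INSTEAD OF trigger that replaces the
-- bulk-load path with a regular insert, avoiding the bulk-insert
-- setup cost when batches are small.
CREATE TRIGGER trg_Events_InsteadOfInsert
ON dbo.Events
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.Events (PartitionId, SensorId, Reading, EventTime)
    SELECT PartitionId, SensorId, Reading, EventTime
    FROM inserted;
END;
```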

Another such scenario occurs when writing into a nonclustered columnstore index (NCCI), where smaller bulk inserts can create too many segments and can crash the index. In this case, the recommendation is to use a clustered columnstore index instead.
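As a sketch, the switch from an NCCI to a clustered columnstore index might look like this; the index and table names are illustrative assumptions:

```sql
-- Illustrative sketch: drop a hypothetical NCCI and replace it with a
-- clustered columnstore index, which tolerates streaming bulk inserts
-- better at these batch sizes.
DROP INDEX IF EXISTS ncci_Events ON dbo.Events;
CREATE CLUSTERED COLUMNSTORE INDEX cci_Events ON dbo.Events;
```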


In summary, with the partitioned output feature in Azure Stream Analytics for SQL output, aligning the parallelization of your job with a partitioned table in SQL Azure should give you significant throughput improvements. Leveraging Azure Data Factory to orchestrate data movement from an in-memory table into disk-based tables can give order-of-magnitude throughput gains. If feasible, improving message density can also be a major factor in improving overall throughput.