使用 SQL 数据仓库中的事务Using transactions in SQL Data Warehouse

有关在开发解决方案时实现 Azure SQL 数据仓库中的事务的技巧。Tips for implementing transactions in Azure SQL Data Warehouse for developing solutions.

期望What to expect

如你所料,SQL 数据仓库支持支持事务作为数据仓库工作负荷的一部分。As you would expect, SQL Data Warehouse supports transactions as part of the data warehouse workload. 但是,为了确保 SQL 数据仓库的性能维持在一定的程度,相比于 SQL Server,其某些功能会受到限制。However, to ensure the performance of SQL Data Warehouse is maintained at scale some features are limited when compared to SQL Server. 本文突出两者的差异,并列出其他信息。This article highlights the differences and lists the others.

事务隔离级别Transaction isolation levels

SQL 数据仓库实现 ACID 事务。SQL Data Warehouse implements ACID transactions. 但是,事务支持的隔离级别受限于 READ UNCOMMITTED;此级别不能更改。However, the isolation level of the transactional support is limited to READ UNCOMMITTED; this level cannot be changed. 如果考虑 READ UNCOMMITTED,可以实现许多编码方法,以避免脏读数据。If READ UNCOMMITTED is a concern, you can implement a number of coding methods to prevent dirty reads of data. 大多数流行方法使用 CTAS 和表分区切换(通常称为滑动窗口模式),以防止用户查询仍正准备的数据。The most popular methods use both CTAS and table partition switching (often known as the sliding window pattern) to prevent users from querying data that is still being prepared. 预先筛选数据的视图也是常用的方法。Views that pre-filter the data are also a popular approach.

事务大小Transaction size

单个数据修改事务有大小限制。A single data modification transaction is limited in size. 限制按每个分发进行应用。The limit is applied per distribution. 因此,通过将限制乘以分发数,可得总分配额。Therefore, the total allocation can be calculated by multiplying the limit by the distribution count. 要预计事务中的最大行数,请将分发上限除以每一行的总大小。To approximate the maximum number of rows in the transaction divide the distribution cap by the total size of each row. 对于可变长度列,考虑采用平均的列长度而不使用最大大小。For variable length columns, consider taking an average column length rather than using the maximum size.

下表中进行了以下假设:In the table below the following assumptions have been made:

  • 出现平均数据分布An even distribution of data has occurred
  • 平均行长度为 250 个字节The average row length is 250 bytes

Gen2Gen2

DWUDWU 每个分布的上限 (GB)Cap per distribution (GB) 分布的数量Number of Distributions 最大事务大小 (GB)MAX transaction size (GB) 每个分布的行数# Rows per distribution 每个事务的最大行数Max Rows per transaction
DW100cDW100c 11 6060 6060 4,000,0004,000,000 240,000,000240,000,000
DW200cDW200c 1.51.5 6060 9090 6,000,0006,000,000 360,000,000360,000,000
DW300cDW300c 2.252.25 6060 135135 9,000,0009,000,000 540,000,000540,000,000
DW400cDW400c 33 6060 180180 12,000,00012,000,000 720,000,000720,000,000
DW500cDW500c 3.753.75 6060 225225 15,000,00015,000,000 900,000,000900,000,000
DW1000cDW1000c 7.57.5 6060 450450 30,000,00030,000,000 1,800,000,0001,800,000,000
DW1500cDW1500c 11.2511.25 6060 675675 45,000,00045,000,000 2,700,000,0002,700,000,000
DW2000cDW2000c 1515 6060 900900 60,000,00060,000,000 3,600,000,0003,600,000,000
DW2500cDW2500c 18.7518.75 6060 11251125 75,000,00075,000,000 4,500,000,0004,500,000,000
DW3000cDW3000c 22.522.5 6060 1,3501,350 90,000,00090,000,000 5,400,000,0005,400,000,000
DW5000cDW5000c 37.537.5 6060 2,2502,250 150,000,000150,000,000 9,000,000,0009,000,000,000
DW6000cDW6000c 4545 6060 2,7002,700 180,000,000180,000,000 10,800,000,00010,800,000,000
DW7500cDW7500c 56.2556.25 6060 3,3753,375 225,000,000225,000,000 13,500,000,00013,500,000,000
DW10000cDW10000c 7575 6060 4,5004,500 300,000,000300,000,000 18,000,000,00018,000,000,000
DW15000cDW15000c 112.5112.5 6060 6,7506,750 450,000,000450,000,000 27,000,000,00027,000,000,000
DW30000cDW30000c 225225 6060 13,50013,500 900,000,000900,000,000 54,000,000,00054,000,000,000

Gen1Gen1

DWUDWU 每个分布的上限 (GB)Cap per distribution (GB) 分布的数量Number of Distributions 最大事务大小 (GB)MAX transaction size (GB) 每个分布的行数# Rows per distribution 每个事务的最大行数Max Rows per transaction
DW100DW100 11 6060 6060 4,000,0004,000,000 240,000,000240,000,000
DW200DW200 1.51.5 6060 9090 6,000,0006,000,000 360,000,000360,000,000
DW300DW300 2.252.25 6060 135135 9,000,0009,000,000 540,000,000540,000,000
DW400DW400 33 6060 180180 12,000,00012,000,000 720,000,000720,000,000
DW500DW500 3.753.75 6060 225225 15,000,00015,000,000 900,000,000900,000,000
DW600DW600 4.54.5 6060 270270 18,000,00018,000,000 1,080,000,0001,080,000,000
DW1000DW1000 7.57.5 6060 450450 30,000,00030,000,000 1,800,000,0001,800,000,000
DW1200DW1200 99 6060 540540 36,000,00036,000,000 2,160,000,0002,160,000,000
DW1500DW1500 11.2511.25 6060 675675 45,000,00045,000,000 2,700,000,0002,700,000,000
DW2000DW2000 1515 6060 900900 60,000,00060,000,000 3,600,000,0003,600,000,000
DW3000DW3000 22.522.5 6060 1,3501,350 90,000,00090,000,000 5,400,000,0005,400,000,000
DW6000DW6000 4545 6060 2,7002,700 180,000,000180,000,000 10,800,000,00010,800,000,000

事务大小限制按每个事务或操作进行应用。The transaction size limit is applied per transaction or operation. 不会跨所有当前事务进行应用。It is not applied across all concurrent transactions. 因此,允许每个事务向日志写入此数量的数据。Therefore each transaction is permitted to write this amount of data to the log.

为优化和最大程度减少写入到日志中的数据量,请参阅事务最佳做法一文。To optimize and minimize the amount of data written to the log, please refer to the Transactions best practices article.

Warning

最大事务大小仅可在哈希或者 ROUND_ROBIN 分布式表(其中数据均匀分布)中实现。The maximum transaction size can only be achieved for HASH or ROUND_ROBIN distributed tables where the spread of the data is even. 如果事务以偏斜方式向分布写入数据,那么更有可能在达到最大事务大小之前达到该限制。If the transaction is writing data in a skewed fashion to the distributions then the limit is likely to be reached prior to the maximum transaction size.

事务状态Transaction state

SQL 数据仓库使用 XACT_STATE() 函数(采用值 -2)来报告失败的事务。SQL Data Warehouse uses the XACT_STATE() function to report a failed transaction using the value -2. 此值表示事务已失败并标记为仅可回滚。This value means the transaction has failed and is marked for rollback only.

Note

XACT_STATE 函数使用 -2 表示失败的事务,以代表 SQL Server 中不同的行为。The use of -2 by the XACT_STATE function to denote a failed transaction represents different behavior to SQL Server. SQL Server 使用值 -1 来代表无法提交的事务。SQL Server uses the value -1 to represent an uncommittable transaction. SQL Server 可以容忍事务内的某些错误,而无需将其标记为无法提交。SQL Server can tolerate some errors inside a transaction without it having to be marked as uncommittable. 例如,SELECT 1/0 导致错误,但不强制事务进入无法提交状态。For example SELECT 1/0 would cause an error but not force a transaction into an uncommittable state. SQL Server 还允许读取无法提交的事务。SQL Server also permits reads in the uncommittable transaction. 但是,SQL 数据仓库不允许执行此操作。However, SQL Data Warehouse does not let you do this. 如果 SQL 数据仓库事务内部发生错误,它自动进入 -2 状态,并且在该语句回退之前,您无法执行任何 Select 语句。If an error occurs inside a SQL Data Warehouse transaction it will automatically enter the -2 state and you will not be able to make any further select statements until the statement has been rolled back. 因此,必须查看应用程序代码是否使用 XACT_STATE(),你可能需要修改代码。It is therefore important to check that your application code to see if it uses XACT_STATE() as you may need to make code modifications.

例如,在 SQL Server 中,可能会看到如下所示的事务:For example, in SQL Server you might see a transaction that looks like the following:

SET NOCOUNT ON;
DECLARE @xact_state smallint = 0;

BEGIN TRAN
    BEGIN TRY
        DECLARE @i INT;
        SET     @i = CONVERT(INT,'ABC');
    END TRY
    BEGIN CATCH
        SET @xact_state = XACT_STATE();

        SELECT  ERROR_NUMBER()    AS ErrNumber
        ,       ERROR_SEVERITY()  AS ErrSeverity
        ,       ERROR_STATE()     AS ErrState
        ,       ERROR_PROCEDURE() AS ErrProcedure
        ,       ERROR_MESSAGE()   AS ErrMessage
        ;

        IF @@TRANCOUNT > 0
        BEGIN
            ROLLBACK TRAN;
            PRINT 'ROLLBACK';
        END

    END CATCH;

IF @@TRANCOUNT >0
BEGIN
    PRINT 'COMMIT';
    COMMIT TRAN;
END

SELECT @xact_state AS TransactionState;

前面的代码提供以下错误消息:The preceding code gives the following error message:

Msg 111233, Level 16, State 1, Line 1 111233;当前事务已中止,所有挂起的更改都已回退。Msg 111233, Level 16, State 1, Line 1 111233; The current transaction has aborted, and any pending changes have been rolled back. 原因:仅回退状态的事务未在 DDL、DML 或 SELECT 语句之前显式回退。Cause: A transaction in a rollback-only state was not explicitly rolled back before a DDL, DML, or SELECT statement.

不会获得 ERROR_* 函数的输出值。You won't get the output of the ERROR_* functions.

在 SQL 数据仓库中,该代码需要稍做更改:In SQL Data Warehouse the code needs to be slightly altered:

SET NOCOUNT ON;
DECLARE @xact_state smallint = 0;

BEGIN TRAN
    BEGIN TRY
        DECLARE @i INT;
        SET     @i = CONVERT(INT,'ABC');
    END TRY
    BEGIN CATCH
        SET @xact_state = XACT_STATE();

        IF @@TRANCOUNT > 0
        BEGIN
            PRINT 'ROLLBACK';
            ROLLBACK TRAN;
        END

        SELECT  ERROR_NUMBER()    AS ErrNumber
        ,       ERROR_SEVERITY()  AS ErrSeverity
        ,       ERROR_STATE()     AS ErrState
        ,       ERROR_PROCEDURE() AS ErrProcedure
        ,       ERROR_MESSAGE()   AS ErrMessage
        ;
    END CATCH;

IF @@TRANCOUNT >0
BEGIN
    PRINT 'COMMIT';
    COMMIT TRAN;
END

SELECT @xact_state AS TransactionState;

现在观察到了预期行为。The expected behavior is now observed. 事务中的错误得到了管理,并且 ERROR_* 函数提供了预期值。The error in the transaction is managed and the ERROR_* functions provide values as expected.

所做的一切改变是事务的 ROLLBACK 必须发生于在 CATCH 块中读取错误信息之前。All that has changed is that the ROLLBACK of the transaction had to happen before the read of the error information in the CATCH block.

Error_Line() 函数Error_Line() function

另外值得注意的是,SQL 数据仓库不实现或支持 ERROR_LINE() 函数。It is also worth noting that SQL Data Warehouse does not implement or support the ERROR_LINE() function. 如果代码中包含此函数,需要将它删除才能符合 SQL 数据仓库的要求。If you have this in your code, you need to remove it to be compliant with SQL Data Warehouse. 请在代码中使用查询标签,而不是实现等效的功能。Use query labels in your code instead to implement equivalent functionality. 有关详细信息,请参阅 LABEL 一文。For more details, see the LABEL article.

使用 THROW 和 RAISERRORUsing THROW and RAISERROR

THROW 是在 SQL 数据仓库中引发异常的新式做法,但也支持 RAISERROR。THROW is the more modern implementation for raising exceptions in SQL Data Warehouse but RAISERROR is also supported. 不过,有些值得注意的差异。There are a few differences that are worth paying attention to however.

  • 对于 THROW,用户定义的错误消息数目不能在 100,000 - 150,000 的范围内User-defined error messages numbers cannot be in the 100,000 - 150,000 range for THROW
  • RAISERROR 错误消息固定为 50,000RAISERROR error messages are fixed at 50,000
  • 不支持 sys.messagesUse of sys.messages is not supported

限制Limitations

SQL 数据仓库有一些与事务相关的其他限制。SQL Data Warehouse does have a few other restrictions that relate to transactions.

这些限制如下:They are as follows:

  • 无分布式事务No distributed transactions
  • 不允许嵌套事务No nested transactions permitted
  • 不允许保存点No save points allowed
  • 无已命名事务No named transactions
  • 无已标记事务No marked transactions
  • 不支持 DDL,例如用户定义的事务内的 CREATE TABLENo support for DDL such as CREATE TABLE inside a user-defined transaction

后续步骤Next steps

若要了解有关优化事务的详细信息,请参阅事务最佳做法To learn more about optimizing transactions, see Transactions best practices. 若要了解有关其他 SQL 数据仓库最佳做法的信息,请参阅 SQL 数据仓库最佳做法To learn about other SQL Data Warehouse best practices, see SQL Data Warehouse best practices.