SQL 数据仓库中的临时表Temporary tables in SQL Data Warehouse

本文包含使用临时表的基本指导,并重点介绍会话级别临时表的原则。This article contains essential guidance for using temporary tables and highlights the principles of session level temporary tables. 使用本文中的信息可以帮助将代码模块化,从而同时提高代码的可重用性和易维护性。Using the information in this article can help you modularize your code, improving both reusability and ease of maintenance of your code.

什么是临时表?What are temporary tables?

临时表在处理数据时非常有用 - 尤其是在具有暂时性中间结果的转换期间。Temporary tables are useful when processing data - especially during transformation where the intermediate results are transient. 临时表位于 SQL 数据仓库中的会话级别。In SQL Data Warehouse, temporary tables exist at the session level. 它们仅对创建它们的会话可见,并在该会话注销时被自动删除。They are only visible to the session in which they were created and are automatically dropped when that session logs off. 临时表可以提高性能,因为其结果将写入到本地而不是远程存储。Temporary tables offer a performance benefit because their results are written to local rather than remote storage.

创建临时表Create a temporary table

只需为表名添加 # 前缀,即可创建临时表。Temporary tables are created by prefixing your table name with a #. 例如:For example:

CREATE TABLE #stats_ddl
(
    [schema_name]        NVARCHAR(128) NOT NULL
,    [table_name]            NVARCHAR(128) NOT NULL
,    [stats_name]            NVARCHAR(128) NOT NULL
,    [stats_is_filtered]     BIT           NOT NULL
,    [seq_nmbr]              BIGINT        NOT NULL
,    [two_part_name]         NVARCHAR(260) NOT NULL
,    [three_part_name]       NVARCHAR(400) NOT NULL
)
WITH
(
    DISTRIBUTION = HASH([seq_nmbr])
,    HEAP
)

此外可以使用 CTAS 通过完全相同的方法来创建临时表:Temporary tables can also be created with a CTAS using exactly the same approach:

CREATE TABLE #stats_ddl
WITH
(
    DISTRIBUTION = HASH([seq_nmbr])
,    HEAP
)
AS
(
SELECT
        sm.[name]                                                                AS [schema_name]
,        tb.[name]                                                                AS [table_name]
,        st.[name]                                                                AS [stats_name]
,        st.[has_filter]                                                            AS [stats_is_filtered]
,       ROW_NUMBER()
        OVER(ORDER BY (SELECT NULL))                                            AS [seq_nmbr]
,                                 QUOTENAME(sm.[name])+'.'+QUOTENAME(tb.[name])  AS [two_part_name]
,        QUOTENAME(DB_NAME())+'.'+QUOTENAME(sm.[name])+'.'+QUOTENAME(tb.[name])  AS [three_part_name]
FROM    sys.objects            AS ob
JOIN    sys.stats            AS st    ON    ob.[object_id]        = st.[object_id]
JOIN    sys.stats_columns    AS sc    ON    st.[stats_id]        = sc.[stats_id]
                                    AND st.[object_id]        = sc.[object_id]
JOIN    sys.columns            AS co    ON    sc.[column_id]        = co.[column_id]
                                    AND    sc.[object_id]        = co.[object_id]
JOIN    sys.tables            AS tb    ON    co.[object_id]        = tb.[object_id]
JOIN    sys.schemas            AS sm    ON    tb.[schema_id]        = sm.[schema_id]
WHERE    1=1
AND        st.[user_created]   = 1
GROUP BY
        sm.[name]
,        tb.[name]
,        st.[name]
,        st.[filter_definition]
,        st.[has_filter]
)
;

Note

CTAS 是一个强大的命令并具有附加优势,可有效利用事务日志空间。CTAS is a powerful command and has the added advantage of being efficient in its use of transaction log space.

删除临时表Dropping temporary tables

创建新会话时,应不存在任何临时表。When a new session is created, no temporary tables should exist. 但是,如果调用同一存储过程,它将使用相同名称创建临时表,要确保 CREATE TABLE 语句成功执行,可使用带 DROP 的简单预存在检查,如以下示例中所示:However, if you are calling the same stored procedure, which creates a temporary with the same name, to ensure that your CREATE TABLE statements are successful a simple pre-existence check with a DROP can be used as in the following example:

IF OBJECT_ID('tempdb..#stats_ddl') IS NOT NULL
BEGIN
    DROP TABLE #stats_ddl
END

为了实现编码一致性,好的做法是对表和临时表都使用此模式。For coding consistency, it is a good practice to use this pattern for both tables and temporary tables. 当在代码中完成临时表时,使用 DROP TABLE 来删除临时表也很不错。It is also a good idea to use DROP TABLE to remove temporary tables when you have finished with them in your code. 在存储过程开发中,通常将 drop 命令捆绑在过程末尾,以确保清除这些对象。In stored procedure development, it is common to see the drop commands bundled together at the end of a procedure to ensure these objects are cleaned up.

DROP TABLE #stats_ddl

模块化代码Modularizing code

由于可以在用户会话中的任何位置查看临时表,可以利用这一点帮助将应用程序代码模块化。Since temporary tables can be seen anywhere in a user session, this can be exploited to help you modularize your application code. 例如,以下存储过程生成 DDL,按统计信息名称更新数据库中的所有统计信息。For example, the following stored procedure generates DDL to update all statistics in the database by statistic name.

CREATE PROCEDURE    [dbo].[prc_sqldw_update_stats]
(   @update_type    tinyint -- 1 default 2 fullscan 3 sample 4 resample
    ,@sample_pct     tinyint
)
AS

IF @update_type NOT IN (1,2,3,4)
BEGIN;
    THROW 151000,'Invalid value for @update_type parameter. Valid range 1 (default), 2 (fullscan), 3 (sample) or 4 (resample).',1;
END;

IF @sample_pct IS NULL
BEGIN;
    SET @sample_pct = 20;
END;

IF OBJECT_ID('tempdb..#stats_ddl') IS NOT NULL
BEGIN
    DROP TABLE #stats_ddl
END

CREATE TABLE #stats_ddl
WITH
(
    DISTRIBUTION = HASH([seq_nmbr])
)
AS
(
SELECT
        sm.[name]                                                                AS [schema_name]
,        tb.[name]                                                                AS [table_name]
,        st.[name]                                                                AS [stats_name]
,        st.[has_filter]                                                            AS [stats_is_filtered]
,       ROW_NUMBER()
        OVER(ORDER BY (SELECT NULL))                                            AS [seq_nmbr]
,                                 QUOTENAME(sm.[name])+'.'+QUOTENAME(tb.[name])  AS [two_part_name]
,        QUOTENAME(DB_NAME())+'.'+QUOTENAME(sm.[name])+'.'+QUOTENAME(tb.[name])  AS [three_part_name]
FROM    sys.objects            AS ob
JOIN    sys.stats            AS st    ON    ob.[object_id]        = st.[object_id]
JOIN    sys.stats_columns    AS sc    ON    st.[stats_id]        = sc.[stats_id]
                                    AND st.[object_id]        = sc.[object_id]
JOIN    sys.columns            AS co    ON    sc.[column_id]        = co.[column_id]
                                    AND    sc.[object_id]        = co.[object_id]
JOIN    sys.tables            AS tb    ON    co.[object_id]        = tb.[object_id]
JOIN    sys.schemas            AS sm    ON    tb.[schema_id]        = sm.[schema_id]
WHERE    1=1
AND        st.[user_created]   = 1
GROUP BY
        sm.[name]
,        tb.[name]
,        st.[name]
,        st.[filter_definition]
,        st.[has_filter]
)
SELECT
    CASE @update_type
    WHEN 1
    THEN 'UPDATE STATISTICS '+[two_part_name]+'('+[stats_name]+');'
    WHEN 2
    THEN 'UPDATE STATISTICS '+[two_part_name]+'('+[stats_name]+') WITH FULLSCAN;'
    WHEN 3
    THEN 'UPDATE STATISTICS '+[two_part_name]+'('+[stats_name]+') WITH SAMPLE '+CAST(@sample_pct AS VARCHAR(20))+' PERCENT;'
    WHEN 4
    THEN 'UPDATE STATISTICS '+[two_part_name]+'('+[stats_name]+') WITH RESAMPLE;'
    END AS [update_stats_ddl]
,   [seq_nmbr]
FROM    t1
;
GO

在此阶段发生的唯一操作是创建存储过程,该存储过程使用 DDL 语句生成临时表 #stats_ddl。At this stage, the only action that has occurred is the creation of a stored procedure that generates a temporary table, #stats_ddl, with DDL statements. 如果此存储过程在会话中运行了不止一次,它会删除已存在的 #stats_ddl,以确保它不会失败。This stored procedure drops #stats_ddl if it already exists to ensure it does not fail if run more than once within a session. 但是,由于存储过程的末尾没有 DROP TABLE,当存储过程完成后,它将保留创建的表,以便能够在存储过程之外进行读取。However, since there is no DROP TABLE at the end of the stored procedure, when the stored procedure completes, it leaves the created table so that it can be read outside of the stored procedure. 在 SQL 数据仓库中,与其他 SQL Server 数据库不同,有可能在创建临时表的过程外部使用该临时表。In SQL Data Warehouse, unlike other SQL Server databases, it is possible to use the temporary table outside of the procedure that created it. 可以在会话中的 任何位置 使用 SQL 数据仓库临时表。SQL Data Warehouse temporary tables can be used anywhere inside the session. 这样可提高代码的模块化程度与易管理性,如以下示例所示:This can lead to more modular and manageable code as in the following example:

EXEC [dbo].[prc_sqldw_update_stats] @update_type = 1, @sample_pct = NULL;

DECLARE @i INT              = 1
,       @t INT              = (SELECT COUNT(*) FROM #stats_ddl)
,       @s NVARCHAR(4000)   = N''

WHILE @i <= @t
BEGIN
    SET @s=(SELECT update_stats_ddl FROM #stats_ddl WHERE seq_nmbr = @i);

    PRINT @s
    EXEC sp_executesql @s
    SET @i+=1;
END

DROP TABLE #stats_ddl;

临时表的限制Temporary table limitations

SQL 数据仓库在实现临时表时确实会施加一些限制。SQL Data Warehouse does impose a couple of limitations when implementing temporary tables. 目前,仅支持会话范围的临时表。Currently, only session scoped temporary tables are supported. 不支持全局临时表。Global Temporary Tables are not supported. 此外,不能在临时表上创建视图。In addition, views cannot be created on temporary tables. 只能使用哈希分布或轮循机制分布来创建临时表。Temporary tables can only be created with hash or round robin distribution. 不支持复制的临时表分布。Replicated temporary table distribution is not supported.

后续步骤Next steps

若要详细了解如何开发表,请参阅表概述To learn more about developing tables, see the Table Overview.