跨具有不同架构的云数据库进行查询(预览版)Query across cloud databases with different schemas (preview)

跨不同数据库中的表进行查询

垂直分区的数据库在不同的数据库中使用不同的表集。Vertically-partitioned databases use different sets of tables on different databases. 这意味着不同数据库上的架构是不同的。That means that the schema is different on different databases. 例如,清单的所有表都位于一个数据库上,而与会计相关的所有表都位于第二个数据库上。For instance, all tables for inventory are on one database while all accounting-related tables are on a second database.

先决条件Prerequisites

  • 用户必须拥有 ALTER ANY EXTERNAL DATA SOURCE 权限。The user must possess ALTER ANY EXTERNAL DATA SOURCE permission. 此权限包含在 ALTER DATABASE 权限中。This permission is included with the ALTER DATABASE permission.
  • 引用基础数据源需要 ALTER ANY EXTERNAL DATA SOURCE 权限。ALTER ANY EXTERNAL DATA SOURCE permissions are needed to refer to the underlying data source.

概述Overview

备注

与水平分区不同,这些 DDL 语句并不依赖于通过弹性数据库客户端库定义包含分片映射的数据层。Unlike with horizontal partitioning, these DDL statements do not depend on defining a data tier with a shard map through the elastic database client library.

  1. CREATE MASTER KEYCREATE MASTER KEY
  2. CREATE DATABASE SCOPED CREDENTIALCREATE DATABASE SCOPED CREDENTIAL
  3. CREATE EXTERNAL DATA SOURCE(创建外部数据源)CREATE EXTERNAL DATA SOURCE
  4. CREATE EXTERNAL TABLECREATE EXTERNAL TABLE

创建数据库范围的主密钥和凭据Create database scoped master key and credentials

弹性查询使用此凭据连接到远程数据库。The credential is used by the elastic query to connect to your remote databases.

CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'master_key_password';
CREATE DATABASE SCOPED CREDENTIAL <credential_name>  WITH IDENTITY = '<username>',  
SECRET = '<password>'
[;]

备注

确保 <username> 不包含任何“@servername”后缀。Ensure that the <username> does not include any "@servername" suffix.

创建外部数据源Create external data sources

语法:Syntax:

<External_Data_Source> ::=
CREATE EXTERNAL DATA SOURCE <data_source_name> WITH 
           (TYPE = RDBMS,
            LOCATION = '<fully_qualified_server_name>',
            DATABASE_NAME = '<remote_database_name>',  
            CREDENTIAL = <credential_name> 
            ) [;] 

重要

TYPE 参数必须设置为 RDBMSThe TYPE parameter must be set to RDBMS.

示例Example

以下示例说明了如何使用 CREATE 语句创建外部数据源。The following example illustrates the use of the CREATE statement for external data sources.

CREATE EXTERNAL DATA SOURCE RemoteReferenceData 
WITH 
( 
    TYPE=RDBMS, 
    LOCATION='myserver.database.chinacloudapi.cn', 
    DATABASE_NAME='ReferenceData', 
    CREDENTIAL= SqlUser 
); 

若要检索当前外部数据源的列表:To retrieve the list of current external data sources:

select * from sys.external_data_sources; 

外部表External Tables

语法:Syntax:

CREATE EXTERNAL TABLE [ database_name . [ schema_name ] . | schema_name . ] table_name  
( { <column_definition> } [ ,...n ])     
{ WITH ( <rdbms_external_table_options> ) } 
)[;] 

<rdbms_external_table_options> ::= 
  DATA_SOURCE = <External_Data_Source>, 
  [ SCHEMA_NAME = N'nonescaped_schema_name',] 
  [ OBJECT_NAME = N'nonescaped_object_name',] 

示例Example

    CREATE EXTERNAL TABLE [dbo].[customer]( 
        [c_id] int NOT NULL, 
        [c_firstname] nvarchar(256) NULL, 
        [c_lastname] nvarchar(256) NOT NULL, 
        [street] nvarchar(256) NOT NULL, 
        [city] nvarchar(256) NOT NULL, 
        [state] nvarchar(20) NULL, 
        [country] nvarchar(50) NOT NULL, 
    ) 
    WITH 
    ( 
           DATA_SOURCE = RemoteReferenceData 
    ); 

以下示例演示如何从当前数据库中检索外部表的列表:The following example shows how to retrieve the list of external tables from the current database:

select * from sys.external_tables; 

备注Remarks

弹性查询将扩展现有的外部表语法以定义使用 RDBMS 类型的外部数据源的外部表。Elastic query extends the existing external table syntax to define external tables that use external data sources of type RDBMS. 垂直分区的外部表定义涉及以下几个方面:An external table definition for vertical partitioning covers the following aspects:

  • 架构:外部表 DDL 定义了查询可以使用的架构。Schema: The external table DDL defines a schema that your queries can use. 外部表定义中提供的架构需要与存储实际数据的远程数据库中的表的架构相匹配。The schema provided in your external table definition needs to match the schema of the tables in the remote database where the actual data is stored.
  • 远程数据库引用:外部表 DDL 引用外部数据源。Remote database reference: The external table DDL refers to an external data source. 外部数据源指定存储实际表数据的远程数据库的 SQL 数据库服务器名称和数据库名称。The external data source specifies the SQL Database server name and database name of the remote database where the actual table data is stored.

使用上一节中所述的外部数据源时,用于创建外部表的语法如下:Using an external data source as outlined in the previous section, the syntax to create external tables is as follows:

DATA_SOURCE 子句定义用于外部表的外部数据源(即,在垂直分区情况下的远程数据库)。The DATA_SOURCE clause defines the external data source (i.e. the remote database in case of vertical partitioning) that is used for the external table.

使用 SCHEMA_NAME 和 OBJECT_NAME 子句可分别将外部表定义映射到远程数据库上不同架构中的表,或映射到具有不同名称的表。The SCHEMA_NAME and OBJECT_NAME clauses provide the ability to map the external table definition to a table in a different schema on the remote database, or to a table with a different name, respectively. 如果要在远程数据库上为目录视图或 DMV 定义外部表,或者在远程表名已在本地使用的任何其他情况下,这两个子句很有用。This is useful if you want to define an external table to a catalog view or DMV on your remote database - or any other situation where the remote table name is already taken locally.

以下 DDL 语句从本地目录中删除现有的外部表定义。The following DDL statement drops an existing external table definition from the local catalog. 它不会影响远程数据库。It does not impact the remote database.

DROP EXTERNAL TABLE [ [ schema_name ] . | schema_name. ] table_name[;]  

CREATE/DROP EXTERNAL TABLE 的权限:外部表 DDL 需要 ALTER ANY EXTERNAL DATA SOURCE 权限,在引用基础数据源时也需要该权限。Permissions for CREATE/DROP EXTERNAL TABLE: ALTER ANY EXTERNAL DATA SOURCE permissions are needed for external table DDL which is also needed to refer to the underlying data source.

安全注意事项Security considerations

有权访问外部表的用户在使用外部数据源定义中提供的凭据时自动获得对基础远程表的访问权。Users with access to the external table automatically gain access to the underlying remote tables under the credential given in the external data source definition. 应小心管理对外部表的访问权以避免通过外部数据源的凭据意外地提升权限。You should carefully manage access to the external table in order to avoid undesired elevation of privileges through the credential of the external data source. 可以使用常规 SQL 权限来授予或撤消对外部表的访问权,就像它是常规表一样。Regular SQL permissions can be used to GRANT or REVOKE access to an external table just as though it were a regular table.

示例:正在查询垂直分区数据库Example: querying vertically partitioned databases

以下查询执行订单和订单行的两个本地表和客户的远程表之间的三向联接。The following query performs a three-way join between the two local tables for orders and order lines and the remote table for customers. 这是弹性查询的引用数据用例的示例:This is an example of the reference data use case for elastic query:

    SELECT      
     c_id as customer,
     c_lastname as customer_name,
     count(*) as cnt_orderline, 
     max(ol_quantity) as max_quantity,
     avg(ol_amount) as avg_amount,
     min(ol_delivery_d) as min_deliv_date
    FROM customer 
    JOIN orders 
    ON c_id = o_c_id
    JOIN  order_line 
    ON o_id = ol_o_id and o_c_id = ol_c_id
    WHERE c_id = 100

用于远程 T-SQL 执行的存储过程:sp_execute_remoteStored procedure for remote T-SQL execution: sp_execute_remote

弹性查询还引入了一个存储过程,以便提供对远程数据库的直接访问。Elastic query also introduces a stored procedure that provides direct access to the remote database. 该存储过程名为 sp_execute _remote,可用于在远程数据库上执行远程存储过程或 T-SQL 代码。The stored procedure is called sp_execute _remote and can be used to execute remote stored procedures or T-SQL code on the remote database. 它采用了以下参数:It takes the following parameters:

  • 数据源名称 (nvarchar):RDBMS 类型的外部数据源名称。Data source name (nvarchar): The name of the external data source of type RDBMS.
  • 查询 (nvarchar):要在远程数据库上执行的 T-SQL 查询。Query (nvarchar): The T-SQL query to be executed on the remote database.
  • 参数声明 (nvarchar) - 可选:在查询参数(如 sp_executesql)中使用的参数的字符串(包含数据类型定义)。Parameter declaration (nvarchar) - optional: String with data type definitions for the parameters used in the Query parameter (like sp_executesql).
  • 参数值列表 - 可选:以逗号分隔的参数值(如 sp_executesql)的列表。Parameter value list - optional: Comma-separated list of parameter values (like sp_executesql).

sp_execute_remote 使用调用参数中提供的外部数据源在远程数据库上执行给定的 T-SQL 语句。The sp_execute_remote uses the external data source provided in the invocation parameters to execute the given T-SQL statement on the remote database. 它使用外部数据源的凭据连接到远程数据库。It uses the credential of the external data source to connect to the remote database.

示例:Example:

    EXEC sp_execute_remote
        N'MyExtSrc',
        N'select count(w_id) as foo from warehouse' 

工具的连接Connectivity for tools

可以使用常规 SQL Server 连接字符串将 BI 和数据集成工具连接到 SQL 数据库服务器上已启用弹性查询并已定义外部表的数据库。You can use regular SQL Server connection strings to connect your BI and data integration tools to databases on the SQL DB server that has elastic query enabled and external tables defined. 请确保支持将 SQL Server 用作工具的数据源。Make sure that SQL Server is supported as a data source for your tool. 然后可以引用弹性查询数据库及其外部表,就如同使用工具连接的任何其他 SQL Server 数据库一样。Then refer to the elastic query database and its external tables just like any other SQL Server database that you would connect to with your tool.

最佳实践Best practices

  • 确保已通过在 SQL 数据库防火墙配置中启用对 Azure 服务的访问授予弹性查询终结点数据库访问远程数据库的权限。Ensure that the elastic query endpoint database has been given access to the remote database by enabling access for Azure Services in its SQL DB firewall configuration. 另请确保外部数据源定义中提供的凭据可以成功登录到远程数据库并有权访问远程表。Also ensure that the credential provided in the external data source definition can successfully log into the remote database and has the permissions to access the remote table.
  • 弹性查询最适合大部分计算可以在远程数据库上完成的查询。Elastic query works best for queries where most of the computation can be done on the remote databases. 使用可以在远程数据库或联接上求值的选择性筛选器谓词(可以完全在远程数据库上执行),通常可以获得最佳查询性能。You typically get the best query performance with selective filter predicates that can be evaluated on the remote databases or joins that can be performed completely on the remote database. 其他查询模式可能需要从远程数据库加载大量数据并且可能会执行效果不佳。Other query patterns may need to load large amounts of data from the remote database and may perform poorly.

后续步骤Next steps