Azure SQL 数据库弹性查询概述(预览版)Azure SQL Database elastic query overview (preview)

适用于:是Azure SQL 数据库 APPLIES TO: yesAzure SQL Database

使用弹性查询功能(处于预览状态)可以跨 Azure SQL 数据库中的多个数据库运行 Transact-SQL 查询。The elastic query feature (in preview) enables you to run a Transact-SQL query that spans multiple databases in Azure SQL Database. 它允许执行跨数据库查询以访问远程表,以及连接 Microsoft 和第三方工具(Excel、Power BI、Tableau 等)以跨多个数据库的数据层进行查询。It allows you to perform cross-database queries to access remote tables, and to connect Microsoft and third-party tools (Excel, Power BI, Tableau, etc.) to query across data tiers with multiple databases. 使用此功能,可以将查询横向扩展到较大数据层,并直观显示商业智能 (BI) 报表中的结果。Using this feature, you can scale out queries to large data tiers and visualize the results in business intelligence (BI) reports.

为什么要使用弹性查询?Why use elastic queries

Azure SQL 数据库Azure SQL Database

跨 Azure SQL 数据库中的数据库进行查询的操作完全在 T-SQL 中执行。Query across databases in Azure SQL Database completely in T-SQL. 这允许对远程数据库进行只读查询,并为当前的 SQL Server 客户提供将使用由三部分和四部分组成的名称或链接服务器的应用程序迁移到 SQL 数据库的选项。This allows for read-only querying of remote databases and provides an option for current SQL Server customers to migrate applications using three- and four-part names or linked server to SQL Database.

在标准层上可用Available on standard tier

“标准”和“高级”服务层级均支持弹性查询。Elastic query is supported on both the Standard and Premium service tiers. 有关较低服务层级的性能限制,请参阅下面有关预览版限制的部分。See the section on Preview Limitations below on performance limitations for lower service tiers.

将参数推送到远程数据库Push parameters to remote databases

弹性查询现在可以将 SQL 参数推送到远程数据库执行。Elastic queries can now push SQL parameters to the remote databases for execution.

存储过程执行Stored procedure execution

使用 sp_execute _remote 执行远程存储过程调用或远程函数。Execute remote stored procedure calls or remote functions using sp_execute _remote.

灵活性Flexibility

使用弹性查询的外部表可以引用具有不同架构或表名的远程表。External tables with elastic query can refer to remote tables with a different schema or table name.

弹性查询方案Elastic query scenarios

目标是便于实现多个数据库参与生成单个总体结果中的行的查询方案。The goal is to facilitate querying scenarios where multiple databases contribute rows into a single overall result. 查询可以由用户或应用程序直接编写,也可以通过与数据库连接的工具间接编写。The query can either be composed by the user or application directly, or indirectly through tools that are connected to the database. 这在使用商业 BI 或数据集成工具或任何不能更改的应用程序创建报表时特别有用。This is especially useful when creating reports, using commercial BI or data integration tools, or any application that cannot be changed. 使用弹性查询,可以利用 Excel、Power BI、Tableau 或 Cognos 等工具中熟悉的 SQL Server 连接体验跨多个数据库进行查询。With an elastic query, you can query across several databases using the familiar SQL Server connectivity experience in tools such as Excel, Power BI, Tableau, or Cognos. 弹性查询可让你通过 SQL Server Management Studio 或 Visual Studio 发出的查询轻松访问整个数据库集合,并便于实现从实体框架或其他 ORM 环境跨数据库查询。An elastic query allows easy access to an entire collection of databases through queries issued by SQL Server Management Studio or Visual Studio, and facilitates cross-database querying from Entity Framework or other ORM environments. 图 1 显示了这样一个方案:其中使用弹性数据库客户端库的现有云应用程序在扩大的数据层的基础上构建,并且弹性查询用于跨数据库报告。Figure 1 shows a scenario where an existing cloud application (which uses the elastic database client library) builds on a scaled-out data tier, and an elastic query is used for cross-database reporting.

图 1 在扩大的数据层上使用的弹性查询Figure 1 Elastic query used on scaled-out data tier

在扩大的数据层上使用的弹性查询

弹性查询的客户方案的特征包括以下拓扑:Customer scenarios for elastic query are characterized by the following topologies:

  • 垂直分区 - 跨数据库查询(拓扑 1):数据在数据层中的多个数据库之间垂直分区。Vertical partitioning - Cross-database queries (Topology 1): The data is partitioned vertically between a number of databases in a data tier. 通常,不同的表集驻留在不同的数据库上。Typically, different sets of tables reside on different databases. 这意味着不同数据库上的架构是不同的。That means that the schema is different on different databases. 例如,清单的所有表都位于一个数据库上,而与会计相关的所有表都位于第二个数据库上。For instance, all tables for inventory are on one database while all accounting-related tables are on a second database. 采用此拓扑的常见使用案例需要使用一个查询跨多个数据库中的表进行查询或编译报表。Common use cases with this topology require one to query across or to compile reports across tables in several databases.
  • 水平分区 - 分片(拓扑 2):将数据进行水平分区以将行分布到扩大的数据层上。Horizontal Partitioning - Sharding (Topology 2): Data is partitioned horizontally to distribute rows across a scaled out data tier. 使用此方法时,所有参与数据库中的架构是相同的。With this approach, the schema is identical on all participating databases. 此方法也称为“分片”。This approach is also called "sharding". 可以使用 (1) 弹性数据库客户端库或 (2) 自我分片来执行和管理分片。Sharding can be performed and managed using (1) the elastic database tools libraries or (2) self-sharding. 可使用弹性查询跨多个分片查询或编译报表。An elastic query is used to query or compile reports across many shards. 分片通常是弹性池中的数据库。Shards are typically databases within an elastic pool. 可将弹性查询视为一次性查询弹性池的所有数据库的高效方式,前提是数据库共享通用架构。You can think of elastic query as an efficient way for querying all databases of elastic pool at once, as long as databases share the common schema.

备注

弹性查询最适用于可以在外部源端执行大多数处理(筛选、聚合)的报告方案。Elastic query works best for reporting scenarios where most of the processing (filtering, aggregation) can be performed on the external source side. 它不适用于从远程数据库传输大量数据的 ETL 操作。It is not suitable for ETL operations where large amount of data is being transferred from remote database(s). 对于使用更复杂查询的大量报表工作负荷或数据仓库方案,还可以考虑使用 Azure Synapse AnalyticsFor heavy reporting workloads or data warehousing scenarios with more complex queries, also consider using Azure Synapse Analytics.

垂直分区 - 跨数据库查询Vertical partitioning - cross-database queries

若要开始编写代码,请参阅跨数据库查询(垂直分区)入门To begin coding, see Getting started with cross-database query (vertical partitioning).

弹性查询可用于使位于 SQL 数据库的数据库中的数据供 SQL 数据库中的其他数据库使用。An elastic query can be used to make data located in a database in SQL Database available to other databases in SQL Database. 这允许从一个数据库的查询引用 SQL 数据库中任何其他远程数据库中的表。This allows queries from one database to refer to tables in any other remote database in SQL Database. 第一步是定义每个远程数据库的外部数据源。The first step is to define an external data source for each remote database. 外部数据源在要从中获取对远程数据库中的表的访问权限的本地数据库中定义。The external data source is defined in the local database from which you want to gain access to tables located on the remote database. 在远程数据库上不必进行任何更改。No changes are necessary on the remote database. 对于不同数据库具有不同架构的典型垂直分区方案,可以使用弹性查询来实现常见使用案例(例如,访问引用数据和跨数据库查询)。For typical vertical partitioning scenarios where different databases have different schemas, elastic queries can be used to implement common use cases such as access to reference data and cross-database querying.

重要

必须拥有 ALTER ANY EXTERNAL DATA SOURCE 权限。You must possess ALTER ANY EXTERNAL DATA SOURCE permission. 此权限包含在 ALTER DATABASE 权限中。This permission is included with the ALTER DATABASE permission. 引用基础数据源需要 ALTER ANY EXTERNAL DATA SOURCE 权限。ALTER ANY EXTERNAL DATA SOURCE permissions are needed to refer to the underlying data source.

引用数据:拓扑用于引用数据管理。Reference data: The topology is used for reference data management. 在下图中,包含引用数据的两个表(T1 和 T2)保存在专用数据库中。In the figure below, two tables (T1 and T2) with reference data are kept on a dedicated database. 使用弹性查询,现在可以从其他数据库远程访问表 T1 和 T2,如图中所示。Using an elastic query, you can now access tables T1 and T2 remotely from other databases, as shown in the figure. 如果引用表较小或对引用表的远程查询包含选择性谓词,请使用拓扑 1。Use topology 1 if reference tables are small or remote queries into reference table have selective predicates.

图 2 垂直分区 - 使用弹性查询来查询引用数据Figure 2 Vertical partitioning - Using elastic query to query reference data

垂直分区 - 使用弹性查询来查询引用数据

跨数据库查询:弹性查询支持需要在 SQL 数据库中跨多个数据库进行查询的用例。Cross-database querying: Elastic queries enable use cases that require querying across several databases in SQL Database. 图 3 显示了四个不同的数据库:CRM、库存、人力资源和产品。Figure 3 shows four different databases: CRM, Inventory, HR, and Products. 在其中一个数据库中执行的查询还需要访问一个或所有其他数据库。Queries performed in one of the databases also need access to one or all the other databases. 使用弹性查询时,可以通过在这四个数据库的每个数据库上运行几个简单的 DDL 语句来为此案例配置数据库。Using an elastic query, you can configure your database for this case by running a few simple DDL statements on each of the four databases. 经过此一次性配置后,对远程表的访问就像从 T-SQL 查询或 BI 工具引用本地表一样简单。After this one-time configuration, access to a remote table is as simple as referring to a local table from your T-SQL queries or from your BI tools. 如果远程查询不返回大型结果,建议使用此方法。This approach is recommended if the remote queries do not return large results.

图 3 垂直分区 - 使用弹性查询来跨多个数据库查询Figure 3 Vertical partitioning - Using elastic query to query across various databases

垂直分区 - 使用弹性查询来跨多个数据库查询

通过以下步骤,为垂直分区方案(需要访问 SQL 数据库中具有相同架构的远程数据库中的表)配置弹性数据库查询:The following steps configure elastic database queries for vertical partitioning scenarios that require access to a table located on remote databases in SQL Database with the same schema:

运行 DDL 语句后,可以访问远程表“mytable”,就像它是本地表一样。After running the DDL statements, you can access the remote table "mytable" as though it were a local table. Azure SQL 数据库会自动打开与远程数据库的连接,处理远程数据库上的请求并返回结果。Azure SQL Database automatically opens a connection to the remote database, processes your request on the remote database, and returns the results.

水平分区 - 分片Horizontal partitioning - sharding

使用弹性查询对分片(即,水平分区)的数据层执行报表任务需要弹性数据库分片映射来表示数据层的数据库。Using elastic query to perform reporting tasks over a sharded, that is, horizontally partitioned, data tier requires an elastic database shard map to represent the databases of the data tier. 通常情况下,仅在此方案中使用单个分片映射,并且具有弹性查询功能的专用数据库(头节点)将作为报表查询的入口点。Typically, only a single shard map is used in this scenario and a dedicated database with elastic query capabilities (head node) serves as the entry point for reporting queries. 只有此专用数据库需要访问分片映射。Only this dedicated database needs access to the shard map. 图 4 说明了此拓扑及其使用弹性查询数据库和分片映射的配置。Figure 4 illustrates this topology and its configuration with the elastic query database and shard map. 有关弹性数据库客户端库和创建分片映射的详细信息,请参阅分片映射管理For more information about the elastic database client library and creating shard maps, see Shard map management.

图 4 水平分区 - 使用弹性查询实现分片数据层上的报告Figure 4 Horizontal partitioning - Using elastic query for reporting over sharded data tiers

水平分区 - 使用弹性查询实现分片数据层上的报告

备注

弹性查询数据库(头节点)可以是单独的数据库,也可以是承载分片映射的相同数据库。Elastic Query Database (head node) can be separate database, or it can be the same database that hosts the shard map. 无论选择何种配置,请确保该数据库的服务层级和计算大小足够高,以便处理预期的登录/查询请求数。Whatever configuration you choose, make sure that service tier and compute size of that database is high enough to handle the expected amount of login/query requests.

通过以下步骤,为水平分区方案(需要访问通常位于 SQL 数据库中多个远程数据库上的一组表)配置弹性数据库查询:The following steps configure elastic database queries for horizontal partitioning scenarios that require access to a set of tables located on (typically) several remote databases in SQL Database:

执行这些步骤后,便可以访问水平分区的表“mytable”,就像它是本地表一样。Once you have performed these steps, you can access the horizontally partitioned table "mytable" as though it were a local table. Azure SQL 数据库自动打开与远程数据库(在其中表以物理方式存储)的多个并行连接,处理远程数据库上的请求并返回结果。Azure SQL Database automatically opens multiple parallel connections to the remote databases where the tables are physically stored, processes the requests on the remote databases, and returns the results. 有关水平分区方案所需的步骤的详细信息,可以在水平分区的弹性查询中找到。More information on the steps required for the horizontal partitioning scenario can be found in elastic query for horizontal partitioning.

若要开始编写代码,请参阅弹性查询入门 - 水平分区(分片)To begin coding, see Getting started with elastic query for horizontal partitioning (sharding).

重要

能否对大量的数据库成功执行弹性查询,在很大程度上取决于执行查询期间每个数据库的可用性。Successful execution of elastic query over a large set of databases relies heavily on the availability of each of databases during the query execution. 如果其中的某个数据库不可用,整个查询将会失败。If one of databases is not available, entire query will fail. 如果你打算一次性查询数百甚至数千个数据库,请确保客户端应用程序中嵌入了重试逻辑,或者考虑利用弹性数据库作业(预览版)并查询一小部分数据库,然后将每个查询的结果合并到一个目标中。If you plan to query hundreds or thousands of databases at once, make sure your client application has retry logic embedded, or consider leveraging Elastic Database Jobs (preview) and querying smaller subsets of databases, consolidating results of each query into a single destination.

T-SQL 查询T-SQL querying

定义外部数据源和外部表后,便可以使用常规 SQL Server 连接字符串连接到在其中定义了外部表的数据库。Once you have defined your external data sources and your external tables, you can use regular SQL Server connection strings to connect to the databases where you defined your external tables. 然后可以通过该连接对外部表运行 T-SQL 语句,但具有下述限制。You can then run T-SQL statements over your external tables on that connection with the limitations outlined below. 可以在水平分区垂直分区的文档主题中找到 T-SQL 查询的详细信息和示例。You can find more information and examples of T-SQL queries in the documentation topics for horizontal partitioning and vertical partitioning.

工具的连接Connectivity for tools

可以使用常规 SQL Server 连接字符串将应用程序和 BI 或数据集成工具连接到具有外部表的数据库。You can use regular SQL Server connection strings to connect your applications and BI or data integration tools to databases that have external tables. 请确保支持将 SQL Server 用作工具的数据源。Make sure that SQL Server is supported as a data source for your tool. 连接后,可以引用弹性查询数据库和该数据库中的外部表,就像你会对使用工具连接的任何其他 SQL Server 数据库所做的一样。Once connected, refer to the elastic query database and the external tables in that database just like you would do with any other SQL Server database that you connect to with your tool.

重要

当前不支持使用 Azure Active Directory 通过弹性查询进行身份验证。Authentication using Azure Active Directory with elastic queries is not currently supported.

成本Cost

Azure SQL 数据库的成本中涵盖了弹性查询。Elastic query is included in the cost of Azure SQL Database. 请注意,支持远程数据库与弹性查询终结点在不同的数据中心的拓扑,但通常按 Azure 费率对远程数据库的数据流出量进行收费。Note that topologies where your remote databases are in a different data center than the elastic query endpoint are supported, but data egress from remote databases is charged regularly Azure rates.

预览版限制Preview limitations

  • 在“标准”服务层级上运行第一个弹性查询可能需要长达几分钟时间。Running your first elastic query can take up to a few minutes on the Standard service tier. 此时间是加载弹性查询功能所必需的;使用更高的服务层级和计算大小可提高加载性能。This time is necessary to load the elastic query functionality; loading performance improves with higher service tiers and compute sizes.
  • 尚不支持从 SSMS 或 SSDT 对外部数据源或外部表进行脚本编写。Scripting of external data sources or external tables from SSMS or SSDT is not yet supported.
  • SQL 数据库的导入/导出尚不支持外部数据源和外部表。Import/Export for SQL Database does not yet support external data sources and external tables. 如果需要使用导入/导出,请在导出之前删除这些对象,并在导入后重新创建它们。If you need to use Import/Export, drop these objects before exporting and then re-create them after importing.
  • 弹性查询当前仅支持对外部表进行只读访问。Elastic query currently only supports read-only access to external tables. 但是,可以对在其中定义了外部表的数据库使用完整的 T-SQL 功能。You can, however, use full T-SQL functionality on the database where the external table is defined. 这可能对以下操作很有用:例如,使用(例如)SELECT <column_list> INTO <local_table>持久保存临时结果,或在引用外部表的弹性查询数据库中定义存储过程。This can be useful to, e.g., persist temporary results using, for example, SELECT <column_list> INTO <local_table>, or to define stored procedures on the elastic query database that refer to external tables.
  • 除 nvarchar(max) 外,外部表定义不支持 LOB 类型(包括空间类型)。Except for nvarchar(max), LOB types (including spatial types) are not supported in external table definitions. 一种解决方法是,可以在远程数据库上创建将 LOB 类型强制转换为 nvarchar(max) 的视图,通过该视图(而不是基表)定义外部表,然后在查询中将其强制转换回原始 LOB 类型。As a workaround, you can create a view on the remote database that casts the LOB type into nvarchar(max), define your external table over the view instead of the base table and then cast it back into the original LOB type in your queries.
  • 结果集中的 nvarchar(max) 数据类型列禁用弹性查询实现中使用的高级批处理技术,并且在大量非聚合数据正在作为查询的结果进行传输的非规范用例中可能会影响一个数量级甚至两个数量级的查询性能。Columns of nvarchar(max) data type in result set disable advanced batching technics used in Elastic Query implementation and may affect performance of query for an order of magnitude, or even two orders of magnitude in non-canonical use cases where large amount of non-aggregated data is being transferred as a result of query.
  • 当前不支持外部表上的列统计信息。Column statistics over external tables are currently not supported. 支持表统计信息,但需要手动创建。Table statistics are supported, but need to be created manually.
  • 弹性查询仅适用于 Azure SQL 数据库。Elastic query works with Azure SQL Database only. 无法用它查询 SQL Server 实例。You cannot use it for querying a SQL Server instance.

后续步骤Next steps