What is Azure Synapse Analytics (formerly SQL DW)?

Azure Synapse is an analytics service that brings together enterprise data warehousing and big data analytics. It gives you the freedom to query data on your terms, at scale, using either serverless on-demand resources or provisioned resources. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data.

Azure Synapse has four components:

  • Synapse SQL: Complete T-SQL based analytics (generally available)
    • SQL pool (pay per DWU provisioned)
    • SQL on-demand (pay per TB processed) (preview)
  • Spark: Deeply integrated Apache Spark (preview)
  • Synapse Pipelines: Hybrid data integration (preview)
  • Studio: Unified user experience (preview)

Synapse SQL pool in Azure Synapse

Synapse SQL pool refers to the enterprise data warehousing features that are generally available in Azure Synapse.

A SQL pool represents a collection of analytic resources that are provisioned when using Synapse SQL. The size of a SQL pool is determined by Data Warehousing Units (DWU).

Import big data with simple PolyBase T-SQL queries, and then use the power of MPP (massively parallel processing) to run high-performance analytics. As you integrate and analyze data, Synapse SQL pool becomes the single version of truth your business can count on for faster and more robust insights.
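The general idea behind MPP can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Synapse SQL pool is implemented: the number of distributions, the CRC32 hash, and all function names (`distribution_for`, `distribute`, `local_sum`, `mpp_sum`) are hypothetical choices made for the example. Rows are split across distributions by hashing a key column, each distribution is aggregated on its own, and the partial results are merged.

```python
import zlib
from collections import defaultdict

NUM_DISTRIBUTIONS = 4  # hypothetical; kept small for illustration


def distribution_for(key, num_distributions=NUM_DISTRIBUTIONS):
    """Map a key to a distribution with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_distributions


def distribute(rows, key_column, num_distributions=NUM_DISTRIBUTIONS):
    """Split rows into per-distribution lists based on the hash of key_column."""
    shards = [[] for _ in range(num_distributions)]
    for row in rows:
        shards[distribution_for(row[key_column], num_distributions)].append(row)
    return shards


def local_sum(shard, group_column, value_column):
    """Aggregate one shard on its own, as a single compute node would."""
    totals = defaultdict(int)
    for row in shard:
        totals[row[group_column]] += row[value_column]
    return totals


def mpp_sum(rows, group_column, value_column):
    """Sum value_column per group: distribute, aggregate locally, then merge."""
    shards = distribute(rows, group_column)
    # A real MPP engine would run these partial aggregations in parallel;
    # here they run one after another for simplicity.
    partials = [local_sum(shard, group_column, value_column) for shard in shards]
    merged = defaultdict(int)
    for partial in partials:
        for group, total in partial.items():
            merged[group] += total
    return dict(merged)


sales = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 20},
]
print(mpp_sum(sales, "region", "amount"))
```

Because rows with the same key always hash to the same distribution, each group can be summed locally and the merge step only has to combine small partial results.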

Key component of a big data solution

Data warehousing is a key component of a cloud-based, end-to-end big data solution.

Data warehouse solution

In a cloud data solution, data is ingested into big data stores from a variety of sources. Once the data is in a big data store, Hadoop, Spark, and machine learning algorithms prepare it and train models on it. When the data is ready for complex analysis, Synapse SQL pool uses PolyBase to query the big data stores. PolyBase uses standard T-SQL queries to bring the data into Synapse SQL pool tables.

Synapse SQL pool stores data in relational tables with columnar storage. This format significantly reduces data storage costs and improves query performance. Once data is stored, you can run analytics at massive scale. Compared to traditional database systems, analysis queries finish in seconds instead of minutes, or hours instead of days.
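To see why a columnar layout can reduce storage and speed up analytic queries, consider the following minimal Python sketch. It is an illustration only: run-length encoding is used here as one generic column compression technique, and nothing in it describes the actual storage format of Synapse SQL pool. The function names (`to_columns`, `run_length_encode`) and the sample data are hypothetical.

```python
def to_columns(rows):
    """Pivot a list of row dictionaries into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse consecutive repeated values into (value, count) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]


rows = [
    {"country": "DE", "amount": 5},
    {"country": "DE", "amount": 7},
    {"country": "US", "amount": 1},
    {"country": "US", "amount": 2},
    {"country": "US", "amount": 3},
]

columns = to_columns(rows)

# Repeated values sit next to each other in a column, so they compress well.
print(run_length_encode(columns["country"]))  # [('DE', 2), ('US', 3)]

# An aggregate over one column reads only that column, not whole rows.
print(sum(columns["amount"]))  # 18
```

The two effects shown here, better compression of similar adjacent values and scans that touch only the columns a query needs, are the usual reasons columnar formats suit analytic workloads.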

The analysis results can go to worldwide reporting databases or applications. Business analysts can then gain insights to make well-informed business decisions.

Next steps