What's new in Azure Synapse Analytics?

This page is continuously updated with recent changes to Azure Synapse Analytics and with the features that are currently in preview.

For older updates, review past Azure Synapse Analytics Blog posts or previous updates in Azure Synapse Analytics.

Features currently in preview

The following table lists the features of Azure Synapse Analytics that are currently in preview. Preview features are sorted alphabetically.

Feature	Learn more
Apache Spark Delta Lake tables in serverless SQL pools	The ability for serverless SQL pools to access Delta Lake tables created in Spark databases is in preview. For more information, see Azure Synapse Analytics shared metadata tables.
Apache Spark elastic pool storage	Azure Synapse Analytics Spark pools now support elastic pool storage in preview. Elastic pool storage allows the Spark engine to monitor worker node temporary storage and attach more disks if needed. No action is required, and you should see fewer job failures as a result. For more information, see Azure Synapse Analytics Spark elastic pool storage.
Apache Spark R language support	Built-in R support for Apache Spark is now in preview.
Browse ADLS Gen2 folders in the Azure Synapse Analytics workspace	You can now browse an Azure Data Lake Storage Gen2 (ADLS Gen2) container or folder in your Azure Synapse Analytics workspace in Synapse Studio. To learn more, see Browse an ADLS Gen2 folder with ACLs in Azure Synapse Analytics.
Capture changed data from Cosmos DB analytical store	Azure Cosmos DB analytical store now supports change data capture (CDC) for Azure Cosmos DB API for NoSQL and Azure Cosmos DB API for MongoDB. For more information, see Capture Changed Data from your Cosmos DB analytical store and DevBlog: Change Data Capture (CDC) with Azure Cosmos DB analytical store.
Distribution Advisor	The Distribution Advisor is a new preview feature in Azure Synapse dedicated SQL pools Gen2 that analyzes queries and recommends the best distribution strategies for tables to improve query performance. For more information, see Distribution Advisor in Azure Synapse SQL.
Reject options for delimited text files	Reject options for CREATE EXTERNAL TABLE on delimited files are in preview.
Spark Advisor for Azure Synapse Notebook	The Spark Advisor for Azure Synapse Notebook analyzes code run by Spark and displays real-time advice for Notebooks. The Spark advisor offers recommendations for code optimization based on built-in common patterns, performs error analysis, and locates the root cause of failures.
Time-To-Live in managed virtual network (VNet)	Reserve compute for the time-to-live (TTL) period in a managed virtual network, saving time and improving efficiency. For more information on this preview, see Announcing public preview of Time-To-Live (TTL) in managed virtual network.
User-Assigned managed identities	Now you can use user-assigned managed identities in linked services for authentication in Synapse Pipelines and Dataflows. To learn more, see Credentials in Azure Data Factory and Azure Synapse.

Generally available features

The following table lists the features of Azure Synapse Analytics that have transitioned from preview to general availability (GA) within the last 12 months.

Month	Feature	Learn more
April 2023	Apache Spark Optimized Write	Optimize Write is a Delta Lake on Azure Synapse feature that reduces the number of files written by Apache Spark 3 (3.1 and 3.2) and aims to increase individual file size of the written data.
February 2023	UTF-8 and Japanese collations support for dedicated SQL pools	Both UTF-8 support and Japanese collations are now generally available for dedicated SQL pools.
February 2023	Azure Synapse Runtime for Apache Spark 3.3	The Azure Synapse Runtime for Apache Spark 3.3 is now generally available. Based on our testing using the 1TB TPC-H industry benchmark, you're likely to see up to 77% increased performance.
December 2022	SSIS IR Express virtual network injection	Both the standard and express methods to inject your SSIS Integration Runtime (IR) into a VNet are generally available now. For more information, see General Availability of Express Virtual Network injection for SSIS in Azure Data Factory.
November 2022	Azure Synapse Link for SQL	Azure Synapse Link for SQL is now generally available for both SQL Server 2022 and Azure SQL Database. The Azure Synapse Link for SQL feature provides low- and no-code, near real-time data replication from your SQL-based operational stores into Azure Synapse Analytics. Provide BI reporting on operational data in near real-time, with minimal impact on your operational store. To learn more, visit What is Azure Synapse Link for SQL?
October 2022	SAP CDC connector GA	The data connector for SAP Change Data Capture (CDC) is now GA. For more information, see Announcing Public Preview of the SAP CDC solution in Azure Data Factory and Azure Synapse Analytics and SAP CDC solution in Azure Data Factory.
September 2022	MERGE T-SQL syntax	MERGE T-SQL syntax has been a highly requested addition to the Synapse T-SQL library. As in SQL Server, the MERGE syntax encapsulates INSERTs/UPDATEs/DELETEs into a single high-performance statement. Available in dedicated SQL pools in version 10.0.17829 and above. For more, see the MERGE T-SQL announcement blog.
July 2022	Apache Spark™ 3.2 for Synapse Analytics	Apache Spark™ 3.2 for Synapse Analytics is now generally available. Review the official release notes and migration guidelines between Spark 3.1 and 3.2 to assess potential changes to your applications. For more details, read Apache Spark version support and Azure Synapse Runtime for Apache Spark 3.2. For highlights of what improved in Spark 3.2, see the Azure Synapse Analytics July Update 2022.
July 2022	Apache Spark in Azure Synapse Intelligent Cache feature	Intelligent Cache for Spark automatically stores each read within the allocated cache storage space, detecting underlying file changes and refreshing the files to provide the most recent data. To learn more, see how to Enable/Disable the cache for your Apache Spark pool.
June 2022	Map Data tool	The Map Data tool is a guided process to help you create ETL mappings and mapping data flows from your source data to Synapse without writing code. To learn more about the Map Data tool, read Map Data in Azure Synapse Analytics.
June 2022	User Defined Functions	User defined functions (UDFs) are now generally available. To learn more, read User defined functions in mapping data flows.
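
The MERGE entry in the table above describes folding INSERTs, UPDATEs, and DELETEs into a single statement. As a rough illustration of that idea only, the following sketch applies a set of source changes to a target in one pass. It is plain Python over in-memory dictionaries, not T-SQL, and it does not reflect how the engine executes MERGE. The function name `merge_rows` and the convention that a `None` source row marks a delete are assumptions made for this example.

```python
from typing import Dict, Optional

Row = Dict[str, object]


def merge_rows(target: Dict[int, Row], source: Dict[int, Optional[Row]]) -> Dict[str, int]:
    """Apply source changes to target in one pass, loosely mirroring MERGE.

    Assumption for this sketch: a source value of None marks a delete.
    """
    counts = {"inserted": 0, "updated": 0, "deleted": 0}
    for key, row in source.items():
        if row is None:
            # Comparable to: WHEN MATCHED AND <delete condition> THEN DELETE
            if key in target:
                del target[key]
                counts["deleted"] += 1
        elif key in target:
            # Comparable to: WHEN MATCHED THEN UPDATE
            target[key] = {**target[key], **row}
            counts["updated"] += 1
        else:
            # Comparable to: WHEN NOT MATCHED THEN INSERT
            target[key] = dict(row)
            counts["inserted"] += 1
    return counts


target = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Linus", "city": "Helsinki"},
}
source = {
    1: {"city": "Cambridge"},                   # matched: update
    2: None,                                    # matched: delete
    3: {"name": "Grace", "city": "New York"},   # not matched: insert
}
counts = merge_rows(target, source)
print(counts)
print(target)
```

The point of the single pass is that each source row is classified once as an insert, update, or delete, rather than being handled by three separate statements.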

Apache Spark for Azure Synapse Analytics

This section summarizes recent new features and capabilities of Apache Spark for Azure Synapse Analytics.

Month	Feature	Learn more
April 2023	Delta Lake - Low Shuffle Merge	Low Shuffle Merge optimization for Delta tables is now available in Apache Spark 3.2 pools. You can now update a Delta table with advanced conditions using the Delta Lake MERGE command.
March 2023	Library management new ability: in-line installation	`%pip` and `%conda` are now available in Apache Spark for Synapse! `%pip` and `%conda` are commands that can be used on Notebooks to install Python packages. For more information, see Manage session-scoped Python packages through %pip and %conda commands.
January 2023	Spark Advisor for Azure Synapse Notebook	The Spark Advisor for Azure Synapse Notebook analyzes code run by Spark and displays real-time advice for Notebooks. The Spark advisor offers recommendations for code optimization based on built-in common patterns, performs error analysis, and locates the root cause of failures.
January 2023	Improve Spark pool utilization with Synapse Genie	The Synapse Genie Framework improves Spark pool utilization by executing multiple Synapse notebooks on the same Spark pool instance. Read more about this metadata-driven utility written in Python.
September 2022	New informative Livy error codes	More precise error codes describe the cause of failure and replace the previous generic error codes. Previously, all errors in failing Spark jobs surfaced with a generic error code displaying `LIVY_JOB_STATE_DEAD`.
September 2022	New query optimization techniques in Apache Spark for Azure Synapse Analytics	Read the findings from Microsoft's work to gain considerable performance benefits across the board on the reference TPC-DS workload as well as a significant reduction in query plan generation time.
August 2022	Apache Spark elastic pool storage	Azure Synapse Analytics Spark pools now support elastic pool storage in preview. Elastic pool storage allows the Spark engine to monitor worker node temporary storage and attach additional disks if needed. No action is required, and you should see fewer job failures as a result. For more information, see Blog: Azure Synapse Analytics Spark elastic pool storage is available for public preview.
August 2022	Apache Spark Optimized Write	Optimize Write is a Delta Lake on Synapse preview feature that reduces the number of files written by Apache Spark 3 (3.1 and 3.2) and aims to increase individual file size of the written data. To learn more, see The need for optimize write on Apache Spark.

Data integration

This section summarizes recent new features and capabilities of Azure Synapse Analytics data integration. Learn how to Load data into Azure Synapse Analytics using Azure Data Factory (ADF) or a Synapse pipeline.

Month	Feature	Learn more
April 2023	Capture changed data from Cosmos DB analytical store (Public Preview)	Azure Cosmos DB analytical store now supports change data capture (CDC) for Azure Cosmos DB API for NoSQL and Azure Cosmos DB API for MongoDB. For more information, see Capture Changed Data from your Cosmos DB analytical store and DevBlog: Change Data Capture (CDC) with Azure Cosmos DB analytical store.
March 2023	Deep dive: Synapse pipelines storage event trigger security	This Customer Success Engineering blog post is a deep dive into Azure Synapse pipelines storage event trigger security. ADF and Synapse Pipelines offer a feature that allows pipeline execution to be triggered based on various events, such as storage blob creation or deletion. This can be used by customers to implement event-driven pipeline orchestration.
January 2023	SQL CDC incremental extract now supports numeric columns	Enabling incremental extract from SQL Server CDC in dataflows allows you to only process rows that have changed since the last time that pipeline was executed. Supported incremental column types now include date/time and numeric columns.
December 2022	Express virtual network injection	Both the standard and express methods to inject your SSIS Integration Runtime (IR) into a VNet are generally available now. For more information, see General Availability of Express Virtual Network injection for SSIS in Azure Data Factory.
October 2022	SAP CDC connector GA	The data connector for SAP Change Data Capture (CDC) is now GA. For more information, see Announcing Public Preview of the SAP CDC solution in Azure Data Factory and Azure Synapse Analytics and SAP CDC solution in Azure Data Factory.
September 2022	Gantt chart view	You can now view your activity runs with a Gantt chart in Azure Data Factory Integration Runtime monitoring.
September 2022	Monitoring improvements	We've released a new bundle of improvements to the monitoring experience based on community feedback.
September 2022	Maximum column optimization in mapping dataflow	For delimited text data sources such as CSVs, a new maximum columns setting allows you to set the maximum number of columns.
September 2022	NUMBER to integer conversion in Oracle data source connector	New property to convert Oracle NUMBER type to a corresponding integer type in source via the new property convertDecimalToInteger. For more information, see the Oracle source connector.
September 2022	Support for sending a body with HTTP request DELETE method in Web activity	New support for sending a body (optional) when using the DELETE method in Web activity. For more information, see the available Type properties for the Web activity.
August 2022	Mapping data flows now support visual Cast transformation	You can use the cast transformation to easily modify the data types of individual columns in a data flow.
August 2022	Default activity timeout changed to 12 hours	The default activity timeout is now 12 hours.
August 2022	Pipeline expression builder ease-of-use enhancements	We've updated our expression builder UI to make pipeline designing easier.
August 2022	New UI for mapping dataflow inline dataset types	We've updated our data flow source UI to make it easier to find your inline dataset type.
July 2022	Time-To-Live in managed virtual network (VNet)	Reserve compute for the time-to-live (TTL) period in a managed virtual network, saving time and improving efficiency. For more information on this preview, see Announcing public preview of Time-To-Live (TTL) in managed virtual network.
June 2022	SAP CDC connector preview	A new data connector for SAP Change Data Capture (CDC) is now available in preview. For more information, see Announcing Public Preview of the SAP CDC solution in Azure Data Factory and Azure Synapse Analytics and SAP CDC solution in Azure Data Factory.
June 2022	Fuzzy join option in Join Transformation	Fuzzy matching with a similarity threshold score slider has been added to the Join transformation in mapping data flows.
June 2022	Map Data tool GA	We're excited to announce that the Map Data tool is now Generally Available. The Map Data tool is a guided process to help you create ETL mappings and mapping data flows from your source data to Synapse without writing code.
June 2022	Rerun pipeline with new parameters	You can now change pipeline parameters when rerunning a pipeline from the Monitoring page without having to return to the pipeline editor. To learn more, read Rerun pipelines and activities.
June 2022	User Defined Functions GA	User defined functions (UDFs) in mapping data flows are now generally available (GA).
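
The fuzzy join entry in the table above mentions matching rows against a similarity threshold score. The following sketch shows the general idea of a threshold-based fuzzy join in plain Python. It is an illustration only: the scoring here uses the standard library's `difflib.SequenceMatcher`, which is an assumption for this example and is not claimed to be the algorithm that mapping data flows use. The names `similarity` and `fuzzy_join` and the sample data are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score between two strings, ignoring case."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_join(left: List[str], right: List[str], threshold: float) -> List[Tuple[str, str]]:
    """Pair each left value with each right value that scores at or above threshold."""
    return [
        (lv, rv)
        for lv in left
        for rv in right
        if similarity(lv, rv) >= threshold
    ]


customers = ["Contoso Ltd", "Fabrikam Inc"]
vendors = ["contoso ltd.", "Northwind Traders"]

# Only pairs whose score reaches the threshold are kept.
matches = fuzzy_join(customers, vendors, threshold=80)
print(matches)
```

Raising the threshold makes the join stricter (fewer, closer matches); lowering it admits looser matches at the cost of more false positives.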

Developer experience

This section summarizes recent new quality of life and feature improvements for developers in Azure Synapse Analytics.

Month	Feature	Learn more
December 2022	MSSparkUtils is the Swiss Army knife inside Synapse Spark	MSSparkUtils (Microsoft Spark Utilities) is a built-in package that helps you easily perform common tasks, including sharing results between notebooks.
July 2022	Synapse Notebooks compatibility with IPython	The official kernel for Jupyter notebooks is IPython and it's now supported in Synapse Notebooks. For more information, see Synapse Notebooks is now fully compatible with IPython.
July 2022	Mssparkutils now has spark.stop() method	A new API `mssparkutils.session.stop()` has been added to the mssparkutils package. This feature becomes handy when there are multiple sessions running against the same Spark pool. The new API is available for Scala and Python. To learn more, see Stop an interactive session.
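
As a minimal sketch of the `mssparkutils.session.stop()` entry in the table above, the following helper stops the current interactive session when it runs inside a Synapse notebook and does nothing elsewhere. The import path `notebookutils` and the helper name `stop_session_if_available` are assumptions for this example; check the MSSparkUtils documentation for the import that applies to your runtime.

```python
def stop_session_if_available() -> bool:
    """Stop the current Synapse Spark session; return False outside Synapse."""
    try:
        # Assumption: in Synapse notebooks the package is importable this way.
        from notebookutils import mssparkutils
    except ImportError:
        print("mssparkutils not found; not running in a Synapse notebook.")
        return False
    # Releases the session so its resources return to the Spark pool.
    mssparkutils.session.stop()
    return True


stopped = stop_session_if_available()
```

Calling this at the end of a notebook is one way to free pool capacity for other sessions instead of waiting for the idle timeout.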

Machine Learning

This section summarizes recent new features and improvements to machine learning models in Azure Synapse Analytics.

Month	Feature	Learn more
November 2022	R Support (preview)	Azure Synapse Analytics now provides built-in R support for Apache Spark, currently in preview. For an example, install an R library from CRAN and CRAN snapshots.
August 2022	SynapseML v.0.10.0	New release of SynapseML v0.10.0 (previously MMLSpark), an open-source library that aims to simplify the creation of massively scalable machine learning pipelines. Learn more about the latest additions to SynapseML and get started with SynapseML.
August 2022	.NET support	SynapseML v0.10 adds full support for .NET languages like C# and F#. For a .NET SynapseML example, see .NET Example with LightGBMClassifier.
August 2022	Azure OpenAI Service support	SynapseML now allows users to tap into 175-billion-parameter language models (GPT-3) from OpenAI that can generate and complete text and code at near-human parity. For more information, see Azure OpenAI for Big Data.
August 2022	MLflow platform support	SynapseML models now integrate with MLflow with full support for saving, loading, deployment, and autologging.
August 2022	SynapseML in Binder	Spark can be intimidating for first-time users, but with Binder you can explore and experiment with SynapseML with zero setup, installation, infrastructure, or Azure account required.

Samples and guidance

This section summarizes new guidance and sample project resources for Azure Synapse Analytics.

Month	Feature	Learn more
March 2023	Create a Data Solution on Azure Synapse Analytics with Snapshot Serengeti	This is a four-part series on building an end-to-end data analytics and machine learning solution on Azure Synapse Analytics. The dataset used in this solution is the Snapshot Serengeti dataset, which consists of a large-scale collection of camera trap images.
March 2023	Introduction to Kusto Query Language (KQL)	This Customer Success Engineering blog post provides an introduction to Kusto Query Language (KQL), a powerful query language to analyze large volumes of structured, semi-structured, and unstructured (free text) data.
March 2023	Creating a custom disaster recovery plan for your Synapse workspace	A multi-part blog series on creating a disaster recovery plan for your Synapse workspace.
March 2023	Azure Synapse connectivity: public endpoints, private endpoints, managed VNet and managed private endpoints	A three-part expert-written blog series on Azure Synapse connectivity for the various networking options, including inbound dedicated pool public endpoint connectivity, Azure Synapse private endpoints, and managed VNet and managed private endpoints.
February 2023	Historical monitoring dashboards for Azure Synapse dedicated SQL pools	A walkthrough of the steps to enable historical monitoring using Azure Monitor Workbook templates on top of Azure Metrics and Azure Log Analytics.
January 2023	Read Data Lake with Synapse Serverless pools	A two-part guide on how to use OPENROWSET to query a path within the lake or use an external table to query a path within the lake.
January 2023	Structured streaming in Synapse Spark	A detailed example of streaming IoT temperature data from IoT devices into Synapse Spark.
January 2023	Create DNS alias for dedicated SQL pool in Synapse workspace for disaster recovery	A custom DNS alias for dedicated SQL pools (formerly SQL DW) can redirect client programs during a disaster.
December 2022	Azure Synapse - Data Lake vs. Delta Lake vs. Data Lakehouse	Read a new Success Engineering blog post demystifying the terms Data Lake, Delta Lake, and Data Lakehouse.
November 2022	How Data Exfiltration Protection (DEP) impacts Azure Synapse Analytics Pipelines	Data Exfiltration Protection (DEP) is a feature that enables additional restrictions on the ability of Azure Synapse Analytics to connect to other services.
November 2022	Getting started with REST APIs for Azure Synapse Analytics - Apache Spark Pool	We provide instructions on how to set up and use Synapse REST endpoints and describe the Apache Spark Pool operations supported by REST APIs.
November 2022	Synapse Spark Delta Time Travel	Delta Lake time travel lets you query point-in-time snapshots and even roll back erroneous updates.
September 2022	What is the difference between Synapse dedicated SQL pool (formerly SQL DW) and Serverless SQL pool?	Understand dedicated vs serverless pools and their concurrency. Read more at basic concepts of dedicated SQL pools and serverless SQL pools.
September 2022	Reading Delta Lake in dedicated SQL Pool	Sample script to import Delta Lake files directly into the dedicated SQL Pool and support features like time-travel. For an explanation, see Reading Delta Lake in dedicated SQL Pool.
September 2022	Azure Synapse Customer Success Engineering blog series	The new Azure Synapse Customer Success Engineering blog series launches with a detailed introduction to Building the Lakehouse - Implementing a Data Lake Strategy with Azure Synapse.
June 2022	Azure Orbital analytics with Synapse Analytics	We now offer an Azure Orbital analytics sample solution showing an end-to-end implementation of extracting, loading, transforming, and analyzing spaceborne data by using geospatial libraries and AI models with Azure Synapse Analytics. The sample solution also demonstrates how to integrate geospatial-specific Azure AI services models, AI models from partners, and bring-your-own-data models.
June 2022	Azure Synapse success by design	The Azure Synapse proof of concept playbook provides a guide to scope, design, execute, and evaluate a proof of concept for SQL or Spark workloads.

Security

This section summarizes recent new security features and settings in Azure Synapse Analytics.

Month	Feature	Learn more
December 2022	How Data Exfiltration Protection (DEP) impacts Azure Synapse Analytics Pipelines	Data Exfiltration Protection (DEP) is a feature that enables additional restrictions on the ability of Azure Synapse Analytics to connect to other services.
August 2022	Execute Azure Synapse Spark Notebooks with system-assigned managed identity	You can now execute Spark Notebooks with the system-assigned managed identity (or workspace managed identity) by enabling Run as managed identity from the Configure session menu. With this feature, you are able to validate that your notebook works as expected when using the system-assigned managed identity, before using the notebook in a pipeline. For more information, see Managed identity for Azure Synapse.

Azure Synapse Link

Azure Synapse Link is an automated system for replicating data from SQL Server or Azure SQL Database, Azure Cosmos DB, or Dataverse into Azure Synapse Analytics. This section summarizes recent news about the Azure Synapse Link feature.

Month	Feature	Learn more
November 2022	Azure Synapse Link for SQL	Azure Synapse Link for SQL is now generally available for both SQL Server 2022 and Azure SQL Database. The Azure Synapse Link for SQL feature provides low- and no-code, near real-time data replication from your SQL-based operational stores into Azure Synapse Analytics. Provide BI reporting on operational data in near real-time, with minimal impact on your operational store. For more information, see What is Azure Synapse Link for SQL?
July 2022	Batch mode	Decide between cost and latency in Azure Synapse Link for SQL by selecting continuous or batch mode to replicate your data. Batch mode allows you to save even more on costs by only paying for ingestion service during the batch loads instead of it being continuously on. You can select between 20 and 60 minutes for batch processing.

Synapse SQL

This section summarizes recent improvements and features in SQL pools in Azure Synapse Analytics.

Month	Feature	Learn more
June 2023	Updated diagnostic settings fields	Nine fields have been added to the dedicated SQL pool diagnostic settings logs.
March 2023	Create alerts for your Azure Synapse dedicated SQL pool	This Customer Success Engineering blog post provides steps to configure alerts for your Azure Synapse dedicated SQL pool and provides recommended alerts to get you started.
March 2023	Performance Tuning Synapse Dedicated Pools - Understanding the Query Lifecycle	This Customer Success Engineering blog post is a deep dive into Understanding Query Lifecycle to Maximize Performance.
March 2023	GREATEST and LEAST T-SQL syntax support	GREATEST and LEAST functions are now available in both serverless and dedicated SQL pools. These scalar-valued functions return the maximum and minimum value, respectively, out of a list of one or more expressions.
February 2023	UTF-8 and Japanese collations support for dedicated SQL pools	Both UTF-8 support and Japanese collations are now generally available for dedicated SQL pools.
September 2022	Auto-statistics for OPENROWSET in CSV datasets	Serverless SQL pool will automatically create statistics for CSV datasets when needed to ensure an optimal query execution plan for OPENROWSET queries.
September 2022	MERGE T-SQL syntax	T-SQL MERGE syntax has been a highly requested addition to the Synapse T-SQL library. MERGE encapsulates INSERTs/UPDATEs/DELETEs into a single statement. Available in dedicated SQL pools in version 10.0.17829 and above. For more, see the MERGE T-SQL announcement blog.
August 2022	Apache Spark Delta Lake tables in serverless SQL pools	The ability for serverless SQL pools to access Delta Lake tables created in Spark databases is in preview. For more information, see Azure Synapse Analytics shared metadata tables.
August 2022	Multi-column distribution in dedicated SQL pools	You can now Hash Distribute tables on multiple columns for a more even distribution of the base table, reducing data skew over time and improving query performance. For more information on opting-in to the preview, see CREATE TABLE distribution options or CREATE TABLE AS SELECT distribution options.
August 2022	Distribution Advisor	The Distribution Advisor is a new preview feature in Azure Synapse dedicated SQL pools Gen2 that analyzes queries and recommends the best distribution strategies for tables to improve query performance. For more information, see Distribution Advisor in Azure Synapse SQL.
August 2022	Add SQL objects and users in Lake databases	New capabilities announced for lake databases in serverless SQL pools: create schemas, views, procedures, and inline table-valued functions. You can also create database users from your Azure Active Directory domain and assign them to the db_datareader role. For more information, see Access lake databases using serverless SQL pool in Azure Synapse Analytics and Create and use native external tables using SQL pools in Azure Synapse Analytics.
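
The GREATEST and LEAST entry in the table above describes scalar functions that return the maximum and minimum of a list of expressions. The following Python sketch mimics that behavior for illustration. It assumes the T-SQL behavior in which NULL arguments are ignored unless every argument is NULL, with `None` standing in for NULL; verify the exact NULL and data type rules against the T-SQL reference. The function names are local to this example.

```python
from typing import Any, Optional


def greatest(*values: Optional[Any]) -> Optional[Any]:
    """Return the largest non-None value, or None if every value is None."""
    non_null = [v for v in values if v is not None]
    return max(non_null) if non_null else None


def least(*values: Optional[Any]) -> Optional[Any]:
    """Return the smallest non-None value, or None if every value is None."""
    non_null = [v for v in values if v is not None]
    return min(non_null) if non_null else None


# Comparable in spirit to: SELECT GREATEST(3, NULL, 7), LEAST(3, NULL, 7);
print(greatest(3, None, 7), least(3, None, 7))
print(greatest(None, None))
```

Unlike an aggregate such as MAX, which works down a column, these functions compare values across the expressions supplied for a single row.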

Learn more