What's New in Azure Synapse Analytics Archive

This article describes updates to Azure Synapse Analytics from previous months. For the most current month's release, check out Azure Synapse Analytics latest updates. Each update links to the Azure Synapse Analytics blog and an article that provides more information.

Generally available features

The following table lists the history of Azure Synapse Analytics features that have transitioned from preview to general availability (GA).

Month	Feature	Learn more
July 2022	Apache Spark™ 3.2 for Synapse Analytics	Apache Spark™ 3.2 for Synapse Analytics is now generally available. Review the official release notes and migration guidelines between Spark 3.1 and 3.2 to assess potential changes to your applications. For more details, read Apache Spark version support and Azure Synapse Runtime for Apache Spark 3.2. For highlights of what got better in Spark 3.2, see the Azure Synapse Analytics July Update 2022.
July 2022	Apache Spark in Azure Synapse Intelligent Cache feature	Intelligent Cache for Spark automatically stores each read within the allocated cache storage space, detecting underlying file changes and refreshing the files to provide the most recent data. To learn more, see how to Enable/Disable the cache for your Apache Spark pool.
June 2022	Map Data tool	The Map Data tool is a guided process to help you create ETL mappings and mapping data flows from your source data to Synapse without writing code. To learn more about the Map Data tool, read Map Data in Azure Synapse Analytics.
June 2022	User Defined Functions	User defined functions (UDFs) are now generally available. To learn more, read User defined functions in mapping data flows.
April 2022	Cross-subscription restore for Azure Synapse SQL	With the PowerShell `Az.Sql` module 3.8 update, the Restore-AzSqlDatabase cmdlet can be used for cross-subscription restore of dedicated SQL pools. To learn more, see Blog: Restore a dedicated SQL pool (formerly SQL DW) to a different subscription. This feature is now generally available for dedicated SQL pools (formerly SQL DW) and dedicated SQL pools in a Synapse workspace. What's the difference?
April 2022	Database Designer	The database designer allows users to visually create databases within Synapse Studio without writing a single line of code. For more information, see Announcing General Availability of Database Designer. Read more about lake databases and learn How to modify an existing lake database using the database designer.
April 2022	Synapse Monitoring Operator RBAC role	The Synapse Monitoring Operator RBAC (role-based access control) role allows a user persona to monitor the execution of Synapse Pipelines and Spark applications without having the ability to run or cancel the execution of these applications. For more information, review the Synapse RBAC Roles.
March 2022	Flowlets	Flowlets help you design portions of new data flow logic, or extract portions of an existing data flow, and save them as separate artifacts inside your Synapse workspace. Then, you can reuse these Flowlets inside other data flows. To learn more, review the Flowlets GA announcement blog post and read Flowlets in mapping data flow.
March 2022	Change Feed connectors	Changed data capture (CDC) feed data flow source transformations for Azure Cosmos DB, Azure Blob Storage, ADLS Gen2, and Common Data Model (CDM) are now generally available. By simply checking a box, you can tell ADF to manage a checkpoint automatically for you and only read the latest rows that were updated or inserted since the last pipeline run. To learn more, review the Change Feed connectors GA preview blog post and read Copy and transform data in Azure Data Lake Storage Gen2 using Azure Data Factory or Azure Synapse Analytics.
March 2022	Column level encryption for dedicated SQL pools	Column level encryption is now generally available for use on new and existing Azure SQL logical servers with Azure Synapse dedicated SQL pools and dedicated SQL pools in Azure Synapse workspaces. SQL Server Data Tools (SSDT) support for column level encryption for the dedicated SQL pools is available starting with the 17.2 Preview 2 build of Visual Studio 2022.
March 2022	Synapse Spark Common Data Model (CDM) connector	The CDM format reader/writer enables a Spark program to read and write CDM entities in a CDM folder via Spark dataframes. To learn more, see how the CDM connector supports reading, writing data, examples, & known issues.
November 2021	PREDICT	The T-SQL PREDICT syntax is now generally available for dedicated SQL pools. Get started with the Machine learning model scoring wizard for dedicated SQL pools.
October 2021	Synapse RBAC Roles	Synapse role-based access control (RBAC) roles are now generally available. Learn more about Synapse RBAC roles and Azure Synapse role-based access control (RBAC) using PowerShell.

Apache Spark for Azure Synapse Analytics

This section is an archive of features and capabilities of Apache Spark for Azure Synapse Analytics.

Month	Feature	Learn more
May 2022	Azure Synapse dedicated SQL pool connector for Apache Spark now available in Python	Previously, the Azure Synapse Dedicated SQL Pool Connector for Apache Spark was only available using Scala. Now, the dedicated SQL pool connector for Apache Spark can be used with Python on Spark 3.
May 2022	Manage Azure Synapse Apache Spark configuration	With the new Apache Spark configurations feature, you can create a standalone Spark configuration artifact with auto-suggestions and built-in validation rules. The Spark configuration artifact allows you to share your Spark configuration within and across Azure Synapse workspaces. You can also easily associate your Spark configuration with a Spark pool, a Notebook, and a Spark job definition for reuse and minimize the need to copy the Spark configuration in multiple places.
April 2022	Apache Spark 3.2 for Synapse Analytics	Apache Spark 3.2 for Synapse Analytics is now available in preview. Review the official Spark 3.2 release notes and migration guidelines between Spark 3.1 and 3.2 to assess potential changes to your applications. For more details, read Apache Spark version support and Azure Synapse Runtime for Apache Spark 3.2.
April 2022	Parameterization for Spark job definition	You can now assign parameters dynamically based on variables, metadata, or specifying Pipeline specific parameters for the Spark job definition activity. For more details, read Transform data using Apache Spark job definition.
April 2022	Apache Spark notebook snapshot	You can access a snapshot of the Notebook when there's a Pipeline Notebook run failure or when there's a long-running Notebook job. To learn more, read Transform data by running a Synapse notebook and Introduction to Microsoft Spark utilities.
March 2022	Synapse Spark Common Data Model (CDM) connector	The CDM format reader/writer enables a Spark program to read and write CDM entities in a CDM folder via Spark dataframes. To learn more, see how the CDM connector supports reading, writing data, examples, & known issues.
March 2022	Performance optimization for Synapse Spark dedicated SQL pool connector	New improvements to the Azure Synapse Dedicated SQL Pool Connector for Apache Spark reduce data movement and leverage `COPY INTO`. Performance tests indicated at least ~5x improvement over the previous version. No action is required from the user to leverage these enhancements. For more information, see Blog: Synapse Spark Dedicated SQL Pool (DW) Connector: Performance Improvements.
March 2022	Support for all Spark Dataframe SaveMode choices	The Azure Synapse Dedicated SQL Pool Connector for Apache Spark now supports all four Spark Dataframe SaveMode choices: Append, Overwrite, ErrorIfExists, Ignore. For more information on Spark SaveMode, read the official Apache Spark documentation.
March 2022	Apache Spark in Azure Synapse Analytics Intelligent Cache feature	Intelligent Cache for Spark automatically stores each read within the allocated cache storage space, detecting underlying file changes and refreshing the files to provide the most recent data. To learn more on this preview feature, see how to Enable/Disable the cache for your Apache Spark pool or see the blog post.
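
To make the four SaveMode choices listed in the table above concrete, the following plain-Python sketch models how each mode behaves depending on whether the target table already exists. It only illustrates the save-mode semantics described in the Apache Spark documentation. It does not call Spark or the connector, and the `save` function and the in-memory `tables` dictionary are hypothetical names introduced for this example.

```python
from typing import Dict, List

Row = dict


def save(tables: Dict[str, List[Row]], name: str, rows: List[Row], mode: str) -> None:
    """Model Spark DataFrame SaveMode semantics against an in-memory table store."""
    exists = name in tables
    mode = mode.lower()
    if mode == "append":
        # Add rows to the existing table, or create the table if it is missing.
        tables.setdefault(name, []).extend(rows)
    elif mode == "overwrite":
        # Replace any existing contents with the new rows.
        tables[name] = list(rows)
    elif mode == "errorifexists":
        # Refuse to write when the table already exists.
        if exists:
            raise ValueError(f"table {name!r} already exists")
        tables[name] = list(rows)
    elif mode == "ignore":
        # Silently skip the write when the table already exists.
        if not exists:
            tables[name] = list(rows)
    else:
        raise ValueError(f"unknown save mode: {mode}")
```

In this model, only `errorifexists` raises when the table is already present; `ignore` leaves the existing rows untouched.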

Data integration

This section is an archive of features and capabilities of Azure Synapse Analytics data integration. Learn how to Load data into Azure Synapse Analytics using Azure Data Factory (ADF) or a Synapse pipeline.

Month	Feature	Learn more
June 2022	SAP CDC connector preview	A new data connector for SAP Change Data Capture (CDC) is now available in preview. For more information, see Announcing Public Preview of the SAP CDC solution in Azure Data Factory and Azure Synapse Analytics and SAP CDC solution in Azure Data Factory.
June 2022	Fuzzy join option in Join Transformation	Fuzzy matching with a similarity threshold score slider has been added to the Join transformation in Mapping Data Flows.
June 2022	Map Data tool GA	We're excited to announce that the Map Data tool is now Generally Available. The Map Data tool is a guided process to help you create ETL mappings and mapping data flows from your source data to Synapse without writing code.
June 2022	Rerun pipeline with new parameters	You can now change pipeline parameters when rerunning a pipeline from the Monitoring page without having to return to the pipeline editor. To learn more, read Rerun pipelines and activities.
June 2022	User Defined Functions GA	User defined functions (UDFs) in mapping data flows are now generally available (GA).
May 2022	Export pipeline monitoring as a CSV	The ability to export pipeline monitoring to CSV and other monitoring improvements have been introduced to ADF.
May 2022	Automatic incremental source data loading from PostgreSQL and MySQL	Automatic incremental source data loading from PostgreSQL and MySQL to Synapse SQL and Azure Database is now natively available in ADF.
May 2022	Assert transformation error handling	Error handling has now been added to sinks following an assert transformation in mapping data flow. You can now choose whether to output the failed rows to the selected sink or to a separate file.
May 2022	Mapping data flows projection editing	In mapping data flows, you can now update source projection column names and column types.
April 2022	Dataverse connector for Synapse Data Flows	Dataverse is now a source and sink connector to Synapse Data Flows. You can Copy and transform data from Dynamics 365 (Microsoft Dataverse) or Dynamics CRM using Azure Data Factory or Azure Synapse Analytics.
April 2022	Configurable Synapse Pipelines Web activity response timeout	With the response timeout property `httpRequestTimeout`, you can define a timeout for the HTTP request up to 10 minutes. Web activities work exceptionally well with APIs that follow the asynchronous request-reply pattern, a suggested approach for building scalable web APIs/services.
March 2022	SFTP connector for Synapse data flows	A native SFTP connector in Synapse data flows is supported to read and write data from SFTP using the visual low-code data flows interface in Synapse. To learn more, see Copy and transform data in SFTP server using Azure Data Factory or Azure Synapse Analytics.
March 2022	Data flow improvements to Data Preview	Review features added to the Data Preview and debug improvements in Mapping Data Flows.
March 2022	Pipeline script activity	You can now Transform data by using the Script activity to invoke SQL commands to perform both DDL and DML.
December 2021	Custom partitions for Synapse link for Azure Cosmos DB	Improve query execution times for your Spark queries, by creating custom partitions based on fields frequently used in your queries. To learn more, see Custom partitioning in Azure Synapse Link for Azure Cosmos DB (Preview).

Database Designer

This section is an archive of features and capabilities of the database designer.

Month	Feature	Learn more
April 2022	Database Designer	The database designer allows users to visually create databases within Synapse Studio without writing a single line of code. For more information, see Announcing General Availability of Database Designer. Read more about lake databases and learn How to modify an existing lake database using the database designer.
April 2022	Clone lake database	In Synapse Studio, you can now clone a database using the action menu available on the lake database. To learn more, read How-to: Clone a lake database.
April 2022	Use wildcards to specify custom folder hierarchies	Lake databases sit on top of data that is in the lake and this data can live in nested folders that don't fit into clean partition patterns. You can now use wildcards to specify custom folder hierarchies. To learn more, read How-to: Modify a datalake.

Developer experience

This section is an archive of quality of life and feature improvements for developers in Azure Synapse Analytics.

Month	Feature	Learn more
May 2022	Updated Azure Synapse Analyzer Report	Learn about the new features in version 2.0 of the Synapse Analyzer report.
April 2022	Azure Synapse Analyzer Report	The Azure Synapse Analyzer Report helps you identify common issues that may be present in your database that can lead to performance issues.
April 2022	Reference unpublished notebooks	Now, when using %run notebooks, you can enable 'unpublished notebook reference', which will allow you to reference unpublished notebooks. When enabled, notebook run will fetch the current contents in the notebook web cache, meaning the changes in your notebook editor can be referenced immediately by other notebooks without having to be published (Live mode).
March 2022	Code cells with exception to show standard output	Now in Synapse notebooks, both standard output and exception messages are shown when a code statement fails for Python and Scala languages. For examples, see Synapse notebooks: Code cells with exception to show standard output.
March 2022	Partial output is available for running notebook code cells	Now in Synapse notebooks, you can see anything you write (with `println` commands, for example) as the cell executes, instead of waiting until it ends. For examples, see Synapse notebooks: Partial output is available for running notebook code cells.
March 2022	Dynamically control your Spark session configuration with pipeline parameters	Now in Synapse notebooks, you can use pipeline parameters to configure the session with the notebook %%configure magic. For examples, see Synapse notebooks: Dynamically control your Spark session configuration with pipeline parameters.
March 2022	Reuse and manage notebook sessions	Now in Synapse notebooks, it's easy to reuse an active session conveniently without having to start a new one and to see and manage your active sessions in the Active sessions list. To view your sessions, select the 3 dots in the notebook and select Manage sessions. For examples, see Synapse notebooks: Reuse and manage notebook sessions.
March 2022	Support for Python logging	Now in Synapse notebooks, anything written through the Python logging module is captured, in addition to the driver logs. For examples, see Synapse notebooks: Support for Python logging.
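
As a minimal sketch of the kind of code the Python logging row above refers to, the following uses only the standard `logging` module. The logger name and the `load_rows` function are hypothetical, and how the messages are displayed in a notebook depends on the Synapse environment.

```python
import logging

# The logger name is a hypothetical choice for this example.
logger = logging.getLogger("synapse_notebook_example")
logger.setLevel(logging.INFO)


def load_rows(count: int) -> int:
    """Pretend to load rows, logging progress through the standard logging module."""
    logger.info("Loading %d rows", count)
    if count == 0:
        logger.warning("No rows to load")
    return count


load_rows(3)
```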

Machine Learning

This section is an archive of features and improvements to machine learning models in Azure Synapse Analytics.

Month	Feature	Learn more
November 2021	PREDICT	The T-SQL PREDICT syntax is now generally available for dedicated SQL pools. Get started with the Machine learning model scoring wizard for dedicated SQL pools.

Samples and guidance

This section is an archive of guidance and sample project resources for Azure Synapse Analytics.

Month	Feature	Learn more
June 2022	Azure Orbital analytics with Synapse Analytics	We now offer an Azure Orbital analytics sample solution showing an end-to-end implementation of extracting, loading, transforming, and analyzing spaceborne data by using geospatial libraries and AI models with Azure Synapse Analytics. The sample solution also demonstrates how to integrate geospatial-specific Azure AI services models, AI models from partners, and bring-your-own-data models.
June 2022	Azure Synapse success by design	The Azure Synapse proof of concept playbook provides a guide to scope, design, execute, and evaluate a proof of concept for SQL or Spark workloads.

Security

This section is an archive of security features and settings in Azure Synapse Analytics.

Month	Feature	Learn more
April 2022	Synapse Monitoring Operator RBAC role	The Synapse Monitoring Operator role-based access control (RBAC) role allows a user persona to monitor the execution of Synapse Pipelines and Spark applications without having the ability to run or cancel the execution of these applications. For more information, review the Synapse RBAC Roles.
March 2022	Enforce minimal TLS version	You can now raise or lower the minimum TLS version for dedicated SQL pools in Synapse workspaces. To learn more, see Azure SQL connectivity settings. The workspace managed SQL API can be used to modify the minimum TLS settings.
March 2022	Azure Synapse Analytics now supports Azure Active Directory (Azure AD) only authentication	You can now use Azure Active Directory authentication to centrally manage access to all Azure Synapse resources, including SQL pools. You can disable local authentication upon creation or after a workspace is created through the Azure portal.
December 2021	User-Assigned managed identities	Now you can use user-assigned managed identities in linked services for authentication in Synapse Pipelines and Dataflows. To learn more, see Credentials in Azure Data Factory and Azure Synapse.
December 2021	Browse ADLS Gen2 folders in the Azure Synapse Analytics workspace	You can now browse and secure an Azure Data Lake Storage Gen2 (ADLS Gen2) container or folder in your Azure Synapse Analytics workspace by connecting to a specific container or folder in Synapse Studio.
December 2021	TLS 1.2 enforced for new Synapse Workspaces	Starting in December 2021, a requirement for TLS 1.2 has been implemented for new Synapse Workspaces only.

Azure Synapse Link

Azure Synapse Link is an automated system for replicating data from SQL Server or Azure SQL Database, Azure Cosmos DB, or Dataverse into Azure Synapse Analytics. This section is an archive of news about the Azure Synapse Link feature.

Month	Feature	Learn more
May 2022	Azure Synapse Link for SQL preview	Azure Synapse Link for SQL is in preview for both SQL Server 2022 and Azure SQL Database. The Azure Synapse Link feature provides low- and no-code, near real-time data replication from your SQL-based operational stores into Azure Synapse Analytics. Provide BI reporting on operational data in near real-time, with minimal impact on your operational store. The Azure Synapse Link for SQL preview has been announced. For more information, see Blog: Azure Synapse Link for SQL Deep Dive.

Synapse SQL

This section is an archive of improvements and features in SQL pools in Azure Synapse Analytics.

Month	Feature	Learn more
June 2022	Result set size limit increase	The maximum size of query result sets in serverless SQL pools has been increased from 200 GB to 400 GB.
May 2022	Automatic character column length calculation for serverless SQL pools	It's no longer necessary to define character column lengths for serverless SQL pools in the data lake. You can get optimal query performance without having to define the schema, because the serverless SQL pool will use automatically calculated average column lengths and cardinality estimation.
April 2022	Cross-subscription restore for Azure Synapse SQL GA	With the PowerShell `Az.Sql` module 3.8 update, the Restore-AzSqlDatabase cmdlet can be used for cross-subscription restore of dedicated SQL pools. To learn more, see Restore a dedicated SQL pool to a different subscription. This feature is now generally available for dedicated SQL pools (formerly SQL DW) and dedicated SQL pools in a Synapse workspace. What's the difference?
April 2022	Recover SQL pool from dropped server or workspace	With the PowerShell Restore cmdlets in `Az.Sql` and `Az.Synapse` modules, you can now restore from a deleted server or workspace without filing a support ticket. For more information, see Restore a dedicated SQL pool from a deleted Azure Synapse workspace or Restore a standalone dedicated SQL pool (formerly SQL DW) from a deleted server, depending on your scenario.
March 2022	Column level encryption for dedicated SQL pools	Column level encryption is now generally available for use on new and existing Azure SQL logical servers with Azure Synapse dedicated SQL pools and dedicated SQL pools in Azure Synapse workspaces. SQL Server Data Tools (SSDT) support for column level encryption for the dedicated SQL pools is available starting with the 17.2 Preview 2 build of Visual Studio 2022.
March 2022	Parallel execution for CETAS	Better performance for CREATE EXTERNAL TABLE AS SELECT (CETAS) and subsequent SELECT statements is now made possible by the use of parallel execution plans. For examples, see Better performance for CETAS and subsequent SELECTs.

Previous monthly updates in Azure Synapse Analytics

What follows is the previous format of monthly news updates for Synapse Analytics.

June 2022 update

General

Azure Orbital analytics with Synapse Analytics - We now offer an Azure Orbital analytics sample solution showing an end-to-end implementation of extracting, loading, transforming, and analyzing spaceborne data by using geospatial libraries and AI models with Azure Synapse Analytics. The sample solution also demonstrates how to integrate geospatial-specific Azure AI services models, AI models from partners, and bring-your-own-data models.
Azure Synapse success by design - Project success is no accident and requires careful planning and execution. The Synapse Analytics' Success by Design playbooks are now available. The Azure Synapse proof of concept playbook provides a guide to scope, design, execute, and evaluate a proof of concept for SQL or Spark workloads. These guides contain best practices from the most challenging and complex solution implementations incorporating Azure Synapse. To learn more about the Azure Synapse proof of concept playbook, read Success by Design.

SQL

Result set size limit increase - We know that you turn to Azure Synapse Analytics to work with large amounts of data. With that in mind, the maximum size of query result sets in Serverless SQL pools has been increased from 200 GB to 400 GB. This limit is shared between concurrent queries. To learn more about this size limit increase and other constraints, read Self-help for serverless SQL pool.

Data integration

Fuzzy Join option in Join Transformation - Fuzzy matching with a sliding similarity score option has been added to the Join transformation in Mapping Data Flows. You can create inner and outer joins on data values that are similar rather than exact matches! Previously, you would have had to use an exact match. The sliding scale value goes from 60% to 100%, making it easy to adjust the similarity threshold of the match. To learn more about fuzzy joins, read Join transformation in mapping data flow.
Map Data [Generally Available] - We're excited to announce that the Map Data tool is now Generally Available. The Map Data tool is a guided process to help you create ETL mappings and mapping data flows from your source data to Synapse without writing code. To learn more about Map Data, read Map Data in Azure Synapse Analytics.
Rerun pipeline with new parameters - You can now change pipeline parameters when rerunning a pipeline from the Monitoring page without having to return to the pipeline editor. After running a pipeline with new parameters, you can easily monitor the new run against the old ones without having to toggle between pages. To learn more about rerunning pipelines with new parameters, read Rerun pipelines and activities.
User Defined Functions [Generally Available] - We're excited to announce that user defined functions (UDFs) are now Generally Available. With user-defined functions, you can create customized expressions that can be reused across multiple mapping data flows. You no longer have to use the same string manipulation, math calculations, or other complex logic several times. User-defined functions will be grouped in libraries to help developers group common sets of functions. To learn more about user defined functions, read User defined functions in mapping data flows.
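
The idea behind the fuzzy join's similarity threshold can be sketched in plain Python. This is only a conceptual illustration: it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, which is an assumption and not necessarily the algorithm that mapping data flows use. The `fuzzy_inner_join` function is a hypothetical name, and the 0.6 to 1.0 bounds mirror the 60% to 100% slider range described above.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def fuzzy_inner_join(
    left: List[str], right: List[str], threshold: float = 0.8
) -> List[Tuple[str, str]]:
    """Pair values whose similarity score meets the threshold (inner join)."""
    if not 0.6 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.6 and 1.0")
    matches = []
    for l_value in left:
        for r_value in right:
            # Compare case-insensitively; ratio() returns a score in [0, 1].
            score = SequenceMatcher(None, l_value.lower(), r_value.lower()).ratio()
            if score >= threshold:
                matches.append((l_value, r_value))
    return matches
```

With a threshold of 1.0 the join degenerates to case-insensitive exact matching; lowering the threshold admits near matches such as values that differ only by trailing punctuation.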

May 2022 update

The following updates are new to Azure Synapse Analytics this month.

SQL

Automatic character column length calculation - It's no longer necessary to define character column lengths! Serverless SQL pools let you query files in the data lake without knowing the schema upfront. The best practice was to specify the lengths of character columns to get optimal performance. Not anymore! With this new feature, you can get optimal query performance without having to define the schema. The serverless SQL pool will calculate the average column length for each inferred character column or character column defined as larger than 100 bytes. The schema will stay the same, while the serverless SQL pool will use the calculated average column lengths internally. It will also automatically calculate the cardinality estimation in case there was no previously created statistic.

Apache Spark for Synapse

Azure Synapse Dedicated SQL Pool Connector for Apache Spark Now Available in Python - Previously, the Azure Synapse Dedicated SQL Pool connector was only available using Scala. Now, it can be used with Python on Spark 3. The only difference between the Scala and Python implementations is the optional Scala callback handle, which allows you to receive post-write metrics.

The following are now supported in Python on Spark 3:
- Read using Azure Active Directory (AD) Authentication or Basic Authentication
- Write to Internal Table using Azure AD Authentication or Basic Authentication
- Write to External Table using Azure AD Authentication or Basic Authentication
To learn more about the connector in Python, read Azure Synapse Dedicated SQL Pool Connector for Apache Spark.
Manage Azure Synapse Apache Spark configuration - Apache Spark configuration management is always a challenging task because Spark has hundreds of properties. It is also challenging for you to know the optimal value for Spark configurations. With the new Spark configuration management feature, you can create a standalone Spark configuration artifact with auto-suggestions and built-in validation rules. The Spark configuration artifact allows you to share your Spark configuration within and across Azure Synapse workspaces. You can also easily associate your Spark configuration with a Spark pool, a Notebook, and a Spark job definition for reuse and minimize the need to copy the Spark configuration in multiple places. To learn more about the new Spark configuration management feature, read Manage Apache Spark configuration.

Data Integration

Export pipeline monitoring as a CSV - The ability to export pipeline monitoring to CSV has been added after receiving many community requests for the feature. Simply filter the Pipeline runs screen to the data you want and select Export to CSV. To learn more about exporting pipeline monitoring and other monitoring improvements, read Azure Data Factory monitoring improvements.
Incremental data loading made easy for Synapse and Azure Database for PostgreSQL and MySQL - In a data integration solution, incrementally loading data after an initial full data load is a widely used scenario. Automatic incremental source data loading is now natively available for Synapse SQL and Azure Database for PostgreSQL and MySQL. Users can "enable incremental extract" and only inserted or updated rows will be read by the pipeline. To learn more about incremental data loading, read Incrementally copy data from a source data store to a destination data store.
User-Defined Functions for Mapping Data Flows [Public Preview] - We heard that you often find yourself repeating the same string manipulation, math calculations, or other complex logic several times. Now, with the new user-defined function feature, you can create customized expressions that can be reused across multiple mapping data flows. User-defined functions will be grouped in libraries to help developers group common sets of functions. Once you've created a data flow library, you can add in your user-defined functions. You can even add in multiple arguments to make your function more reusable. To learn more about user-defined functions, read User defined functions in mapping data flows.
Assert Error Handling - Error handling has now been added to sinks following an assert transformation. Assert transformations enable you to build custom rules for data quality and data validation. You can now choose whether to output the failed rows to the selected sink or to a separate file. To learn more about error handling, read Assert data transformation in mapping data flow.
Mapping data flows projection editing - New UI updates have been made to source projection editing in mapping data flows. You can now update source projection column names and column types. To learn more about source projection editing, read Source transformation in mapping data flow.
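
The incremental extract behavior described in the list above, where only inserted or updated rows are read after the initial full load, can be illustrated with a watermark-style sketch. This is a conceptual model under stated assumptions, not how the service implements the feature: the `modified_at` column and the `incremental_extract` function are hypothetical names, and it assumes each row carries a last-modified timestamp.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Row = Dict[str, object]


def incremental_extract(
    rows: List[Row], last_watermark: datetime
) -> Tuple[List[Row], datetime]:
    """Return rows changed after the watermark, plus the watermark for the next run."""
    # Keep only rows inserted or updated since the previous run.
    changed = [row for row in rows if row["modified_at"] > last_watermark]
    # Advance the watermark to the newest change seen; keep it if nothing changed.
    new_watermark = max(
        (row["modified_at"] for row in changed), default=last_watermark
    )
    return changed, new_watermark
```

Each run passes in the watermark returned by the previous run, so a row is read again only if its timestamp has moved forward.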

Azure Synapse Link

Azure Synapse Link for SQL Server - At Microsoft Build 2022, we announced the Public Preview availability of Azure Synapse Link for SQL, for both SQL Server 2022 and Azure SQL Database. Data-driven, quality insights are critical for companies to stay competitive. The speed to achieve those insights can make all the difference. Traditional ETL and ELT pipelines are costly and time-consuming, and they are no longer enough. With this release, you can now take advantage of low- and no-code, near real-time data replication from your SQL-based operational stores into Azure Synapse Analytics. This makes it easier to run BI reporting on operational data in near real-time, with minimal impact on your operational store. To learn more, read Announcing the Public Preview of Azure Synapse Link for SQL.

April 2022 update

The following updates are new to Azure Synapse Analytics this month.

SQL

Cross-subscription restore for Azure Synapse SQL is now generally available. Previously, it took many undocumented steps to restore a dedicated SQL pool to another subscription. Now, with the PowerShell Az.Sql module 3.8 update, the Restore-AzSqlDatabase cmdlet can be used for cross-subscription restore. To learn more, see Restore a dedicated SQL pool (formerly SQL DW) to a different subscription.
It is now possible to recover a SQL pool from a dropped server or workspace. With the PowerShell Restore cmdlets in Az.Sql and Az.Synapse modules, you can now restore from a deleted server or workspace without filing a support ticket. For more information, read Synapse workspace SQL pools or standalone SQL pools (formerly SQL DW), depending on your scenario.

Synapse database designer

We've added the option to clone a lake database. This unlocks additional opportunities to manage new versions of databases or support schemas that evolve in discrete steps. You can quickly clone a database using the action menu available on the lake database. To learn more, read How-to: Clone a lake database.
You can now use wildcards to specify custom folder hierarchies. Lake databases sit on top of data that is in the lake and this data can live in nested folders that don't fit into clean partition patterns. Previously, querying lake databases required that your data exists in a simple directory structure that you could browse using the folder icon without the ability to manually specify directory structure or use wildcard characters. To learn more, read How-to: Modify a datalake.

Apache Spark for Synapse

We are excited to announce the preview availability of Apache Spark™ 3.2 on Synapse Analytics. This new version incorporates user-requested enhancements and resolves 1,700+ Jira tickets. Please review the official release notes for the complete list of fixes and features and review the migration guidelines between Spark 3.1 and 3.2 to assess potential changes to your applications. For more details, read Apache Spark version support and Azure Synapse Runtime for Apache Spark 3.2.
Assigning parameters dynamically based on variables, metadata, or specifying Pipeline specific parameters has been one of your top feature requests. Now, with the release of parameterization for the Spark job definition activity, you can do just that. For more details, read Transform data using Apache Spark job definition.
We often receive customer requests to access the snapshot of the Notebook when there is a Pipeline Notebook run failure or there is a long-running Notebook job. With the release of the Synapse Notebook snapshot feature, you can now view the snapshot of the Notebook activity run with the original Notebook code, the cell output, and the input parameters. You can also access the snapshot of the referenced Notebook from the referencing Notebook cell output if you refer to other Notebooks through Spark utils. To learn more, read Transform data by running a Synapse notebook and Introduction to Microsoft Spark utilities.

Security

The Synapse Monitoring Operator RBAC role is now generally available. Since the GA of Synapse, customers have asked for a fine-grained RBAC (role-based access control) role that allows a user persona to monitor the execution of Synapse Pipelines and Spark applications without having the ability to run or cancel the execution of these applications. Now, customers can assign the Synapse Monitoring Operator role to such monitoring personas. This allows organizations to stay compliant while having flexibility in the delegation of tasks to individuals or teams. Learn more by reading Synapse RBAC Roles.

Data integration

Azure has added Dataverse as a source and sink connector to Synapse Data Flows so that you can now build low-code data transformation ETL jobs in Synapse directly accessing your Dataverse environment. For more details on how to use this new connector, read Mapping data flow properties.
We heard from you that a 1-minute timeout for Web activity was not long enough, especially in cases of synchronous APIs. Now, with the response timeout property 'httpRequestTimeout', you can define a timeout for the HTTP request of up to 10 minutes. Learn more by reading Web activity response timeout improvements.
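As a rough illustration of the Web activity timeout described above, the following Python sketch builds an activity definition as a dictionary and checks the 10-minute ceiling before serializing it. The helper names (`format_timeout`, `build_web_activity`), the example URL, and the `hh:mm:ss` string format are assumptions made for this sketch; only the `httpRequestTimeout` property name and the 10-minute limit come from the text above.

```python
import json
from datetime import timedelta

# Upper bound stated in the update text; the lower bound is an assumption.
MAX_TIMEOUT = timedelta(minutes=10)
MIN_TIMEOUT = timedelta(seconds=1)


def format_timeout(timeout: timedelta) -> str:
    """Render a timedelta as hh:mm:ss, rejecting values outside 1 s to 10 min."""
    if timeout < MIN_TIMEOUT or timeout > MAX_TIMEOUT:
        raise ValueError("timeout must be between 1 second and 10 minutes")
    total = int(timeout.total_seconds())  # fractions of a second are dropped
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def build_web_activity(name: str, url: str, timeout: timedelta) -> dict:
    """Build a hypothetical Web activity definition with a response timeout."""
    return {
        "name": name,
        "type": "WebActivity",
        "typeProperties": {
            "url": url,
            "method": "GET",
            "httpRequestTimeout": format_timeout(timeout),
        },
    }


if __name__ == "__main__":
    activity = build_web_activity(
        "CallSlowApi", "https://example.com/api", timedelta(minutes=5)
    )
    print(json.dumps(activity, indent=2))
```

Validating the value before the pipeline is published keeps an out-of-range timeout from surfacing only at run time.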

Developer experience

Previously, if you wanted to reference a notebook in another notebook, you could only reference published or committed content. Now, when using %run notebooks, you can enable 'unpublished notebook reference' which will allow you to reference unpublished notebooks. When enabled, notebook run will fetch the current contents in the notebook web cache, meaning the changes in your notebook editor can be referenced immediately by other notebooks without having to be published (Live mode). To learn more, read Reference unpublished notebook.

Mar 2022 update

The following updates are new to Azure Synapse Analytics this month.

Developer Experience

Code cells in Synapse notebooks that result in an exception will now show standard output along with the exception message. This feature is supported for Python and Scala languages. To learn more, see the example output when a code statement fails.
Synapse notebooks now support partial output when running code cells. To learn more, see the examples at this blog post.
You can now dynamically control Spark session configuration for the notebook activity with pipeline parameters. To learn more, see the variable explorer feature of Synapse notebooks.
You can now reuse and manage notebook sessions without having to start a new one. You can easily connect a selected notebook to an active session in the list started from another notebook. You can detach a session from a notebook, stop the session, and monitor it. To learn more, see how to manage your active notebook sessions.
Synapse notebooks now capture anything written through the Python logging module, in addition to the driver logs. To learn more, see support for Python logging.
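To make the Python logging item above concrete, here is a minimal sketch that uses only the standard-library `logging` module. The logger name `synapse_example` and the `load_rows` function are hypothetical, and how a notebook surfaces these records is not shown here; the sketch only demonstrates the kind of calls that would produce them.

```python
import logging
import sys

# Hypothetical logger name; any name works with the standard logging module.
logger = logging.getLogger("synapse_example")
logger.setLevel(logging.INFO)

# Attach a handler once so that re-running the cell does not duplicate output.
if not logger.handlers:
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)


def load_rows(rows):
    """Drop None entries and log what happened at INFO and WARNING levels."""
    logger.info("Loading %d rows", len(rows))
    kept = [row for row in rows if row is not None]
    dropped = len(rows) - len(kept)
    if dropped:
        logger.warning("Dropped %d null rows", dropped)
    return kept


if __name__ == "__main__":
    load_rows([1, None, 2])
```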

SQL

Column Level Encryption for Azure Synapse dedicated SQL Pools is now Generally Available. With column level encryption, you can use different protection keys for each column with each key having its own access permissions. The data in CLE-enforced columns are encrypted on disk and remain encrypted in memory until the DECRYPTBYKEY function is used to decrypt it. To learn more, see how to encrypt a data column.
Serverless SQL pools now support better performance for CETAS (Create External Table as Select) and subsequent SELECT queries. The performance improvements include a parallel execution plan, which results in faster CETAS execution and outputs multiple files. To learn more, see the CETAS with Synapse SQL article and the blog post.

Apache Spark for Synapse

Synapse Spark Common Data Model (CDM) Connector is now Generally Available. The CDM format reader/writer enables a Spark program to read and write CDM entities in a CDM folder via Spark dataframes. To learn more, see how the CDM connector supports reading, writing data, examples, & known issues.
Synapse Spark Dedicated SQL Pool (DW) Connector now supports improved performance. The new architecture eliminates redundant data movement and uses COPY-INTO instead of PolyBase. You can authenticate through SQL basic authentication or opt into the Azure Active Directory (Azure AD) based authentication method. It now delivers roughly 5x better performance than the previous version. To learn more, see Azure Synapse Dedicated SQL Pool Connector for Apache Spark.
Synapse Spark Dedicated SQL Pool (DW) Connector now supports all Spark Dataframe SaveMode choices. It supports Append, Overwrite, ErrorIfExists, and Ignore modes. The Append and Overwrite modes are critical for managing data ingestion at scale. To learn more, see DataFrame write SaveMode support.
Accelerate Spark execution speed using the new Intelligent Cache feature. This feature is currently in public preview. Intelligent Cache automatically stores each read within the allocated cache storage space, detecting underlying file changes and refreshing the files to provide the most recent data. To learn more, see how to Enable/Disable the cache for your Apache Spark pool or see the blog post.
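The four SaveMode choices named in the connector item above differ only in what happens when the target already exists. The toy model below uses an in-memory dictionary as a stand-in for a set of tables. It is not the connector's implementation, and `write_table` is a hypothetical helper that mirrors the general Spark SaveMode semantics.

```python
from enum import Enum


class SaveMode(Enum):
    APPEND = "append"
    OVERWRITE = "overwrite"
    ERROR_IF_EXISTS = "errorifexists"
    IGNORE = "ignore"


def write_table(tables: dict, name: str, rows: list, mode: SaveMode) -> list:
    """Apply rows to tables[name] according to the chosen save mode."""
    if name not in tables or mode is SaveMode.OVERWRITE:
        # A new target, or Overwrite: the table holds exactly the new rows.
        tables[name] = list(rows)
    elif mode is SaveMode.APPEND:
        # Existing rows are kept and the new rows are added after them.
        tables[name].extend(rows)
    elif mode is SaveMode.ERROR_IF_EXISTS:
        raise ValueError(f"table {name!r} already exists")
    # SaveMode.IGNORE with an existing target: leave the table untouched.
    return tables[name]


if __name__ == "__main__":
    tables = {}
    write_table(tables, "sales", [1, 2], SaveMode.ERROR_IF_EXISTS)
    write_table(tables, "sales", [3], SaveMode.APPEND)
    write_table(tables, "sales", [99], SaveMode.IGNORE)
    print(tables["sales"])
```

In this model, Append suits incremental loads, while Overwrite suits full refreshes of a target.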

Security

Azure Synapse Analytics now supports Azure Active Directory (Azure AD) authentication. You can turn on Azure AD authentication during the workspace creation or after the workspace is created. To learn more, see how to use Azure AD authentication with Synapse SQL.
You can now use an API to raise or lower the minimum TLS version for workspace-managed SQL Server dedicated SQL pools. To learn more, see how to update the minimum TLS setting or read the blog post for more details.

Data Integration

Flowlets and CDC Connectors are now Generally Available. Flowlets in Synapse Data Flows allow for reusable and composable ETL logic. To learn more, see Flowlets in mapping data flow or see the blog post.
SFTP connector for Synapse data flows. You can read and write data while transforming data from SFTP using the visual low-code data flows interface in Synapse. To learn more, see source transformation.
Data flow improvements to Data Preview. To learn more, see Data Preview and debug improvements in Mapping Data Flows.
Pipeline script activity. The Script Activity enables data engineers to build powerful data integration pipelines that can read from and write to Synapse databases, and other database types. To learn more, see Transform data by using the Script activity in Azure Data Factory or Synapse Analytics.

Feb 2022 update

The following updates are new to Azure Synapse Analytics this month.

SQL

Serverless SQL Pools now support more consistent query execution times. Learn how Serverless SQL pools automatically detect spikes in read latency and support consistent query execution time.
The OPENJSON function makes it easy to get array element indexes. To learn more, see how the OPENJSON function in a serverless SQL pool allows you to parse nested arrays and return one row for each JSON array element with the index of each element.
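The idea in the OPENJSON item above, one row per array element together with its index, can be pictured with a plain Python analog. This is not T-SQL and does not call OPENJSON; the document shape (`orders` containing `items`) and the function name are invented for illustration.

```python
import json


def nested_rows(document: str) -> list:
    """Return (order_index, item_index, item) for every nested array element."""
    orders = json.loads(document)["orders"]
    return [
        (order_index, item_index, item)
        for order_index, order in enumerate(orders)
        for item_index, item in enumerate(order["items"])
    ]


if __name__ == "__main__":
    doc = '{"orders": [{"items": ["pen", "ink"]}, {"items": ["paper"]}]}'
    for row in nested_rows(doc):
        print(row)
```

Python's `enumerate` counts from zero here, so each tuple carries the position of the element within its parent array.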

Data integration

Upserting data is now supported by the copy activity. See how you can natively load data into a temporary table and then merge that data into a sink table with upsert.
Transform Dynamics Data Visually in Synapse Data Flows. Learn more on how to use a Dynamics dataset or an inline dataset as source and sink types to transform data at scale.
Connect to your SQL sources in data flows using Always Encrypted. To learn more, see how to securely connect to your SQL databases from Synapse data flows using Always Encrypted.
Capture descriptions from asserts in Data Flows. To learn more, see how to define your own dynamic descriptive messages in the assert data flow transformation at the row or column level.
Easily define schemas for complex type fields. To learn more, see how you can have the engine automatically detect the schema of an embedded complex field inside a string column.
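The upsert item in the list above describes loading data into a temporary table and then merging it into the sink. The sketch below shows that staging-then-merge pattern with Python's built-in `sqlite3` module as a stand-in database. The table names (`sink`, `staging`), the columns, and the `upsert` helper are assumptions for this example, and this is not what the copy activity runs internally.

```python
import sqlite3


def upsert(conn: sqlite3.Connection, rows: list) -> None:
    """Stage rows in a temporary table, then merge them into the sink table."""
    cur = conn.cursor()
    # Step 1: load the incoming rows into a temporary staging table.
    cur.execute("CREATE TEMP TABLE staging (id INTEGER PRIMARY KEY, name TEXT)")
    cur.executemany("INSERT INTO staging (id, name) VALUES (?, ?)", rows)
    # Step 2: update sink rows whose key already exists.
    cur.execute(
        "UPDATE sink SET name = "
        "(SELECT name FROM staging WHERE staging.id = sink.id) "
        "WHERE id IN (SELECT id FROM staging)"
    )
    # Step 3: insert staged rows whose key is not yet in the sink.
    cur.execute(
        "INSERT INTO sink (id, name) "
        "SELECT id, name FROM staging WHERE id NOT IN (SELECT id FROM sink)"
    )
    cur.execute("DROP TABLE staging")
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sink (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO sink VALUES (?, ?)", [(1, "a"), (2, "b")])
    upsert(conn, [(2, "B"), (3, "c")])
    print(conn.execute("SELECT id, name FROM sink ORDER BY id").fetchall())
```

Splitting the merge into an update and an insert keeps the sketch independent of any database-specific MERGE or ON CONFLICT syntax.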

Jan 2022 update

The following updates are new to Azure Synapse Analytics this month.

Machine Learning

Improvements to the Synapse Machine Learning library v0.9.5 (previously called MMLSpark). This release simplifies the creation of massively scalable machine learning pipelines with Apache Spark. To learn more, read the blog post about the new capabilities in this release or see the full release notes.

Security

The Azure Synapse Analytics security overview - A whitepaper that covers the five layers of security. The security layers include authentication, access control, data protection, network security, and threat protection. Understand each security feature in detail to implement an industry-standard security baseline and protect your data in the cloud.
TLS 1.2 is now required for newly created Synapse Workspaces. To learn more, see how TLS 1.2 provides enhanced security using this article or the blog post. Sign-in attempts to a newly created Synapse workspace from connections using TLS versions lower than 1.2 will fail.

Data Integration

Data quality validation rules using Assert transformation - You can now easily add data quality, data validation, and schema validation to your Synapse ETL jobs by using Assert transformation in Synapse data flows. To learn more, see the Assert transformation in mapping data flow article or the blog post.
Native data flow connector for Dynamics - Synapse data flows can now read and write data directly to Dynamics through the new data flow Dynamics connector. Learn more on how to create data sets in data flows to read, transform, aggregate, join, etc. using this article or the blog post. You can then write the data back into Dynamics using the built-in Synapse Spark compute.
IntelliSense and auto-complete added to pipeline expressions - IntelliSense makes creating and editing expressions easy. To learn more, see how to check your expression syntax, find functions, and add code to your pipelines.

Synapse SQL

COPY schema discovery for complex data ingestion. To learn more, see the blog post or how GitHub leveraged this functionality in Introducing Automatic Schema Discovery with auto table creation for complex datatypes.
Serverless SQL pools now support the HASHBYTES function. HASHBYTES is a T-SQL function that hashes values. Learn how to use hash values in distributing data using this article or the blog post.
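To show how hash values can drive data distribution, as mentioned in the HASHBYTES item above, here is a Python analog built on the standard-library `hashlib` module. It does not call HASHBYTES. The bucket count, the UTF-8 encoding, and the function name `hash_bucket` are assumptions for this sketch, and a T-SQL hash of the same text may differ because the result depends on the byte representation of the input type.

```python
import hashlib


def hash_bucket(value: str, buckets: int) -> int:
    """Map a string key to a bucket number using a SHA-256 digest."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % buckets


if __name__ == "__main__":
    for key in ["customer-1", "customer-2", "customer-3"]:
        print(key, hash_bucket(key, 4))
```

Because the digest is deterministic, the same key always lands in the same bucket, which is the property that makes hash-based distribution repeatable.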

December 2021 update

The following updates are new to Azure Synapse Analytics this month.

Apache Spark for Synapse

Mount remote storage to a Synapse Spark pool blog article
Natively read & write data in ADLS with Pandas blog article
Dynamic allocation of executors for Spark blog article

Machine Learning

The Synapse Machine Learning library blog article
Getting started with state-of-the-art pre-built intelligent models blog article
Building responsible AI systems with the Synapse ML library blog article
PREDICT is now GA for Synapse Dedicated SQL pools blog article
Simple & scalable scoring with PREDICT and MLFlow for Apache Spark for Synapse blog article
Retail AI solutions blog article

Security

User-Assigned managed identities now supported in Synapse Pipelines in preview blog article
Browse ADLS Gen2 folders in an Azure Synapse Analytics workspace in preview blog article

Data Integration

Pipeline Fail activity blog article
Mapping Data Flow gets new native connectors blog article
More notebook export formats: HTML, Python, and LaTeX blog
Three new chart types in notebook view: box plot, histogram, and pivot table blog
Reconnect to lost notebook session blog

Integrate

Azure Synapse Link for Dataverse blog article
Custom partitions for Azure Synapse Link for Azure Cosmos DB in preview blog article
Map data tool (Public Preview), a no-code guided ETL experience blog article
Quick reuse of spark cluster blog article
External Call transformation blog article
Flowlets (Public Preview) blog article

November 2021 update

The following updates are new to Azure Synapse Analytics this month.

Work with Databases and Data Lakes

Introducing Lake databases (formerly known as Spark databases) blog article
Lake database designer now available in preview blog article

SQL

Delta Lake support for serverless SQL is generally available blog article
Query multiple file paths using OPENROWSET in serverless SQL blog article
Serverless SQL queries can now return up to 200 GB of results blog article
Handling invalid rows with OPENROWSET in serverless SQL blog article

Apache Spark for Synapse

Mount remote storage to a Synapse Spark pool blog article
Natively read & write data in ADLS with Pandas blog article
Dynamic allocation of executors for Spark blog article

Machine Learning

The Synapse Machine Learning library blog article
Getting started with state-of-the-art pre-built intelligent models blog article
Building responsible AI systems with the Synapse ML library blog article
PREDICT is now GA for Synapse Dedicated SQL pools blog article
Simple & scalable scoring with PREDICT and MLFlow for Apache Spark for Synapse blog article
Retail AI solutions blog article

Security

User-Assigned managed identities now supported in Synapse Pipelines in preview blog article
Browse ADLS Gen2 folders in an Azure Synapse Analytics workspace in preview blog article

Data Integration

Pipeline Fail activity blog article
Mapping Data Flow gets new native connectors blog article

Azure Synapse Link

Azure Synapse Link for Dataverse blog article
Custom partitions for Azure Synapse Link for Azure Cosmos DB in preview blog article

October 2021 update

The following updates are new to Azure Synapse Analytics this month.

Apache Spark for Synapse

Spark performance optimizations blog

Security

All Synapse RBAC roles are now generally available for use in production blog article
Apply User-Assigned Managed Identities for Double Encryption blog article
Synapse Administrators now have elevated access to dedicated SQL pools blog article

Integrate

Use Stringify in data flows to easily transform complex data types to strings blog article
Control Spark session time-to-live (TTL) in data flows blog article

Developer Experience

Enhanced Markdown editing in Synapse notebooks preview blog article
Pandas dataframes automatically render as nicely formatted HTML tables blog article
Use IPython widgets in Synapse Notebooks blog article
Mssparkutils runtime context now available for Python and Scala blog article

Next steps

Get started with Azure Synapse Analytics

Last updated on 2024-10-11
