July 2020

These features and Azure Databricks platform improvements were released in July 2020.

Note

The release dates and content listed below correspond to actual deployments on the Azure public cloud in most cases.

They are provided as a reference to the evolution of the Azure Databricks service on the Azure public cloud and may not apply to Azure operated by 21Vianet.

Note

Releases are staged. Your Azure Databricks account may not be updated until up to a week after the initial release date.

Web terminal (Public Preview)

July 29 - August 4, 2020: Version 3.25

Web terminal provides a convenient and highly interactive way for users with CAN ATTACH TO permission on a cluster to run shell commands, including editors such as Vim or Emacs. Example uses of the web terminal include monitoring resource usage and installing Linux packages.

For details, see Run shell commands in Azure Databricks web terminal.

New, more secure global init script framework (Public Preview)

July 29 - August 4, 2020: Version 3.25

The new global init script framework brings significant improvements over legacy global init scripts:

  • Init scripts are more secure, requiring admin permissions to create, view, and delete.
  • Script-related launch failures are logged.
  • You can set the execution order of multiple init scripts.
  • Init scripts can reference cluster-related environment variables.
  • Init scripts can be created and managed using the admin settings page or the new Global Init Scripts REST API.

Databricks recommends that you migrate existing legacy global init scripts to the new framework to take advantage of these improvements.

For details, see Use global init scripts.
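
If you use the REST API, creating a script is a single call. The following is a minimal sketch, assuming you have an admin personal access token; the workspace URL, token, script name, and script contents are placeholders:

```python
# Sketch: create a global init script with the Global Init Scripts REST API (admin only).
import base64
import requests

HOST = "https://<databricks-instance>"      # your workspace URL
TOKEN = "<admin-personal-access-token>"     # global init scripts require admin permissions

script_body = "#!/bin/bash\napt-get update\n"  # illustrative script contents

resp = requests.post(
    f"{HOST}/api/2.0/global-init-scripts",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "name": "install-monitoring-agent",                          # display name (placeholder)
        "script": base64.b64encode(script_body.encode()).decode(),   # contents are base64-encoded
        "position": 0,                                                # execution order among global scripts
        "enabled": True,
    },
)
resp.raise_for_status()
print(resp.json())  # contains the script_id of the new script
```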

IP access lists now GA

July 29 - August 4, 2020: Version 3.25

The IP Access List API is now generally available.

The GA version includes one change, which is the renaming of the list_type values:

  • WHITELIST to ALLOW
  • BLACKLIST to BLOCK

Use the IP Access List API to configure your Azure Databricks workspaces so that users connect to the service only through existing corporate networks with a secure perimeter. Azure Databricks admins can use the IP Access List API to define a set of approved IP addresses, including allow and block lists. All incoming access to the web application and REST APIs requires that the user connect from an authorized IP address, ensuring that workspaces cannot be accessed from a public network like a coffee shop or an airport unless your users connect through a VPN.

This feature requires the Premium plan.

For more information, see Configure IP access lists for workspaces.
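
As an illustration, the following sketch adds an allow list using the GA names; the workspace URL, token, label, and CIDR range are placeholders, and the workspace must have IP access lists enabled as described in the linked documentation:

```python
# Sketch: create an ALLOW list with the IP Access List API.
import requests

HOST = "https://<databricks-instance>"
TOKEN = "<admin-personal-access-token>"

resp = requests.post(
    f"{HOST}/api/2.0/ip-access-lists",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "label": "corp-vpn",                  # friendly name for the list (placeholder)
        "list_type": "ALLOW",                 # GA naming; previously WHITELIST
        "ip_addresses": ["203.0.113.0/24"],   # approved CIDR blocks or single addresses
    },
)
resp.raise_for_status()
print(resp.json())
```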

New file upload dialog

July 29 - August 4, 2020: Version 3.25

You can now upload small tabular data files (like CSVs) and access them from a notebook by selecting Add data from the notebook File menu. The generated code shows you how to load the data into pandas or Spark DataFrames. Admins can disable this feature on the Admin Console Advanced tab.

For more information, see Browse files in DBFS.
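
The generated snippet is roughly of the following shape for a CSV upload; this is illustrative only, and the exact path under /FileStore depends on the uploaded file:

```python
# Illustrative sketch of the kind of code the upload dialog generates for a CSV file.
import pandas as pd

# pandas reads the uploaded file through the /dbfs FUSE mount (path is a placeholder)
pdf = pd.read_csv("/dbfs/FileStore/tables/my_upload.csv")

# or load it as a Spark DataFrame instead
df = spark.read.csv("dbfs:/FileStore/tables/my_upload.csv", header=True, inferSchema=True)
display(df)  # spark and display() are predefined in Databricks notebooks
```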

SCIM API filter and sort improvements

July 29 - August 4, 2020: Version 3.25

The SCIM API now includes these filtering and sorting improvements:

  • Admin users can filter users on the active attribute.
  • All users can sort results using the sortBy and sortOrder query parameters. The default is to sort by ID.
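
For example, a single request can combine both improvements; the workspace URL and token below are placeholders:

```python
# Sketch: filter on the active attribute and sort by userName with the SCIM API.
import requests

HOST = "https://<databricks-instance>"
TOKEN = "<personal-access-token>"

resp = requests.get(
    f"{HOST}/api/2.0/preview/scim/v2/Users",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={
        "filter": "active eq true",    # filtering on active requires admin permissions
        "sortBy": "userName",
        "sortOrder": "descending",     # omit sortBy/sortOrder to get the default sort by ID
    },
)
resp.raise_for_status()
for user in resp.json().get("Resources", []):
    print(user["userName"])
```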

Azure Government regions added

July 25, 2020

Azure Databricks recently became available in the US Gov Arizona and US Gov Virginia regions for US government entities and their partners.

Databricks Runtime 7.1 GA

July 21, 2020

Databricks Runtime 7.1 brings many additional features and improvements over Databricks Runtime 7.0, including:

  • Google BigQuery connector
  • %pip commands to manage Python libraries installed in a notebook session
  • Koalas installed
  • Many Delta Lake improvements, including:
    • Setting user-defined commit metadata
    • Getting the version of the last commit written by the current SparkSession
    • Converting Parquet tables created by Structured Streaming using the _spark_metadata transaction log
    • MERGE INTO performance improvements

For details, see the complete Databricks Runtime 7.1 (EoS) release notes.
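
As a quick illustration of two of these items, a notebook cell on Databricks Runtime 7.1 might look roughly like this; the table path and metadata string are placeholders, and spark is the SparkSession predefined in Databricks notebooks:

```python
# %pip install <package>   # notebook-scoped library management; %pip is a notebook magic (own cell)

# Attach user-defined commit metadata to the next Delta commit (sketch; values are placeholders)
spark.conf.set("spark.databricks.delta.commitInfo.userMetadata", "nightly-backfill-2020-07-21")
spark.range(100).write.format("delta").mode("append").save("/tmp/delta/events")

# Read back the version of the last commit written by this SparkSession
print(spark.conf.get("spark.databricks.delta.lastCommitVersionInSession"))
```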

Databricks Runtime 7.1 ML GA

July 21, 2020

Databricks Runtime 7.1 for Machine Learning is built on top of Databricks Runtime 7.1 and brings the following new features and library changes:

  • pip and conda magic commands enabled by default
  • spark-tensorflow-distributor: 0.1.0
  • pillow 7.0.0 -> 7.1.0
  • pytorch 1.5.0 -> 1.5.1
  • torchvision 0.6.0 -> 0.6.1
  • horovod 0.19.1 -> 0.19.5
  • mlflow 1.8.0 -> 1.9.1

For details, see the complete Databricks Runtime 7.1 for ML (EoS) release notes.

Databricks Runtime 7.1 Genomics GA

July 21, 2020

Databricks Runtime 7.1 for Genomics is built on top of Databricks Runtime 7.1 and brings the following new features:

  • LOCO transformation
  • GloWGR output reshaping function
  • RNASeq outputs unpaired alignments

Databricks Connect 7.1 (Public Preview)

July 17, 2020

Databricks Connect 7.1 is now in public preview.
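
If you want to try the preview, the typical setup on a local machine looks roughly like this; the shell steps are run in a terminal first, and the target cluster must be on a compatible runtime:

```python
# Terminal setup (not part of this script):
#   pip install -U databricks-connect==7.1.*
#   databricks-connect configure   # prompts for workspace URL, token, and cluster ID
#   databricks-connect test        # verifies connectivity to the remote cluster

# A local script then runs Spark code against the remote cluster:
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # picks up the databricks-connect configuration
print(spark.range(10).count())
```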

IP Access List API updates

July 15-21, 2020: Version 3.24

The following IP Access List API properties have changed:

  • updator_user_id to updated_by
  • creator_user_id to created_by

Python notebooks now support multiple outputs per cell

July 15-21, 2020: Version 3.24

Python notebooks now support multiple outputs per cell. This means you can have any number of display, displayHTML, or print statements in a cell. Take advantage of the ability to view the raw data and the plot in the same cell, or all of the outputs that succeeded before you hit an error.

Multiple outputs in one cell

This feature requires Databricks Runtime 7.1 or above and is disabled by default in Databricks Runtime 7.1. Enable it by setting spark.databricks.workspace.multipleResults.enabled to true.
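
For example, once that setting is in the cluster's Spark configuration, a single Python cell can produce several outputs (a sketch, assuming Databricks Runtime 7.1 with the flag enabled):

```python
# Cluster Spark configuration (set on the cluster, not in the notebook):
#   spark.databricks.workspace.multipleResults.enabled true

import pandas as pd

df = pd.DataFrame({"x": range(5), "y": [v * v for v in range(5)]})

print("raw data:")          # first output
display(df)                 # second output: rendered table (display is predefined in notebooks)
print(f"rows: {len(df)}")   # third output, all in the same cell
```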

View notebook code and results cells side by side

July 15-21, 2020: Version 3.24

The new Side-by-Side notebook display option lets you view code and results next to each other. This display option joins the "Standard" option (formerly "Code") and the "Results Only" option.

side-by-side view

Pause job schedules

July 15-21, 2020: Version 3.24

Job schedules now have Pause and Unpause buttons, making it easy to pause and resume jobs. You can now make changes to a job schedule without additional job runs starting while you are making the changes. Current runs and runs triggered by Run Now are not affected. For details, see Pause and resume job triggers.

Jobs API endpoints validate run ID

July 15-21, 2020: Version 3.24

The jobs/runs/cancel and jobs/runs/output API endpoints now validate that the run_id parameter is valid. For invalid parameters these API endpoints now return HTTP status code 400 instead of code 500.
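
The practical effect is that a bad run_id now surfaces as a client error rather than a server error. A minimal sketch of checking this, with placeholder workspace URL, token, and run ID:

```python
import requests

HOST = "https://<databricks-instance>"
TOKEN = "<personal-access-token>"

resp = requests.post(
    f"{HOST}/api/2.0/jobs/runs/cancel",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"run_id": 999999999},   # a run ID that does not exist
)
print(resp.status_code)  # now 400 (bad request) instead of 500
```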

Microsoft Entra ID tokens for authenticating to the Databricks REST API now GA

July 15-21, 2020: Version 3.24

Using Microsoft Entra ID tokens to authenticate to the Workspace API is now generally available. Microsoft Entra ID tokens enable you to automate the creation and setup of new workspaces. Service principals are application objects in Microsoft Entra ID. You can also use service principals within your Azure Databricks workspaces to automate workflows. For details, see Authenticate access to Azure Databricks resources.
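
A minimal sketch of calling the REST API with a Microsoft Entra ID token, assuming the token has already been acquired for the Azure Databricks resource (for example, with the Azure CLI or a service principal flow); the workspace URL and token are placeholders:

```python
import requests

HOST = "https://<databricks-instance>"
AAD_TOKEN = "<microsoft-entra-id-access-token>"   # issued for the Azure Databricks resource

resp = requests.get(
    f"{HOST}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {AAD_TOKEN}"},
)
resp.raise_for_status()
print([c["cluster_name"] for c in resp.json().get("clusters", [])])
```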

Format SQL in notebooks automatically

July 15-21, 2020: Version 3.24

You can now format SQL notebook cells from a keyboard shortcut, the command context menu, and the notebook Edit menu (select Edit > Format SQL Cells). SQL formatting makes it easy to read and maintain code with little effort. It works for SQL notebooks as well as %sql cells.

format notebook SQL cell

Reproducible order of installation for Maven and CRAN libraries

July 1-9, 2020: Version 3.23

Azure Databricks now processes Maven and CRAN libraries in the order that they were installed on the cluster.

Take control of your users' personal access tokens with the Token Management API (Public Preview)

July 1-9, 2020: Version 3.23

Now Azure Databricks administrators can use the Token Management API to manage their users' Azure Databricks personal access tokens:

  • Monitor and revoke users' personal access tokens.
  • Control the lifetime of future tokens in your workspace.
  • Control which users can create and use tokens.

See Monitor and manage access to personal access tokens.
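
A minimal sketch of listing tokens with the Token Management API, assuming an admin personal access token; the workspace URL and token are placeholders:

```python
import requests

HOST = "https://<databricks-instance>"
TOKEN = "<admin-personal-access-token>"

# List all personal access tokens in the workspace (admin only)
resp = requests.get(
    f"{HOST}/api/2.0/token-management/tokens",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()

for info in resp.json().get("token_infos", []):
    print(info)  # each entry includes a token_id that can be passed to the revoke (DELETE) endpoint
```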

Restore cut notebook cells

July 1-9, 2020: Version 3.23

You can now restore notebook cells that have been cut either by using the (Z) keyboard shortcut or by selecting Edit > Undo Cut Cells. This functionality is analogous to that for undoing deleted cells.

Assign jobs CAN MANAGE permission to non-admin users

July 1-9, 2020: Version 3.23

You can now assign non-admin users and groups to the CAN MANAGE permission for jobs. This permission level allows users to manage all settings on the job, including assigning permissions, changing the owner, and changing the cluster configuration (for example, adding libraries and modifying the cluster specification). See Control access to a job.
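
A minimal sketch of granting that level through the (then preview) Permissions API; the workspace URL, token, job ID, and user are placeholders:

```python
import requests

HOST = "https://<databricks-instance>"
TOKEN = "<personal-access-token>"    # caller must be an admin or already have CAN MANAGE on the job
JOB_ID = 123                         # placeholder job ID

resp = requests.patch(
    f"{HOST}/api/2.0/preview/permissions/jobs/{JOB_ID}",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "access_control_list": [
            {"user_name": "someone@example.com", "permission_level": "CAN_MANAGE"}
        ]
    },
)
resp.raise_for_status()
print(resp.json())
```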

Non-admin Azure Databricks users can view and filter by username using the SCIM API

July 1-9, 2020: Version 3.23

Non-admin users can now view usernames and filter users by username using the SCIM /Users endpoint.

Link to cluster configuration page from job run details

July 1-9, 2020: Version 3.23

Now when you view the details for a job run, you can click a link to the cluster configuration page to view the cluster specification. Previously, you had to copy the job ID from the URL and search for it in the cluster list.

cluster link on job runs