April 2019

These features and Azure Databricks platform improvements were released in April 2019.

Note

The release dates and content listed below correspond, in most cases, only to the actual deployment on the Azure Public Cloud.

They describe the evolution history of the Azure Databricks service on the Azure Public Cloud for your reference and may not apply to Azure operated by 21Vianet.

Note

Releases are staged. Your Azure Databricks account may not be updated until up to a week after the initial release date.

MLflow on Azure Databricks (GA)

April 25, 2019

Managed MLflow on Azure Databricks is now generally available. MLflow on Azure Databricks offers a hosted version of MLflow fully integrated with the Databricks security model and interactive workspace. See ML lifecycle management using MLflow.

Delta Lake on Azure Databricks

April 24, 2019

Databricks has open sourced the Delta Lake project. Delta Lake is a storage layer that brings reliability to data lakes built on HDFS and cloud storage by providing ACID transactions through optimistic concurrency control between writes and snapshot isolation for consistent reads during writes. Delta Lake also provides built-in data versioning for easy rollbacks and reproducing reports.
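The concurrency model described above can be illustrated with a small in-memory sketch. This is a toy model only, not the Delta Lake API: the names `ToyVersionedTable`, `CommitConflict`, `snapshot`, and `commit` are hypothetical, and the sketch rejects every commit based on a stale version, which is likely stricter than a real storage layer needs to be. It shows the three ideas in the paragraph: a writer commits only if no one else committed first (optimistic concurrency), a reader keeps a consistent view while writes continue (snapshot isolation), and older versions stay readable (data versioning).

```python
import threading


class CommitConflict(Exception):
    """Raised when another writer committed after this writer's read."""


class ToyVersionedTable:
    """Toy in-memory versioned table. Illustrative only; NOT the Delta Lake API."""

    def __init__(self):
        self._versions = [[]]          # version 0 is the empty table
        self._lock = threading.Lock()  # guards the short commit step only

    def latest_version(self):
        return len(self._versions) - 1

    def snapshot(self, version=None):
        """Return a copy of the rows at one version (latest by default)."""
        if version is None:
            version = self.latest_version()
        return list(self._versions[version])

    def commit(self, read_version, new_rows):
        """Append rows only if nobody has committed since read_version."""
        with self._lock:
            if read_version != self.latest_version():
                raise CommitConflict(
                    f"read version {read_version}, "
                    f"table is now at version {self.latest_version()}"
                )
            self._versions.append(self._versions[-1] + list(new_rows))
            return self.latest_version()


table = ToyVersionedTable()
v1 = table.commit(0, ["a"])    # first writer succeeds
reader_view = table.snapshot() # reader pins the table as of version 1
v2 = table.commit(v1, ["b"])   # second write lands after the reader started

conflict_detected = False
try:
    table.commit(v1, ["c"])    # stale writer: based on version 1, table is at 2
except CommitConflict:
    conflict_detected = True   # the writer would re-read and retry here
```

After this runs, `reader_view` still holds only the rows from version 1, and `table.snapshot(1)` returns that older version again, which is the idea behind rollbacks and reproducible reports.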

Note

What was previously called Databricks Delta is now the Delta Lake open source project plus optimizations available on Azure Databricks. See What is Delta Lake?.

MLflow runs sidebar

April 9 - 16, 2019: Version 2.95

You can now view the MLflow runs and the notebook revisions that produced these runs in a sidebar next to your notebook. In the notebook's right sidebar, click the Experiment icon.

See Create notebook experiment.

Access Azure Data Lake Storage Gen2 automatically with your Microsoft Entra ID credentials (GA)

April 9 - 16, 2019: Version 2.95

We are pleased to announce the general availability of automatic authentication to Azure Data Lake Storage Gen2 from Azure Databricks clusters using the same Microsoft Entra ID identity that you use to log into Azure Databricks.

Simply enable your cluster for Microsoft Entra ID credential passthrough, and commands that you run on that cluster will be able to read and write your data in Azure Data Lake Storage Gen2 without requiring you to configure service principal credentials for access to storage.
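As a minimal sketch of what this looks like from a notebook, the helper below builds the `abfss://` URI that addresses a path in Azure Data Lake Storage Gen2. The function name `abfss_uri` and the container and storage account names are hypothetical placeholders. Note that no service principal or account key appears anywhere in the code; on a passthrough-enabled cluster, the read is expected to run under your own identity.

```python
def abfss_uri(container, storage_account, path=""):
    """Build an abfss:// URI for a path in Azure Data Lake Storage Gen2."""
    clean_path = path.lstrip("/")  # avoid a double slash after the host
    return (
        f"abfss://{container}@{storage_account}"
        f".dfs.core.windows.net/{clean_path}"
    )


# Hypothetical container and storage account names, for illustration only.
uri = abfss_uri("raw", "examplestorage", "/events/2019/04")

# In a notebook attached to a passthrough-enabled cluster, a read such as
#     df = spark.read.parquet(uri)
# would then need no storage credentials in the notebook itself.
```

Whether the read succeeds still depends on the permissions your identity has on the storage account.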

For more information, see Access Azure Data Lake Storage using Microsoft Entra ID credential passthrough (legacy).

Databricks Runtime 5.3 (GA)

April 3, 2019

Databricks Runtime 5.3 is now generally available. Databricks Runtime 5.3 includes new Delta Lake features and upgrades, and upgraded Python, R, Java, and Scala libraries.

Major upgrades include:

  • Databricks Delta time travel GA
  • MySQL table replication to Delta, Public Preview
  • Optimized DBFS FUSE folder for deep learning workloads
  • Notebook-scoped library improvements
  • New Databricks Advisor hints

For details, see Databricks Runtime 5.3 (EoS).

Databricks Runtime 5.3 ML (GA)

April 3, 2019

With Databricks Runtime 5.3 for Machine Learning, we have achieved our first GA of Databricks Runtime ML! Databricks Runtime ML provides a ready-to-go environment for machine learning and data science. It builds on Databricks Runtime and adds many popular machine learning libraries, including TensorFlow, PyTorch, Keras, and XGBoost. It also supports distributed training using Horovod.

This version is built on Databricks Runtime 5.3, with additional libraries, some different library versions, and Conda package management for Python libraries. Major new features since Databricks Runtime 5.2 ML Beta include:

  • MLlib integration with MLflow (Private Preview), which provides automatic logging of MLflow runs for models fit using the PySpark tuning algorithms CrossValidator and TrainValidationSplit.

    If you want to participate in the preview, contact your Databricks account team.

  • Upgrades to the PyArrow, Horovod, and TensorboardX libraries.

    The PyArrow update adds the ability to use BinaryType when you perform Arrow-based conversion and makes it available in pandas UDFs.

For more information, see Databricks Runtime 5.3 ML (EoS). For instructions on creating a Databricks Runtime ML cluster, see AI and machine learning on Databricks.