January 2019

These features and Azure Databricks platform improvements were released in January 2019.

Note

Releases are staged. Your Azure Databricks account may not be updated until up to a week after the initial release date.

Upcoming change: Python 3 to become the default when you create clusters

January 29, 2019

When Databricks platform version 2.91 releases in mid-February, the default Python version for new clusters will switch from Python 2 to Python 3. Existing clusters will not change their Python versions. If you have been relying on the Python 2 default when you create new clusters, you will need to start paying attention to your Python version selection.

Default Python version
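If you create clusters through the Clusters API rather than the UI, one way to stay independent of the default is to state the Python version in the request itself. The following is a minimal sketch, not a definitive recipe: it assumes the interpreter is selected through the `PYSPARK_PYTHON` environment variable with the paths shown, and the cluster name, runtime key, and node type are placeholder values.

```python
# Assumed interpreter paths on a Databricks cluster; verify them for your runtime.
PYTHON_PATHS = {
    2: "/databricks/python/bin/python",
    3: "/databricks/python3/bin/python3",
}


def cluster_spec(name, python_version):
    """Build a cluster-creation request body that pins the Python version."""
    if python_version not in PYTHON_PATHS:
        raise ValueError("python_version must be 2 or 3")
    return {
        "cluster_name": name,
        "spark_version": "5.2.x-scala2.11",  # placeholder runtime key
        "node_type_id": "Standard_DS3_v2",   # placeholder Azure VM size
        "num_workers": 2,
        # Pinning the interpreter here means a change of default cannot affect this cluster.
        "spark_env_vars": {"PYSPARK_PYTHON": PYTHON_PATHS[python_version]},
    }
```

Calling `cluster_spec("nightly-etl", 2)` produces a body that keeps a job on Python 2 after the default switches; the cluster name is hypothetical.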

Databricks Runtime 5.2 for Machine Learning (Beta) release

January 24, 2019

Databricks Runtime 5.2 ML is built on top of Databricks Runtime 5.2 (Unsupported). It contains many popular machine learning libraries, including TensorFlow, PyTorch, Keras, and XGBoost, and provides distributed TensorFlow training using Horovod. In addition to library updates since Databricks Runtime ML 5.1, Databricks Runtime 5.2 ML includes the following new features:

  • GraphFrames now supports the Pregel API (Python) with Databricks's performance optimizations.
  • HorovodRunner adds:
    • On a GPU cluster, training processes are mapped to GPUs instead of worker nodes to simplify the support of multi-GPU instance types. This built-in support allows you to distribute to all of the GPUs on a multi-GPU machine without custom code.
    • HorovodRunner.run() now returns the return value from the first training process.
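The new return-value behavior of HorovodRunner.run() can be sketched as follows. This is an illustration under assumptions rather than a reference implementation: `train` is a toy stand-in for a real Horovod training function, `run_distributed` is a hypothetical helper name, and the `sparkdl` import is assumed to succeed only on a Databricks Runtime ML cluster.

```python
def train(learning_rate=0.1, steps=3):
    """Toy stand-in for a training function; a real one would initialize Horovod."""
    loss = 1.0
    for _ in range(steps):
        loss *= (1.0 - learning_rate)  # toy update, not a real optimizer
    return {"final_loss": loss, "steps": steps}


def run_distributed(num_processes=2):
    """Launch train() with HorovodRunner and return what the first process returned."""
    # Imported lazily: sparkdl is assumed to be available only on Databricks Runtime ML.
    from sparkdl import HorovodRunner

    hr = HorovodRunner(np=num_processes)
    # With this release, run() hands back the return value of the first training process.
    return hr.run(train, learning_rate=0.1, steps=3)
```

On a GPU cluster, `num_processes` would correspond to the number of GPUs to train on, in line with the process-to-GPU mapping described above.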

See the complete release notes for Databricks Runtime 5.2 ML (Beta).

Databricks Runtime 5.2 release

January 24, 2019

Databricks Runtime 5.2 is now available. Databricks Runtime 5.2 includes Apache Spark 2.4.0, new Delta Lake and Structured Streaming features and upgrades, and upgraded Python, R, Java, and Scala libraries. For details, see Databricks Runtime 5.2 (Unsupported).

Cluster configuration JSON view

January 15-22, 2019

The cluster configuration page now supports a JSON view:

Cluster configuration JSON

The JSON view is read-only. However, you can copy the JSON and use it to create and update clusters with the Clusters API.
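As a rough sketch of that workflow, the function below turns JSON copied from the view into a request for the Clusters API create endpoint. The workspace URL and token are placeholders. Removing `cluster_id` before creating a new cluster is an assumption about what the copied JSON contains, and other copied fields may also need removing.

```python
import json
import urllib.request


def build_create_request(workspace_url, token, cluster_json):
    """Prepare (but do not send) a cluster-creation request from copied JSON."""
    spec = json.loads(cluster_json)
    # Assumption: the copied JSON may identify the existing cluster; drop that for a new one.
    spec.pop("cluster_id", None)
    return urllib.request.Request(
        url=workspace_url.rstrip("/") + "/api/2.0/clusters/create",
        data=json.dumps(spec).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request is then a matter of passing it to `urllib.request.urlopen`. Updating an existing cluster would instead keep `cluster_id` and target the edit endpoint.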

Cluster UI

January 15-22, 2019: Version 2.89

The cluster creation page has been cleaned up and reorganized for ease of use, including a new Advanced Options toggle.

Cluster configuration

Deploy Azure Databricks in your own Azure virtual network (VNet injection)

January 10, 2019

Important

This feature is in Public Preview.

The default deployment of Azure Databricks is a fully managed service on Azure: all data plane resources, including a virtual network (VNet) that all clusters will be associated with, are deployed to a locked resource group. If you require network customization, however, you can now deploy Azure Databricks in your own virtual network (sometimes called VNet injection).

Deploying Azure Databricks to your own virtual network also lets you take advantage of flexible CIDR ranges (anywhere between /16 and /24 for the virtual network and between /18 and /26 for the subnets).

Configuration using the Azure portal UI is quick and easy: when you create a workspace, just select Deploy Azure Databricks workspace in your Virtual Network, select your virtual network, and provide CIDR ranges for two subnets. Azure Databricks updates the virtual network with two new subnets and network security groups using CIDR ranges provided by you, whitelists inbound and outbound subnet traffic, and deploys the workspace to the updated virtual network.

VNet injection on workspace deployment
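Before entering CIDR ranges in the portal, it can help to check them against the limits stated above. The sketch below uses only the Python standard library. The function name is hypothetical, and the containment and overlap checks are general networking sanity checks added for illustration, not documented Azure Databricks validation rules.

```python
import ipaddress

VNET_PREFIXES = range(16, 25)    # /16 through /24
SUBNET_PREFIXES = range(18, 27)  # /18 through /26


def check_vnet_plan(vnet_cidr, subnet_cidrs):
    """Return a list of problems found in a proposed VNet and subnet layout."""
    problems = []
    vnet = ipaddress.ip_network(vnet_cidr)
    if vnet.prefixlen not in VNET_PREFIXES:
        problems.append(f"VNet {vnet} must be between /16 and /24")
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    for subnet in subnets:
        if subnet.prefixlen not in SUBNET_PREFIXES:
            problems.append(f"subnet {subnet} must be between /18 and /26")
        if not subnet.subnet_of(vnet):
            problems.append(f"subnet {subnet} is not inside VNet {vnet}")
    # Compare every pair once to catch overlapping address space.
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                problems.append(f"subnets {first} and {second} overlap")
    return problems
```

An empty list means the plan passes these checks; for example, a /16 virtual network with two adjacent /18 subnets produces no problems.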

If you prefer to configure the virtual network for VNet injection yourself (for example, you want to use existing subnets, use existing network security groups, or create your own security rules), you can use Azure-Databricks-supplied ARM templates instead of the portal UI.

Note

This feature was previously available by enrollment only. It remains in Preview but is now fully self-service.

For details, see Deploy Azure Databricks in your Azure virtual network (VNet injection) and Connect your Azure Databricks Workspace to your on-premises network.

Library UI

January 2-9, 2019: Version 2.88

The library UI improvements that were originally released in November 2018 and reverted shortly thereafter have been re-released. These updates make it easier to upload, install, and manage libraries for your Azure Databricks clusters.

The Azure Databricks UI now supports both workspace libraries and cluster-installed libraries. A workspace library exists in the Workspace and can be installed on one or more clusters. A cluster-installed library is a library that exists only in the context of the cluster that it is installed on. In addition:

  • You can now create a library from a file uploaded to object storage.
  • You can now install and uninstall libraries from the library details page and a cluster's Libraries tab.
  • Libraries installed using the API now display in a cluster's Libraries tab.
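For context on the last item, an API-driven install might look like the sketch below, which only builds the request body for the Libraries API install endpoint (`/api/2.0/libraries/install`). The function name, cluster ID, package name, and JAR path used with it are hypothetical.

```python
def library_install_body(cluster_id, pypi_packages=(), jar_paths=()):
    """Build a Libraries API install request body for PyPI packages and JAR files."""
    libraries = [{"pypi": {"package": package}} for package in pypi_packages]
    libraries += [{"jar": path} for path in jar_paths]
    if not libraries:
        raise ValueError("at least one library is required")
    return {"cluster_id": cluster_id, "libraries": libraries}
```

After such a request succeeds, the libraries it names should appear in the cluster's Libraries tab alongside those installed through the UI.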

For details, see Libraries.

Cluster Events

January 2-9, 2019: Version 2.88

New cluster events were added to reflect Spark driver status. For details, see ClusterEventType.
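To show how such events might be consumed, the sketch below builds a request body for the Clusters API events endpoint (`/api/2.0/clusters/events`) and picks the most recent driver-related event from a response. The three event type names are an assumption about which types this release added, so check them against ClusterEventType. Both function names are hypothetical.

```python
# Assumed driver-status event types; verify against the ClusterEventType reference.
DRIVER_EVENT_TYPES = ("DRIVER_HEALTHY", "DRIVER_NOT_RESPONDING", "DRIVER_UNAVAILABLE")


def driver_events_body(cluster_id, limit=50):
    """Build an events request body restricted to driver-status events, newest first."""
    return {
        "cluster_id": cluster_id,
        "event_types": list(DRIVER_EVENT_TYPES),
        "order": "DESC",
        "limit": limit,
    }


def latest_driver_status(events):
    """Return the type of the most recent driver-status event, or None if there is none."""
    driver_events = [e for e in events if e.get("type") in DRIVER_EVENT_TYPES]
    if not driver_events:
        return None
    return max(driver_events, key=lambda e: e["timestamp"])["type"]
```

The `events` argument is expected to be the list found under the `events` key of the API response, where each entry carries a `type` and a `timestamp`.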

Notebook Version Control using Azure DevOps Services

January 2-9, 2019: Version 2.88

Azure Databricks now makes it easy to use Azure DevOps Services (formerly VSTS) to version-control your notebooks. Authentication is automatic, setup is straightforward, and you manage your notebook revisions just as you do with our GitHub integration.

For details, see Azure DevOps Services version control.