November 2018

These features and Azure Databricks platform improvements were released in November 2018.

Note

Releases are staged. Your Azure Databricks account may not be updated for up to a week after the initial release date.

Library UI

Important

This update was reverted on December 7, 2018.

November 27-December 4, 2018: Version 2.85

In this release, the library UI has been significantly improved.

The Azure Databricks UI now supports workspace libraries and cluster-attached libraries. A workspace library exists in the Workspace and can be attached to one or more clusters. A cluster-attached library exists only in the context of the cluster that it is attached to. In addition:

  • You can now create a library from a file uploaded to object storage.
  • You can now attach and detach libraries from the library details page and a cluster's Libraries tab.
  • Libraries installed using the API now display in a cluster's Libraries tab.
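
As a rough illustration of the last point, the sketch below builds the JSON body for a library install request. It assumes the Databricks Libraries API install endpoint (POST /api/2.0/libraries/install), which is not described in these notes. The cluster ID, file path, and package name are hypothetical placeholders, and the helper name build_install_request is made up for this example. The code only constructs the payload and makes no network call.

```python
import json


def build_install_request(cluster_id, libraries):
    """Build the JSON body for a library install request (no network I/O)."""
    if not libraries:
        raise ValueError("at least one library is required")
    return {"cluster_id": cluster_id, "libraries": list(libraries)}


# Hypothetical cluster ID, DBFS path, and package name, for illustration only.
payload = build_install_request(
    "1234-567890-abcde123",
    [
        {"jar": "dbfs:/FileStore/jars/example.jar"},
        {"pypi": {"package": "simplejson"}},
    ],
)

print(json.dumps(payload, indent=2))
```

Sending a body of this shape to the install endpoint with an authenticated HTTP client should install the libraries on the cluster, after which they would be expected to appear in the cluster's Libraries tab.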

Custom Spark heap memory settings enabled

November 27-December 4, 2018: Version 2.85

The following Spark memory settings now take effect:

  • spark.executor.memory
  • spark.driver.memory

Important

  • Azure Databricks has services running on each node, so the maximum allowable memory for Spark is less than the memory capacity of the VM reported by the cloud provider. If you want to give Spark the maximum amount of heap memory for the executor or driver, don't specify spark.executor.memory or spark.driver.memory, respectively.
  • Some cluster configurations that were previously invalid but ignored may now result in cluster failures.
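
The following is a minimal sketch of how a cluster specification might carry these settings. It assumes the spark_conf field of a Clusters API cluster specification. The helper name build_spark_conf, the cluster name, and the "4g" value are hypothetical. A setting is left out entirely when no value is given, which matches the guidance above for letting Spark receive the maximum heap.

```python
def build_spark_conf(executor_memory=None, driver_memory=None):
    """Return a Spark conf dict, omitting any memory setting that was not given.

    An omitted key leaves the corresponding heap size for the platform to choose.
    """
    conf = {}
    if executor_memory is not None:
        conf["spark.executor.memory"] = executor_memory
    if driver_memory is not None:
        conf["spark.driver.memory"] = driver_memory
    return conf


# Hypothetical cluster spec: cap executor heap at 4g, leave driver heap unset.
cluster_spec = {
    "cluster_name": "example-cluster",
    "spark_conf": build_spark_conf(executor_memory="4g"),
}

print(cluster_spec)
```

Because previously ignored invalid values can now cause cluster failures, it is worth checking that any explicit value fits within the memory actually available to Spark on the chosen node type.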

Jobs and idle execution context eviction

November 27-December 4, 2018: Version 2.85

Jobs now auto-evict idle execution contexts. See Execution contexts. To minimize auto-eviction, Azure Databricks recommends that you use different clusters for jobs and interactive workloads.

Databricks Runtime 5.0 for Machine Learning (Beta) release

November 19, 2018

Databricks Runtime 5.0 ML (Beta) provides a ready-to-go environment for machine learning and data science. It contains multiple popular libraries, including TensorFlow, Keras, and XGBoost. It also supports distributed TensorFlow training using Horovod. Databricks Runtime 5.0 ML is built on top of Databricks Runtime 5.0. Databricks Runtime 5.0 ML includes the following new features:

See the complete release notes for Databricks Runtime 5.0 ML (Beta).

Databricks Runtime 5.0 release

November 8, 2018

Databricks Runtime 5.0 is now available. Databricks Runtime 5.0 includes Apache Spark 2.4.0, new Delta Lake and Structured Streaming features and upgrades, and upgraded Python, R, Java, and Scala libraries. For details, see Databricks Runtime 5.0 (Unsupported).

On Databricks Runtime 5.0, Azure Databricks now evicts idle execution contexts once a cluster has reached the maximum context limit (145). See Execution contexts.

displayHTML support for unrestricted loading of third-party content

November 6-13, 2018: Version 2.84

Previously, the displayHTML iframe sandbox was missing the allow-same-origin attribute. As a result, the iframe had a null origin, which interfered with cross-origin XHR requests, cookies, and access to embedded iframes. With this release, the displayHTML iframe is served from a new domain, databricksusercontent.com, and the iframe sandbox now includes the allow-same-origin attribute.

There is no need to change your usage of displayHTML if it's already working for you.

databricksusercontent.com must be accessible from your browser. If your corporate network currently blocks it, your IT department needs to add it to the allow list.
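
As a small illustration of embedding third-party content, the sketch below builds an iframe snippet and passes it to displayHTML when that function is available. The helper name build_embed_html and the example.com URL are hypothetical. displayHTML is assumed to be defined only inside a Databricks notebook, so the code falls back to printing the markup elsewhere.

```python
from html import escape


def build_embed_html(src, width=600, height=400):
    """Return iframe markup for a third-party page, escaping the URL for HTML."""
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="{int(width)}" height="{int(height)}"></iframe>'
    )


# Hypothetical third-party URL, for illustration only.
html = build_embed_html("https://example.com/widget?id=1&mode=view")

# Use displayHTML when running in a notebook; otherwise just print the markup.
render = globals().get("displayHTML", print)
render(html)
```

Whether the embedded page loads still depends on the third-party site's own framing and cookie policies, and on databricksusercontent.com being reachable from the browser.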