Databricks Runtime 7.0 MLDatabricks Runtime 7.0 ML

Databricks 于 2020 年 6 月发布了此映像。Databricks released this image in June 2020.

Databricks Runtime 7.0 ML 基于 Databricks Runtime 7.0,为机器学习和数据科学提供了随时可用的环境。Databricks Runtime 7.0 ML provides a ready-to-go environment for machine learning and data science based on Databricks Runtime 7.0. Databricks Runtime ML 包含许多常用的机器学习库,包括 TensorFlow、PyTorch 和 XGBoost。Databricks Runtime ML contains many popular machine learning libraries, including TensorFlow, PyTorch, and XGBoost. 它还支持使用 Horovod 进行分布式深度学习训练。It also supports distributed deep learning training using Horovod.

有关详细信息,包括有关创建 Databricks Runtime ML 群集的说明,请参阅用于机器学习的 Databricks RuntimeFor more information, including instructions for creating a Databricks Runtime ML cluster, see Databricks Runtime for Machine Learning.

新增功能和主要更改New features and major changes

Databricks Runtime 7.0 ML 是基于 Databricks Runtime 7.0 构建的。Databricks Runtime 7.0 ML is built on top of Databricks Runtime 7.0. 若要了解 Databricks Runtime 7.0 中的新增功能,包括 Apache Spark MLlib 和 SparkR,请参阅 Databricks Runtime 7.0 发行说明。For information on what’s new in Databricks Runtime 7.0, including Apache Spark MLlib and SparkR, see the Databricks Runtime 7.0 release notes.

GPU 感知型计划GPU-aware scheduling

Databricks Runtime 7.0 ML 支持 Apache Spark 3.0 中提供的 GPU 感知型计划。Databricks Runtime 7.0 ML supports GPU-aware scheduling from Apache Spark 3.0. Azure Databricks 会自动为你配置它。Azure Databricks automatically configures it for you. 请参阅 GPU 计划See GPU scheduling.

ML Python 环境的重大更改Major changes to ML Python environment

本部分介绍预安装的 ML Python 环境与 Databricks Runtime 6.6 ML 相比有哪些重大更改。This section describes the major changes to the pre-installed ML Python environment compared to Databricks Runtime 6.6 ML. 你还应该查看 Databricks Runtime 7.0 中基础 Python 环境的重大更改。You should also review the major changes to the base Python environment in Databricks Runtime 7.0. 如需查看已安装的 Python 包及其版本的完整列表,请参阅 Python 库For a full list of installed Python packages and their versions, see Python libraries.

升级的 Python 包Python packages upgraded

  • tensorflow 1.15.0 -> 2.2.0tensorflow 1.15.0 -> 2.2.0
  • tensorboard 1.15.0 -> 2.2.2tensorboard 1.15.0 -> 2.2.2
  • pytorch 1.4.0 -> 1.5.0pytorch 1.4.0 -> 1.5.0
  • xgboost 0.90 -> 1.1.1xgboost 0.90 -> 1.1.1
  • sparkdl 1.6.0-db1 -> 2.1.0-db1sparkdl 1.6.0-db1 -> 2.1.0-db1
  • hyperopt 0.2.2.db1 -> 0.2.4.db1hyperopt 0.2.2.db1 -> 0.2.4.db1

添加的 Python 包Python packages added

  • lightgbm:2.3.0lightgbm: 2.3.0
  • nltk:3.4.5nltk: 3.4.5
  • petastorm:0.9.2petastorm: 0.9.2
  • plotly:4.5.2plotly: 4.5.2

删除的 Python 包Python packages removed

  • argparseargparse
  • boto(改用 boto3boto (use boto3 instead)
  • coloramacolorama
  • 已弃用deprecated
  • et-xmlfileet-xmlfile
  • fusepyfusepy
  • html5libhtml5lib
  • jdcaljdcal
  • keras(改用 tensorflow.keraskeras (use tensorflow.keras instead)
  • keras-applications(改用 tensorflow.keras.applicationskeras-applications (use tensorflow.keras.applications instead)
  • llvmlitellvmlite
  • lxmllxml
  • nosenose
  • nose-excludenose-exclude
  • numbanumba
  • openpyxlopenpyxl
  • pathlib2pathlib2
  • plyply
  • pymongopymongo
  • singledispatchsingledispatch
  • tensorboardX(改用 torch.utils.tensorboardtensorboardX (use torch.utils.tensorboard instead)
  • virtualenvvirtualenv
  • webencodingswebencodings

ML R 环境的重大更改Major changes to ML R environment

Databricks Runtime 7.0 ML 包括 RStudio Server 开源版 v1.2.5033 的未修改版本,可以在 GitHub 中找到该版本的源代码。Databricks Runtime 7.0 ML includes an unmodified version of RStudio Server Open Source v1.2.5033 for which the source code can be found in GitHub. 请在 Azure Databricks 上了解有关 RStudio Server 的详细信息。Read more about RStudio Server on Azure Databricks.

对 ML Spark 包、Java 和 Scala 库的更改Changes to ML Spark packages, Java and Scala libraries

以下包已升级。The following packages are upgraded. 有些包已升级到与 Apache Spark 3.0 兼容的 SNAPSHOT 版本:Some are upgraded to SNAPSHOT releases that are compatible with Apache Spark 3.0:

  • graphframes:0.7.0-db1-spark2.4 -> 0.8.0-db2-spark3.0graphframes: 0.7.0-db1-spark2.4 -> 0.8.0-db2-spark3.0
  • spark-tensorflow-connector:1.15.0 (Scala 2.11) -> 1.15.0 (Scala 2.12)spark-tensorflow-connector: 1.15.0 (Scala 2.11) -> 1.15.0 (Scala 2.12)
  • xgboost4j and xgboost4j-spark:0.90 -> 1.0.0xgboost4j and xgboost4j-spark: 0.90 -> 1.0.0
  • mleap-databricks-runtime:0.17.0-4882dc3 (SNAPSHOT)mleap-databricks-runtime: 0.17.0-4882dc3 (SNAPSHOT)

以下包已移除:The following packages are removed:

  • TensorFlow (Java)TensorFlow (Java)
  • TensorFramesTensorFrames
  • 适用于 Apache Spark 的深度学习管道(HorovodRunner 提供 Python 版)Deep Learning Pipelines for Apache Spark (HorovodRunner is available in Python)

添加了 conda 和 pip 命令,可支持范笔记本范围的 Python 库(公共预览版)Added conda and pip commands to support notebook-scoped Python libraries (public preview)

从 Databricks Runtime 7.0 ML 开始,可使用 %pip%conda 命令来管理笔记本会话中安装的 Python 库。Starting with Databricks Runtime 7.0 ML, you can use %pip and %conda commands to manage Python libraries installed in a notebook session. 还可使用这些命令为笔记本创建自定义环境并在笔记本之间重现此环境。You can also use these commands to create a custom environment for a notebook and to reproduce this environment between notebooks. 若要启用此功能,请在群集设置中设置 Spark 配置 spark.databricks.conda.condaMagic.enabled trueTo enable this feature, in cluster settings, set the Spark configuration spark.databricks.conda.condaMagic.enabled true. 有关详细信息,请参阅笔记本范围的 Python 库For more information, see Notebook-scoped Python libraries.

弃用功能和不支持的功能Deprecations and unsupported features

Databricks Runtime 7.0 ML 不支持表访问控制Databricks Runtime 7.0 ML does not support table access control. 如果需要表访问控制,我们建议你使用 Databricks Runtime 7.0。If you need table access control, we recommend that you use Databricks Runtime 7.0.

已知问题Known issues

  • 由于一个 mleap API 更改,为了以 mleap 格式记录 MLlib 模型而将 sample_input 参数传递给mlflow.spark.log_model 的操作失败,并出现 AttributeError。Passing the sample_input argument to mlflow.spark.log_model in order to log an MLlib model in mleap format fails with an AttributeError due to an mleap API change. 一种解决方法是升级到 MLflow 1.9.0。Upgrade to MLflow 1.9.0 as a workaround. 你可以使用笔记本范围的 Python 库工作区库来安装 MLflow 1.9.0You can install MLflow 1.9.0 using Notebook-scoped Python libraries or Workspace Libraries

系统环境System environment

Databricks Runtime 7.0 ML 中的系统环境在以下方面不同于 Databricks Runtime 7.0:The system environment in Databricks Runtime 7.0 ML differs from Databricks Runtime 7.0 as follows:

  • DBUtils:Databricks Runtime ML 未包含库实用工具DBUtils: Databricks Runtime ML does not contain Library utilities. 可改用 %pip%conda 命令。You can use %pip and %conda commands instead. 请参阅作用域为笔记本的 Python 库See Notebook-scoped Python libraries.
  • 对于 GPU 群集,以下 NVIDIA GPU 库:For GPU clusters, the following NVIDIA GPU libraries:
    • CUDA 10.1 Update 2CUDA 10.1 Update 2
    • cuDNN 7.6.5cuDNN 7.6.5
    • NCCL 2.7.3NCCL 2.7.3
    • TensorRT 6.0.1TensorRT 6.0.1

Libraries

以下部分列出了 Databricks Runtime 7.0 ML 中包含的库,这些库不同于 Databricks Runtime 7.0 中包含的库。The following sections list the libraries included in Databricks Runtime 7.0 ML that differ from those included in Databricks Runtime 7.0.

本节内容:In this section:

顶层库Top-tier libraries

Databricks Runtime 7.0 ML 包含以下顶层Databricks Runtime 7.0 ML includes the following top-tier libraries:

Python 库Python libraries

Databricks Runtime 7.0 ML 使用 Conda 进行 Python 包管理,并且包含许多常用的 ML 包。Databricks Runtime 7.0 ML uses Conda for Python package management and includes many popular ML packages. 下一部分介绍用于 Databricks Runtime 7.0 ML 的 Conda 环境。The following section describes the Conda environment for Databricks Runtime 7.0 ML.

CPU 群集上的 PythonPython on CPU clusters

name: databricks-ml
channels:
  - pytorch
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - absl-py=0.9.0=py37_0
  - asn1crypto=1.3.0=py37_0
  - astor=0.8.0=py37_0
  - backcall=0.1.0=py37_0
  - backports=1.0=py_2
  - bcrypt=3.1.7=py37h7b6447c_1
  - blas=1.0=mkl
  - blinker=1.4=py37_0
  - boto3=1.12.0=py_0
  - botocore=1.15.0=py_0
  - c-ares=1.15.0=h7b6447c_1001
  - ca-certificates=2020.1.1=0
  - cachetools=4.1.0=py_1
  - certifi=2020.4.5.1=py37_0
  - cffi=1.14.0=py37h2e261b9_0
  - chardet=3.0.4=py37_1003
  - click=7.0=py37_0
  - cloudpickle=1.3.0=py_0
  - configparser=3.7.4=py37_0
  - cpuonly=1.0=0
  - cryptography=2.8=py37h1ba5d50_0
  - cycler=0.10.0=py37_0
  - cython=0.29.15=py37he6710b0_0
  - decorator=4.4.1=py_0
  - dill=0.3.1.1=py37_1
  - docutils=0.15.2=py37_0
  - entrypoints=0.3=py37_0
  - flask=1.1.1=py_1
  - freetype=2.9.1=h8a8886c_1
  - future=0.18.2=py37_1
  - gast=0.3.3=py_0
  - gitdb2=2.0.6=py_0
  - gitpython=3.0.5=py_0
  - google-auth=1.11.2=py_0
  - google-auth-oauthlib=0.4.1=py_2
  - google-pasta=0.2.0=py_0
  - grpcio=1.27.2=py37hf8bcb03_0
  - gunicorn=20.0.4=py37_0
  - h5py=2.10.0=py37h7918eee_0
  - hdf5=1.10.4=hb1b8bf9_0
  - icu=58.2=he6710b0_3
  - idna=2.8=py37_0
  - intel-openmp=2020.0=166
  - ipykernel=5.1.4=py37h39e3cac_0
  - ipython=7.12.0=py37h5ca1d4c_0
  - ipython_genutils=0.2.0=py37_0
  - itsdangerous=1.1.0=py37_0
  - jedi=0.14.1=py37_0
  - jinja2=2.11.1=py_0
  - jmespath=0.9.4=py_0
  - joblib=0.14.1=py_0
  - jpeg=9b=h024ee3a_2
  - jupyter_client=5.3.4=py37_0
  - jupyter_core=4.6.1=py37_0
  - kiwisolver=1.1.0=py37he6710b0_0
  - krb5=1.16.4=h173b8e3_0
  - ld_impl_linux-64=2.33.1=h53a641e_7
  - libedit=3.1.20181209=hc058e9b_0
  - libffi=3.2.1=hd88cf55_4
  - libgcc-ng=9.1.0=hdf63c60_0
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libpng=1.6.37=hbc83047_0
  - libpq=11.2=h20c2e04_0
  - libprotobuf=3.11.4=hd408876_0
  - libsodium=1.0.16=h1bed415_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - libtiff=4.1.0=h2733197_0
  - lightgbm=2.3.0=py37he6710b0_0
  - lz4-c=1.8.1.2=h14c3975_0
  - mako=1.1.2=py_0
  - markdown=3.1.1=py37_0
  - markupsafe=1.1.1=py37h7b6447c_0
  - matplotlib-base=3.1.3=py37hef1b27d_0
  - mkl=2020.0=166
  - mkl-service=2.3.0=py37he904b0f_0
  - mkl_fft=1.0.15=py37ha843d7b_0
  - mkl_random=1.1.0=py37hd6b4f25_0
  - ncurses=6.2=he6710b0_1
  - networkx=2.4=py_0
  - ninja=1.9.0=py37hfd86e86_0
  - nltk=3.4.5=py37_0
  - numpy=1.18.1=py37h4f9e942_0
  - numpy-base=1.18.1=py37hde5b4d6_1
  - oauthlib=3.1.0=py_0
  - olefile=0.46=py37_0
  - openssl=1.1.1g=h7b6447c_0
  - packaging=20.1=py_0
  - pandas=1.0.1=py37h0573a6f_0
  - paramiko=2.7.1=py_0
  - parso=0.5.2=py_0
  - patsy=0.5.1=py37_0
  - pexpect=4.8.0=py37_0
  - pickleshare=0.7.5=py37_0
  - pillow=7.0.0=py37hb39fc2d_0
  - pip=20.0.2=py37_3
  - plotly=4.5.2=py_0
  - prompt_toolkit=3.0.3=py_0
  - protobuf=3.11.4=py37he6710b0_0
  - psutil=5.6.7=py37h7b6447c_0
  - psycopg2=2.8.4=py37h1ba5d50_0
  - ptyprocess=0.6.0=py37_0
  - pyasn1=0.4.8=py_0
  - pyasn1-modules=0.2.7=py_0
  - pycparser=2.19=py37_0
  - pygments=2.5.2=py_0
  - pyjwt=1.7.1=py37_0
  - pynacl=1.3.0=py37h7b6447c_0
  - pyodbc=4.0.30=py37he6710b0_0
  - pyopenssl=19.1.0=py37_0
  - pyparsing=2.4.6=py_0
  - pysocks=1.7.1=py37_0
  - python=3.7.6=h0371630_2
  - python-dateutil=2.8.1=py_0
  - python-editor=1.0.4=py_0
  - pytorch=1.5.0=py3.7_cpu_0
  - pytz=2019.3=py_0
  - pyzmq=18.1.1=py37he6710b0_0
  - readline=7.0=h7b6447c_5
  - requests=2.22.0=py37_1
  - requests-oauthlib=1.3.0=py_0
  - retrying=1.3.3=py37_2
  - rsa=4.0=py_0
  - s3transfer=0.3.3=py37_0
  - scikit-learn=0.22.1=py37hd81dba3_0
  - scipy=1.4.1=py37h0b6359f_0
  - setuptools=45.2.0=py37_0
  - simplejson=3.17.0=py37h7b6447c_0
  - six=1.14.0=py37_0
  - smmap2=2.0.5=py37_0
  - sqlite=3.31.1=h62c20be_1
  - sqlparse=0.3.0=py_0
  - statsmodels=0.11.0=py37h7b6447c_0
  - tabulate=0.8.3=py37_0
  - tk=8.6.8=hbc83047_0
  - torchvision=0.6.0=py37_cpu
  - tornado=6.0.3=py37h7b6447c_3
  - tqdm=4.42.1=py_0
  - traitlets=4.3.3=py37_0
  - unixodbc=2.3.7=h14c3975_0
  - urllib3=1.25.8=py37_0
  - wcwidth=0.1.8=py_0
  - websocket-client=0.56.0=py37_0
  - werkzeug=1.0.0=py_0
  - wheel=0.34.2=py37_0
  - wrapt=1.11.2=py37h7b6447c_0
  - xz=5.2.4=h14c3975_4
  - zeromq=4.3.1=he6710b0_3
  - zlib=1.2.11=h7b6447c_3
  - zstd=1.3.7=h0b5b093_0
  - pip:
    - astunparse==1.6.3
    - databricks-cli==0.11.0
    - diskcache==4.1.0
    - docker==4.2.1
    - gorilla==0.3.0
    - horovod==0.19.1
    - hyperopt==0.2.4.db1
    - keras-preprocessing==1.1.2
    - mleap==0.16.0
    - mlflow==1.8.0
    - opt-einsum==3.2.1
    - petastorm==0.9.2
    - pyarrow==0.15.1
    - pyyaml==5.3.1
    - querystring-parser==1.2.4
    - seaborn==0.10.0
    - sparkdl==2.1.0-db1
    - tensorboard==2.2.2
    - tensorboard-plugin-wit==1.6.0.post3
    - tensorflow-cpu==2.2.0
    - tensorflow-estimator==2.2.0
    - termcolor==1.1.0
    - xgboost==1.1.1
prefix: /databricks/conda/envs/databricks-ml

GPU 群集上的 PythonPython on GPU clusters

name: databricks-ml-gpu
channels:
  - pytorch
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - absl-py=0.9.0=py37_0
  - asn1crypto=1.3.0=py37_0
  - astor=0.8.0=py37_0
  - backcall=0.1.0=py37_0
  - backports=1.0=py_2
  - bcrypt=3.1.7=py37h7b6447c_1
  - blas=1.0=mkl
  - blinker=1.4=py37_0
  - boto3=1.12.0=py_0
  - botocore=1.15.0=py_0
  - c-ares=1.15.0=h7b6447c_1001
  - ca-certificates=2020.1.1=0
  - cachetools=4.1.0=py_1
  - certifi=2020.4.5.2=py37_0
  - cffi=1.14.0=py37h2e261b9_0
  - chardet=3.0.4=py37_1003
  - click=7.0=py37_0
  - cloudpickle=1.3.0=py_0
  - configparser=3.7.4=py37_0
  - cryptography=2.8=py37h1ba5d50_0
  - cudatoolkit=10.1.243=h6bb024c_0
  - cycler=0.10.0=py37_0
  - cython=0.29.15=py37he6710b0_0
  - decorator=4.4.1=py_0
  - dill=0.3.1.1=py37_1
  - docutils=0.15.2=py37_0
  - entrypoints=0.3=py37_0
  - flask=1.1.1=py_1
  - freetype=2.9.1=h8a8886c_1
  - future=0.18.2=py37_1
  - gast=0.3.3=py_0
  - gitdb2=2.0.6=py_0
  - gitpython=3.0.5=py_0
  - google-auth=1.11.2=py_0
  - google-auth-oauthlib=0.4.1=py_2
  - google-pasta=0.2.0=py_0
  - grpcio=1.27.2=py37hf8bcb03_0
  - gunicorn=20.0.4=py37_0
  - h5py=2.10.0=py37h7918eee_0
  - hdf5=1.10.4=hb1b8bf9_0
  - icu=58.2=he6710b0_3
  - idna=2.8=py37_0
  - intel-openmp=2020.0=166
  - ipykernel=5.1.4=py37h39e3cac_0
  - ipython=7.12.0=py37h5ca1d4c_0
  - ipython_genutils=0.2.0=py37_0
  - itsdangerous=1.1.0=py37_0
  - jedi=0.14.1=py37_0
  - jinja2=2.11.1=py_0
  - jmespath=0.9.4=py_0
  - joblib=0.14.1=py_0
  - jpeg=9b=h024ee3a_2
  - jupyter_client=5.3.4=py37_0
  - jupyter_core=4.6.1=py37_0
  - kiwisolver=1.1.0=py37he6710b0_0
  - krb5=1.16.4=h173b8e3_0
  - ld_impl_linux-64=2.33.1=h53a641e_7
  - libedit=3.1.20181209=hc058e9b_0
  - libffi=3.2.1=hd88cf55_4
  - libgcc-ng=9.1.0=hdf63c60_0
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libpng=1.6.37=hbc83047_0
  - libpq=11.2=h20c2e04_0
  - libprotobuf=3.11.4=hd408876_0
  - libsodium=1.0.16=h1bed415_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - libtiff=4.1.0=h2733197_0
  - lightgbm=2.3.0=py37he6710b0_0
  - lz4-c=1.8.1.2=h14c3975_0
  - mako=1.1.2=py_0
  - markdown=3.1.1=py37_0
  - markupsafe=1.1.1=py37h7b6447c_0
  - matplotlib-base=3.1.3=py37hef1b27d_0
  - mkl=2020.0=166
  - mkl-service=2.3.0=py37he904b0f_0
  - mkl_fft=1.0.15=py37ha843d7b_0
  - mkl_random=1.1.0=py37hd6b4f25_0
  - ncurses=6.2=he6710b0_1
  - networkx=2.4=py_0
  - ninja=1.9.0=py37hfd86e86_0
  - nltk=3.4.5=py37_0
  - numpy=1.18.1=py37h4f9e942_0
  - numpy-base=1.18.1=py37hde5b4d6_1
  - oauthlib=3.1.0=py_0
  - olefile=0.46=py37_0
  - openssl=1.1.1g=h7b6447c_0
  - packaging=20.1=py_0
  - pandas=1.0.1=py37h0573a6f_0
  - paramiko=2.7.1=py_0
  - parso=0.5.2=py_0
  - patsy=0.5.1=py37_0
  - pexpect=4.8.0=py37_0
  - pickleshare=0.7.5=py37_0
  - pillow=7.0.0=py37hb39fc2d_0
  - pip=20.0.2=py37_3
  - plotly=4.5.2=py_0
  - prompt_toolkit=3.0.3=py_0
  - protobuf=3.11.4=py37he6710b0_0
  - psutil=5.6.7=py37h7b6447c_0
  - psycopg2=2.8.4=py37h1ba5d50_0
  - ptyprocess=0.6.0=py37_0
  - pyasn1=0.4.8=py_0
  - pyasn1-modules=0.2.7=py_0
  - pycparser=2.19=py37_0
  - pygments=2.5.2=py_0
  - pyjwt=1.7.1=py37_0
  - pynacl=1.3.0=py37h7b6447c_0
  - pyodbc=4.0.30=py37he6710b0_0
  - pyopenssl=19.1.0=py37_0
  - pyparsing=2.4.6=py_0
  - pysocks=1.7.1=py37_0
  - python=3.7.6=h0371630_2
  - python-dateutil=2.8.1=py_0
  - python-editor=1.0.4=py_0
  - pytorch=1.5.0=py3.7_cuda10.1.243_cudnn7.6.3_0
  - pytz=2019.3=py_0
  - pyzmq=18.1.1=py37he6710b0_0
  - readline=7.0=h7b6447c_5
  - requests=2.22.0=py37_1
  - requests-oauthlib=1.3.0=py_0
  - retrying=1.3.3=py37_2
  - rsa=4.0=py_0
  - s3transfer=0.3.3=py37_0
  - scikit-learn=0.22.1=py37hd81dba3_0
  - scipy=1.4.1=py37h0b6359f_0
  - setuptools=45.2.0=py37_0
  - simplejson=3.17.0=py37h7b6447c_0
  - six=1.14.0=py37_0
  - smmap2=2.0.5=py37_0
  - sqlite=3.31.1=h62c20be_1
  - sqlparse=0.3.0=py_0
  - statsmodels=0.11.0=py37h7b6447c_0
  - tabulate=0.8.3=py37_0
  - tk=8.6.8=hbc83047_0
  - torchvision=0.6.0=py37_cu101
  - tornado=6.0.3=py37h7b6447c_3
  - tqdm=4.42.1=py_0
  - traitlets=4.3.3=py37_0
  - unixodbc=2.3.7=h14c3975_0
  - urllib3=1.25.8=py37_0
  - wcwidth=0.1.8=py_0
  - websocket-client=0.56.0=py37_0
  - werkzeug=1.0.0=py_0
  - wheel=0.34.2=py37_0
  - wrapt=1.11.2=py37h7b6447c_0
  - xz=5.2.4=h14c3975_4
  - zeromq=4.3.1=he6710b0_3
  - zlib=1.2.11=h7b6447c_3
  - zstd=1.3.7=h0b5b093_0
  - pip:
    - astunparse==1.6.3
    - databricks-cli==0.11.0
    - diskcache==4.1.0
    - docker==4.2.1
    - gorilla==0.3.0
    - horovod==0.19.1
    - hyperopt==0.2.4.db1
    - keras-preprocessing==1.1.2
    - mleap==0.16.0
    - mlflow==1.8.0
    - opt-einsum==3.2.1
    - petastorm==0.9.2
    - pyarrow==0.15.1
    - pyyaml==5.3.1
    - querystring-parser==1.2.4
    - seaborn==0.10.0
    - sparkdl==2.1.0-db1
    - tensorboard==2.2.2
    - tensorboard-plugin-wit==1.6.0.post3
    - tensorflow-estimator==2.2.0
    - tensorflow-gpu==2.2.0
    - termcolor==1.1.0
    - xgboost==1.1.1
prefix: /databricks/conda/envs/databricks-ml-gpu

包含 Python 模块的 Spark 包Spark packages containing Python modules

Spark 包Spark Package Python 模块Python Module 版本Version
graphframesgraphframes graphframesgraphframes 0.8.0-db2-spark3.00.8.0-db2-spark3.0

R 库R libraries

R 库与 Databricks Runtime 7.0 beta 版本中的 R 库完全相同。The R libraries are identical to the R Libraries in Databricks Runtime 7.0 Beta.

Java 库和 Scala 库(Scala 2.12 群集)Java and Scala libraries (Scala 2.12 cluster)

除了 Databricks Runtime 7.0 中的 Java 库和 Scala 库之外,Databricks Runtime 7.0 ML 还包含以下 JAR:In addition to Java and Scala libraries in Databricks Runtime 7.0, Databricks Runtime 7.0 ML contains the following JARs:

组 IDGroup ID 项目 IDArtifact ID 版本Version
com.typesafe.akkacom.typesafe.akka akka-actor_2.12akka-actor_2.12 2.5.232.5.23
ml.combust.mleapml.combust.mleap mleap-databricks-runtime_2.12mleap-databricks-runtime_2.12 0.17.0-4882dc30.17.0-4882dc3
ml.dmlcml.dmlc xgboost4j-spark_2.12xgboost4j-spark_2.12 1.0.01.0.0
ml.dmlcml.dmlc xgboost4j_2.12xgboost4j_2.12 1.0.01.0.0
org.mlfloworg.mlflow mlflow-clientmlflow-client 1.8.01.8.0
org.scala-lang.modulesorg.scala-lang.modules scala-java8-compat_2.12scala-java8-compat_2.12 0.8.00.8.0
org.tensorfloworg.tensorflow spark-tensorflow-connector_2.12spark-tensorflow-connector_2.12 1.15.01.15.0