Databricks Runtime 7.1 for ML (EoS)
Note
Support for this Databricks Runtime version has ended. For the end-of-support date, see End-of-support history. For all supported Databricks Runtime versions, see Databricks Runtime release notes versions and compatibility.
Databricks released this version in July 2020.
Databricks Runtime 7.1 for Machine Learning provides a ready-to-go environment for machine learning and data science based on Databricks Runtime 7.1 (EoS). Databricks Runtime ML contains many popular machine learning libraries, including TensorFlow, PyTorch, and XGBoost. It also supports distributed deep learning training using Horovod.
For more information, including instructions for creating a Databricks Runtime ML cluster, see AI and machine learning on Databricks.
New features and major changes
Databricks Runtime 7.1 ML is built on top of Databricks Runtime 7.1. For information on what's new in Databricks Runtime 7.1, including Apache Spark MLlib and SparkR, see the Databricks Runtime 7.1 (EoS) release notes.
Major changes to Databricks Runtime ML Python environment
This section describes the major changes to the installed Databricks Runtime ML Python environment compared to Databricks Runtime 7.0 ML (EoS). You should also review the major changes to the Databricks Runtime Python environment in Databricks Runtime 7.1 (EoS). For a full list of installed Python packages and their versions, see Python libraries.
pip and conda magic commands now enabled by default (Public Preview)
In Databricks Runtime ML 6.4 and above, %pip
and %conda
magic commands were available via a cluster configuation setting. These commands are now enabled by default.
For more information, see Notebook-scoped Python libraries.
Python packages upgraded
- pillow 7.0.0 -> 7.1.0
- pytorch 1.5.0 -> 1.5.1
- torchvision 0.6.0 -> 0.6.1
- horovod 0.19.1 -> 0.19.5
- mlflow 1.8.0 -> 1.9.1
Python packages added
- spark-tensorflow-distributor: 0.1.0
Changes to ML Spark packages, Java and Scala libraries
The following packages are upgraded:
- mlflow-client: 1.9.1
System environment
The system environment in Databricks Runtime 7.1 ML differs from Databricks Runtime 7.1 as follows:
- DBUtils: Databricks Runtime ML does not contain Library utility (dbutils.library) (legacy).
You can use
%pip
and%conda
commands instead. See Notebook-scoped Python libraries. - For GPU clusters, the following NVIDIA GPU libraries:
- CUDA 10.1 Update 2
- cuDNN 7.6.5
- NCCL 2.7.3
- TensorRT 6.0.1
Libraries
The following sections list the libraries included in Databricks Runtime 7.1 ML that differ from those included in Databricks Runtime 7.1.
In this section:
Top-tier libraries
Databricks Runtime 7.1 ML includes the following top-tier libraries:
- GraphFrames
- Horovod and HorovodRunner
- MLflow
- PyTorch
- spark-tensorflow-connector
- TensorFlow
- TensorBoard
Python libraries
Databricks Runtime 7.1 ML uses Conda for Python package management and includes many popular ML packages. The following section describes the Conda environment for Databricks Runtime 7.1 ML.
Python on CPU clusters
name: databricks-ml
channels:
- pytorch
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- absl-py=0.9.0=py37_0
- asn1crypto=1.3.0=py37_0
- astor=0.8.0=py37_0
- backcall=0.1.0=py37_0
- backports=1.0=py_2
- bcrypt=3.1.7=py37h7b6447c_1
- blas=1.0=mkl
- blinker=1.4=py37_0
- boto3=1.12.0=py_0
- botocore=1.15.0=py_0
- c-ares=1.15.0=h7b6447c_1001
- ca-certificates=2020.6.24=0
- cachetools=4.1.0=py_1
- certifi=2020.6.20=py37_0
- cffi=1.14.0=py37h2e261b9_0
- chardet=3.0.4=py37_1003
- click=7.0=py37_0
- cloudpickle=1.3.0=py_0
- configparser=3.7.4=py37_0
- cpuonly=1.0=0
- cryptography=2.8=py37h1ba5d50_0
- cycler=0.10.0=py37_0
- cython=0.29.15=py37he6710b0_0
- decorator=4.4.1=py_0
- dill=0.3.1.1=py37_1
- docutils=0.15.2=py37_0
- entrypoints=0.3=py37_0
- flask=1.1.1=py_1
- freetype=2.9.1=h8a8886c_1
- future=0.18.2=py37_1
- gast=0.3.3=py_0
- gitdb=4.0.5=py_0
- gitdb2=4.0.2=py_0
- gitpython=3.0.5=py_0
- google-auth=1.11.2=py_0
- google-auth-oauthlib=0.4.1=py_2
- google-pasta=0.2.0=py_0
- grpcio=1.27.2=py37hf8bcb03_0
- gunicorn=20.0.4=py37_0
- h5py=2.10.0=py37h7918eee_0
- hdf5=1.10.4=hb1b8bf9_0
- icu=58.2=he6710b0_3
- idna=2.8=py37_0
- intel-openmp=2020.0=166
- ipykernel=5.1.4=py37h39e3cac_0
- ipython=7.12.0=py37h5ca1d4c_0
- ipython_genutils=0.2.0=py37_0
- isodate=0.6.0=py_1
- itsdangerous=1.1.0=py37_0
- jedi=0.14.1=py37_0
- jinja2=2.11.1=py_0
- jmespath=0.9.4=py_0
- joblib=0.14.1=py_0
- jpeg=9b=h024ee3a_2
- jupyter_client=5.3.4=py37_0
- jupyter_core=4.6.1=py37_0
- kiwisolver=1.1.0=py37he6710b0_0
- krb5=1.16.4=h173b8e3_0
- ld_impl_linux-64=2.33.1=h53a641e_7
- libedit=3.1.20181209=hc058e9b_0
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=9.1.0=hdf63c60_0
- libgfortran-ng=7.3.0=hdf63c60_0
- libpng=1.6.37=hbc83047_0
- libpq=11.2=h20c2e04_0
- libprotobuf=3.11.4=hd408876_0
- libsodium=1.0.16=h1bed415_0
- libstdcxx-ng=9.1.0=hdf63c60_0
- libtiff=4.1.0=h2733197_0
- lightgbm=2.3.0=py37he6710b0_0
- lz4-c=1.8.1.2=h14c3975_0
- mako=1.1.2=py_0
- markdown=3.1.1=py37_0
- markupsafe=1.1.1=py37h7b6447c_0
- matplotlib-base=3.1.3=py37hef1b27d_0
- mkl=2020.0=166
- mkl-service=2.3.0=py37he904b0f_0
- mkl_fft=1.0.15=py37ha843d7b_0
- mkl_random=1.1.0=py37hd6b4f25_0
- ncurses=6.2=he6710b0_1
- networkx=2.4=py_0
- ninja=1.9.0=py37hfd86e86_0
- nltk=3.4.5=py37_0
- numpy=1.18.1=py37h4f9e942_0
- numpy-base=1.18.1=py37hde5b4d6_1
- oauthlib=3.1.0=py_0
- olefile=0.46=py37_0
- openssl=1.1.1g=h7b6447c_0
- packaging=20.1=py_0
- pandas=1.0.1=py37h0573a6f_0
- paramiko=2.7.1=py_0
- parso=0.5.2=py_0
- patsy=0.5.1=py37_0
- pexpect=4.8.0=py37_0
- pickleshare=0.7.5=py37_0
- pillow=7.0.0=py37hb39fc2d_0
- pip=20.0.2=py37_3
- plotly=4.8.1=py_0
- prompt_toolkit=3.0.3=py_0
- protobuf=3.11.4=py37he6710b0_0
- psutil=5.6.7=py37h7b6447c_0
- psycopg2=2.8.4=py37h1ba5d50_0
- ptyprocess=0.6.0=py37_0
- pyasn1=0.4.8=py_0
- pyasn1-modules=0.2.7=py_0
- pycparser=2.19=py37_0
- pygments=2.5.2=py_0
- pyjwt=1.7.1=py37_0
- pynacl=1.3.0=py37h7b6447c_0
- pyodbc=4.0.30=py37he6710b0_0
- pyopenssl=19.1.0=py37_0
- pyparsing=2.4.6=py_0
- pysocks=1.7.1=py37_0
- python=3.7.6=h0371630_2
- python-dateutil=2.8.1=py_0
- python-editor=1.0.4=py_0
- pytorch=1.5.1=py3.7_cpu_0
- pytz=2019.3=py_0
- pyzmq=18.1.1=py37he6710b0_0
- readline=7.0=h7b6447c_5
- requests=2.22.0=py37_1
- requests-oauthlib=1.3.0=py_0
- retrying=1.3.3=py37_2
- rsa=4.0=py_0
- s3transfer=0.3.3=py37_0
- scikit-learn=0.22.1=py37hd81dba3_0
- scipy=1.4.1=py37h0b6359f_0
- setuptools=45.2.0=py37_0
- simplejson=3.17.0=py37h7b6447c_0
- six=1.14.0=py37_0
- smmap=3.0.2=py_0
- sqlite=3.31.1=h62c20be_1
- sqlparse=0.3.0=py_0
- statsmodels=0.11.0=py37h7b6447c_0
- tabulate=0.8.3=py37_0
- tk=8.6.8=hbc83047_0
- torchvision=0.6.1=py37_cpu
- tornado=6.0.3=py37h7b6447c_3
- tqdm=4.42.1=py_0
- traitlets=4.3.3=py37_0
- unixodbc=2.3.7=h14c3975_0
- urllib3=1.25.8=py37_0
- wcwidth=0.1.8=py_0
- websocket-client=0.56.0=py37_0
- werkzeug=1.0.0=py_0
- wheel=0.34.2=py37_0
- wrapt=1.11.2=py37h7b6447c_0
- xz=5.2.4=h14c3975_4
- zeromq=4.3.1=he6710b0_3
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.7=h0b5b093_0
- pip:
- astunparse==1.6.3
- azure-core==1.6.0
- azure-storage-blob==12.3.2
- databricks-cli==0.11.0
- diskcache==4.1.0
- docker==4.2.2
- gorilla==0.3.0
- horovod==0.19.5
- hyperopt==0.2.4.db2
- keras-preprocessing==1.1.2
- koalas==1.0.1
- mleap==0.16.0
- mlflow==1.9.1
- msrest==0.6.17
- opt-einsum==3.2.1
- petastorm==0.9.2
- pyarrow==0.15.1
- pyyaml==5.3.1
- querystring-parser==1.2.4
- seaborn==0.10.0
- spark-tensorflow-distributor==0.1.0
- sparkdl==2.1.0-db1
- tensorboard==2.2.2
- tensorboard-plugin-wit==1.7.0
- tensorflow-cpu==2.2.0
- tensorflow-estimator==2.2.0
- termcolor==1.1.0
- xgboost==1.1.1
prefix: /databricks/conda/envs/databricks-ml
Python on GPU clusters
name: databricks-ml-gpu
channels:
- pytorch
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- absl-py=0.9.0=py37_0
- asn1crypto=1.3.0=py37_0
- astor=0.8.0=py37_0
- backcall=0.1.0=py37_0
- backports=1.0=py_2
- bcrypt=3.1.7=py37h7b6447c_1
- blas=1.0=mkl
- blinker=1.4=py37_0
- boto3=1.12.0=py_0
- botocore=1.15.0=py_0
- c-ares=1.15.0=h7b6447c_1001
- ca-certificates=2020.6.24=0
- cachetools=4.1.0=py_1
- certifi=2020.6.20=py37_0
- cffi=1.14.0=py37h2e261b9_0
- chardet=3.0.4=py37_1003
- click=7.0=py37_0
- cloudpickle=1.3.0=py_0
- configparser=3.7.4=py37_0
- cryptography=2.8=py37h1ba5d50_0
- cudatoolkit=10.1.243=h6bb024c_0
- cycler=0.10.0=py37_0
- cython=0.29.15=py37he6710b0_0
- decorator=4.4.1=py_0
- dill=0.3.1.1=py37_1
- docutils=0.15.2=py37_0
- entrypoints=0.3=py37_0
- flask=1.1.1=py_1
- freetype=2.9.1=h8a8886c_1
- future=0.18.2=py37_1
- gast=0.3.3=py_0
- gitdb=4.0.5=py_0
- gitdb2=4.0.2=py_0
- gitpython=3.0.5=py_0
- google-auth=1.11.2=py_0
- google-auth-oauthlib=0.4.1=py_2
- google-pasta=0.2.0=py_0
- grpcio=1.27.2=py37hf8bcb03_0
- gunicorn=20.0.4=py37_0
- h5py=2.10.0=py37h7918eee_0
- hdf5=1.10.4=hb1b8bf9_0
- icu=58.2=he6710b0_3
- idna=2.8=py37_0
- intel-openmp=2020.0=166
- ipykernel=5.1.4=py37h39e3cac_0
- ipython=7.12.0=py37h5ca1d4c_0
- ipython_genutils=0.2.0=py37_0
- isodate=0.6.0=py_1
- itsdangerous=1.1.0=py37_0
- jedi=0.14.1=py37_0
- jinja2=2.11.1=py_0
- jmespath=0.9.4=py_0
- joblib=0.14.1=py_0
- jpeg=9b=h024ee3a_2
- jupyter_client=5.3.4=py37_0
- jupyter_core=4.6.1=py37_0
- kiwisolver=1.1.0=py37he6710b0_0
- krb5=1.16.4=h173b8e3_0
- ld_impl_linux-64=2.33.1=h53a641e_7
- libedit=3.1.20181209=hc058e9b_0
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=9.1.0=hdf63c60_0
- libgfortran-ng=7.3.0=hdf63c60_0
- libpng=1.6.37=hbc83047_0
- libpq=11.2=h20c2e04_0
- libprotobuf=3.11.4=hd408876_0
- libsodium=1.0.16=h1bed415_0
- libstdcxx-ng=9.1.0=hdf63c60_0
- libtiff=4.1.0=h2733197_0
- lightgbm=2.3.0=py37he6710b0_0
- lz4-c=1.8.1.2=h14c3975_0
- mako=1.1.2=py_0
- markdown=3.1.1=py37_0
- markupsafe=1.1.1=py37h7b6447c_0
- matplotlib-base=3.1.3=py37hef1b27d_0
- mkl=2020.0=166
- mkl-service=2.3.0=py37he904b0f_0
- mkl_fft=1.0.15=py37ha843d7b_0
- mkl_random=1.1.0=py37hd6b4f25_0
- ncurses=6.2=he6710b0_1
- networkx=2.4=py_0
- ninja=1.9.0=py37hfd86e86_0
- nltk=3.4.5=py37_0
- numpy=1.18.1=py37h4f9e942_0
- numpy-base=1.18.1=py37hde5b4d6_1
- oauthlib=3.1.0=py_0
- olefile=0.46=py37_0
- openssl=1.1.1g=h7b6447c_0
- packaging=20.1=py_0
- pandas=1.0.1=py37h0573a6f_0
- paramiko=2.7.1=py_0
- parso=0.5.2=py_0
- patsy=0.5.1=py37_0
- pexpect=4.8.0=py37_0
- pickleshare=0.7.5=py37_0
- pillow=7.0.0=py37hb39fc2d_0
- pip=20.0.2=py37_3
- plotly=4.8.1=py_0
- prompt_toolkit=3.0.3=py_0
- protobuf=3.11.4=py37he6710b0_0
- psutil=5.6.7=py37h7b6447c_0
- psycopg2=2.8.4=py37h1ba5d50_0
- ptyprocess=0.6.0=py37_0
- pyasn1=0.4.8=py_0
- pyasn1-modules=0.2.7=py_0
- pycparser=2.19=py37_0
- pygments=2.5.2=py_0
- pyjwt=1.7.1=py37_0
- pynacl=1.3.0=py37h7b6447c_0
- pyodbc=4.0.30=py37he6710b0_0
- pyopenssl=19.1.0=py37_0
- pyparsing=2.4.6=py_0
- pysocks=1.7.1=py37_0
- python=3.7.6=h0371630_2
- python-dateutil=2.8.1=py_0
- python-editor=1.0.4=py_0
- pytorch=1.5.1=py3.7_cuda10.1.243_cudnn7.6.3_0
- pytz=2019.3=py_0
- pyzmq=18.1.1=py37he6710b0_0
- readline=7.0=h7b6447c_5
- requests=2.22.0=py37_1
- requests-oauthlib=1.3.0=py_0
- retrying=1.3.3=py37_2
- rsa=4.0=py_0
- s3transfer=0.3.3=py37_0
- scikit-learn=0.22.1=py37hd81dba3_0
- scipy=1.4.1=py37h0b6359f_0
- setuptools=45.2.0=py37_0
- simplejson=3.17.0=py37h7b6447c_0
- six=1.14.0=py37_0
- smmap=3.0.2=py_0
- sqlite=3.31.1=h62c20be_1
- sqlparse=0.3.0=py_0
- statsmodels=0.11.0=py37h7b6447c_0
- tabulate=0.8.3=py37_0
- tk=8.6.8=hbc83047_0
- torchvision=0.6.1=py37_cu101
- tornado=6.0.3=py37h7b6447c_3
- tqdm=4.42.1=py_0
- traitlets=4.3.3=py37_0
- unixodbc=2.3.7=h14c3975_0
- urllib3=1.25.8=py37_0
- wcwidth=0.1.8=py_0
- websocket-client=0.56.0=py37_0
- werkzeug=1.0.0=py_0
- wheel=0.34.2=py37_0
- wrapt=1.11.2=py37h7b6447c_0
- xz=5.2.4=h14c3975_4
- zeromq=4.3.1=he6710b0_3
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.7=h0b5b093_0
- pip:
- astunparse==1.6.3
- azure-core==1.6.0
- azure-storage-blob==12.3.2
- databricks-cli==0.11.0
- diskcache==4.1.0
- docker==4.2.2
- gorilla==0.3.0
- horovod==0.19.5
- hyperopt==0.2.4.db1
- keras-preprocessing==1.1.2
- koalas==1.0.1
- mleap==0.16.0
- mlflow==1.9.1
- msrest==0.6.17
- opt-einsum==3.2.1
- petastorm==0.9.2
- pyarrow==0.15.1
- pyyaml==5.3.1
- querystring-parser==1.2.4
- seaborn==0.10.0
- spark-tensorflow-distributor==0.1.0
- sparkdl==2.1.0-db1
- tensorboard==2.2.2
- tensorboard-plugin-wit==1.7.0
- tensorflow==2.2.0
- tensorflow-estimator==2.2.0
- termcolor==1.1.0
- xgboost==1.1.1
prefix: /databricks/conda/envs/databricks-ml-gpu
Spark packages containing Python modules
Spark Package | Python Module | Version |
---|---|---|
graphframes | graphframes | 0.8.0-db2-spark3.0 |
R libraries
The R libraries are identical to the R Libraries in Databricks Runtime 7.1.
Java and Scala libraries (Scala 2.12 cluster)
In addition to Java and Scala libraries in Databricks Runtime 7.1, Databricks Runtime 7.1 ML contains the following JARs:
Group ID | Artifact ID | Version |
---|---|---|
com.typesafe.akka | akka-actor_2.12 | 2.5.23 |
ml.combust.mleap | mleap-databricks-runtime_2.12 | 0.17.1-4882dc3 |
ml.dmlc | xgboost4j-spark_2.12 | 1.0.0 |
ml.dmlc | xgboost4j_2.12 | 1.0.0 |
org.mlflow | mlflow-client | 1.9.1 |
org.scala-lang.modules | scala-java8-compat_2.12 | 0.8.0 |
org.tensorflow | spark-tensorflow-connector_2.12 | 1.15.0 |