What tools does the Azure Data Science Virtual Machine include?
The Data Science Virtual Machine (DSVM) lets you explore data and build machine learning solutions in the cloud. Each DSVM image is preconfigured with a complete operating system, security patches, drivers, and popular data science and development software. You can choose the hardware that fits your workload, from lower-cost CPU-centric machines to powerful machines with multiple GPUs, NVMe storage, and large amounts of memory. On machines with GPUs, all drivers are installed, all machine learning frameworks are version-matched for GPU compatibility, and acceleration is enabled in all application software that supports GPUs.
The DSVM comes with widely used data science tools preinstalled. The following tables list them by task and by DSVM image.
Build deep learning and machine learning solutions
Tool | Windows Server 2019 DSVM | Windows Server 2022 DSVM | Ubuntu 20.04 DSVM | Usage notes |
---|---|---|---|---|
CUDA, cuDNN, NVIDIA Driver | ✅ | ✅ | ✅ | CUDA, cuDNN, NVIDIA Driver on the DSVM |
Horovod | ❌ | ❌ | ✅ | Horovod on the DSVM |
NVIDIA System Management Interface (nvidia-smi) | ✅ | ✅ | ✅ | nvidia-smi on the DSVM |
PyTorch | ✅ | ✅ | ✅ | PyTorch on the DSVM |
TensorFlow | ✅ | ✅ | ✅ | TensorFlow on the DSVM |
Integration with Azure Machine Learning (Python) | ✅ (Python SDK, samples) | ✅ (Python SDK, samples) | ✅ (Python SDK, CLI, samples) | Azure Machine Learning SDK |
XGBoost | ✅ (CUDA support) | ✅ (CUDA support) | ✅ (CUDA support) | XGBoost on the DSVM |
Vowpal Wabbit | ✅ | ✅ | ✅ | Vowpal Wabbit on the DSVM |
Weka | ❌ | ❌ | ❌ | |
LightGBM | ❌ | ❌ | ✅ (GPU, MPI support) | |
H2O | ❌ | ❌ | ✅ | |
CatBoost | ❌ | ❌ | ✅ | |
Intel MKL | ❌ | ❌ | ✅ | |
OpenCV | ❌ | ❌ | ✅ | |
Dlib | ❌ | ❌ | ✅ | |
Docker | ✅ (Windows containers only) | ✅ (Windows containers only) | ✅ | |
NCCL | ❌ | ❌ | ✅ | |
Rattle | ❌ | ❌ | ❌ | |
ONNX Runtime | ❌ | ❌ | ✅ | |
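
To confirm which of the frameworks in the table are importable from the Python environment you are currently using, a minimal sketch like the following may help. The module names (`torch`, `tensorflow`, and so on) are assumptions based on each package's usual import name, and the result depends on which conda environment is active, so treat it as a quick check rather than an authoritative inventory.

```python
import importlib.util

# Assumed import names: these are the usual module names for each package,
# not names taken from the table above.
FRAMEWORK_MODULES = {
    "PyTorch": "torch",
    "TensorFlow": "tensorflow",
    "XGBoost": "xgboost",
    "LightGBM": "lightgbm",
    "ONNX Runtime": "onnxruntime",
}


def find_available(modules):
    """Map each display name to True if its top-level module can be located."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in modules.items()
    }


if __name__ == "__main__":
    for name, found in find_available(FRAMEWORK_MODULES).items():
        print(f"{name}: {'found' if found else 'not found'}")
```

Because `find_spec` only locates a module without importing it, the check is fast and does not initialize any GPU runtime.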
Store, retrieve, and manipulate data
Tool | Windows Server 2019 DSVM | Windows Server 2022 DSVM | Ubuntu 20.04 DSVM | Usage notes |
---|---|---|---|---|
Relational databases | SQL Server 2019 Developer Edition | SQL Server 2019 Developer Edition | SQL Server 2019 Developer Edition | SQL Server on the DSVM |
Database tools | SQL Server Management Studio, SQL Server Integration Services, bcp, sqlcmd | SQL Server Management Studio, SQL Server Integration Services, bcp, sqlcmd | SQuirreL SQL (querying tool), bcp, sqlcmd, ODBC/JDBC drivers | |
Azure Storage Explorer | ✅ | ✅ | | |
Azure CLI | ✅ | ✅ | ✅ | |
AzCopy | ✅ | ✅ | ❌ | AzCopy on the DSVM |
Blob FUSE driver | ❌ | ❌ | ❌ | blobfuse on the DSVM |
Azure Cosmos DB Data Migration Tool | ✅ | ✅ | ❌ | Azure Cosmos DB on the DSVM |
Unix/Linux command-line tools | ❌ | ❌ | ✅ | |
Apache Spark 3.1 (standalone) | ✅ | ✅ | ✅ | |
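
Several of the data tools in the table are command-line programs, so one way to see which are reachable from your shell is to look them up on `PATH`. This is a minimal sketch: the executable names (`az`, `azcopy`, `sqlcmd`, `bcp`) are assumptions based on each tool's usual command name, and a tool may be installed without being on `PATH`.

```python
import shutil

# Assumed executable names for the command-line tools listed in the table.
CLI_TOOLS = ["az", "azcopy", "sqlcmd", "bcp"]


def locate_tools(names):
    """Return {name: full path, or None if the executable is not on PATH}."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools(CLI_TOOLS).items():
        print(f"{name}: {path or 'not found on PATH'}")
```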
Program in Python, R, Julia, and Node.js
Tool | Windows Server 2019 DSVM | Windows Server 2022 DSVM | Ubuntu 20.04 DSVM | Usage notes |
---|---|---|---|---|
CRAN-R with popular packages preinstalled | ✅ | ✅ | ✅ | |
Anaconda Python with popular packages preinstalled | ✅ | ✅ (Miniconda) | ✅ (Miniconda) | |
Julia (Julialang) | ✅ | ✅ | ✅ | |
JupyterHub (multiuser notebook server) | ❌ | ❌ | ✅ | |
JupyterLab (multiuser notebook server) | ✅ | ✅ | ✅ | |
Node.js | ✅ | ✅ | ✅ | |
Jupyter Notebook Server with the following kernels: | ✅ | ✅ | ✅ | Jupyter Notebook samples |
R | | | | R Jupyter Samples |
Python | | | | Python Jupyter Samples |
Julia | | | | Julia Jupyter Samples |
PySpark | | | | pySpark Jupyter Samples |
Ubuntu 20.04 DSVM, Windows Server 2019 DSVM, and Windows Server 2022 DSVM have the following Jupyter kernels:
- Python3.8-default
- Python3.8-Tensorflow-Pytorch
- Python3.8-AzureML
- R
- Python 3.7 - Spark (local)
- Julia 1.6.0
- R Spark – HDInsight
- Scala Spark – HDInsight
- Python 3 Spark – HDInsight
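
To compare this list with what is registered on a particular machine, you can read the output of `jupyter kernelspec list --json`. The sketch below assumes the `jupyter` command is on `PATH`; the sample JSON is illustrative only and was not captured from a DSVM.

```python
import json
import shutil
import subprocess


def kernel_display_names(kernelspec_json):
    """Map kernel name -> display name from `jupyter kernelspec list --json`."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return {
        name: info.get("spec", {}).get("display_name", name)
        for name, info in specs.items()
    }


def installed_kernels():
    """Query the Jupyter CLI if it is on PATH; return an empty dict otherwise."""
    if shutil.which("jupyter") is None:
        return {}
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return kernel_display_names(result.stdout)


# Illustrative input only, shaped like the CLI's JSON output.
SAMPLE = json.dumps({
    "kernelspecs": {
        "python3": {
            "resource_dir": "/tmp/kernels/python3",
            "spec": {"display_name": "Python 3", "language": "python"},
        }
    }
})
print(kernel_display_names(SAMPLE))
```

The internal kernel names returned here may differ from the display names shown in the list above.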
Ubuntu 20.04 DSVM, Windows Server 2019 DSVM, and Windows Server 2022 DSVM have the following conda environments:
- Python3.8-default
- Python3.8-Tensorflow-Pytorch
- Python3.8-AzureML
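
You can list the conda environments present on a machine with `conda env list --json`, which reports each environment as a directory path. The following sketch extracts the final path component as the environment name. The sample paths are hypothetical, and the directory names on a real DSVM may not match the labels listed above.

```python
import json


def env_names(conda_json):
    """Return the last path component of each environment path.

    Handles both POSIX and Windows separators. The base environment
    appears under the name of its install directory.
    """
    envs = json.loads(conda_json).get("envs", [])
    return [
        path.replace("\\", "/").rstrip("/").rsplit("/", 1)[-1]
        for path in envs
    ]


# Hypothetical paths, shaped like the output of `conda env list --json`.
SAMPLE_ENVS = json.dumps({
    "envs": ["/opt/conda", "/opt/conda/envs/demo", "C:\\Miniconda\\envs\\other"]
})
print(env_names(SAMPLE_ENVS))
```

After you identify an environment name, `conda activate <name>` switches to it in the current shell.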
Use your preferred editor or IDE
Tool | Windows Server 2019 DSVM | Windows Server 2022 DSVM | Ubuntu 20.04 DSVM | Usage notes |
---|---|---|---|---|
Notepad++ | ✅ | ✅ | ❌ | |
Nano | ✅ | ✅ | ❌ | |
Visual Studio 2019 Community Edition | ✅ | ✅ | ❌ | Visual Studio on the DSVM |
Visual Studio Code | ✅ | ✅ | ✅ | Visual Studio Code on the DSVM |
PyCharm Community Edition | ✅ | ✅ | ✅ | PyCharm on the DSVM |
IntelliJ IDEA | ❌ | ❌ | ✅ | |
Vim | ❌ | ❌ | ✅ | |
Emacs | ❌ | ❌ | ✅ | |
Git and Git Bash | ✅ | ✅ | ✅ | |
OpenJDK 11 | ✅ | ✅ | ✅ | |
.NET Framework | ✅ | ✅ | ❌ | |
Azure SDK | ✅ | ✅ | ✅ | |
Organize & present results
Tool | Windows Server 2019 DSVM | Windows Server 2022 DSVM | Ubuntu 20.04 DSVM | Usage notes |
---|---|---|---|---|
Microsoft 365 (Word, Excel, PowerPoint) | ✅ | ✅ | ❌ | |
Microsoft Teams | ✅ | ✅ | ❌ | |
Power BI Desktop | ✅ | ✅ | ❌ | |
Microsoft Edge Browser | ✅ | ✅ | ✅ | |