What tools are included on the Azure Data Science Virtual Machine?

The Data Science Virtual Machine (DSVM) is an easy way to explore data and do machine learning in the cloud. Each Data Science Virtual Machine is pre-configured with the complete operating system, security patches, drivers, and popular data science and development software. You can choose the hardware environment, ranging from lower-cost CPU-centric machines to very powerful machines with multiple GPUs, NVMe storage, and large amounts of memory. For machines with GPUs, all drivers are installed, all machine learning frameworks are version-matched for GPU compatibility, and acceleration is enabled in all application software that supports GPUs.

The Data Science Virtual Machine comes with the most useful data science tools pre-installed:

Build deep learning and machine learning solutions

Tool | Windows DSVM | Linux DSVM | Windows Server 2019 DSVM | Ubuntu 18.04 DSVM | Usage notes
Caffe and Caffe2
Caffe and Caffe2 on the DSVM
Chainer
(5.2.0)
Chainer on the DSVM
CUDA, cuDNN, NVIDIA Driver
(10.0.130)

(10.0.130)


CUDA, cuDNN, NVIDIA Driver on the DSVM
Horovod
(0.16.1)

Horovod on the DSVM
Keras
(2.2.4)

(2.2.4)


Keras on the DSVM
Microsoft Cognitive Toolkit (CNTK)
(2.5.1)
CNTK on the DSVM
MXNet
(1.3.0)
MXNet on the DSVM
MXNet Model Server
(1.0.1)
MXNet Model Server on the DSVM
NVIDIA System Management Interface (nvidia-smi)
nvidia-smi on the DSVM
PyTorch
(1.2.0)

(1.4.0)

(1.4.0)
PyTorch on the DSVM
TensorFlow
(1.13)

(1.13)


TensorFlow on the DSVM
TensorFlow Serving
(1.12.0)
TensorFlow Serving on the DSVM
Theano
(1.0.3)
Theano on the DSVM
Integration with Azure Machine Learning (R, Python)
(0.2.7)

(1.0.45)

(Python SDK, samples)

(Python/R SDK, CLI, samples)
Azure ML SDK
XGBoost
(0.81)

(0.80)

(CUDA support)

(CUDA support)
XGBoost on the DSVM
Vowpal Wabbit
(8.1)

Vowpal Wabbit on the DSVM
Weka
(3.8)

(3.8.0)
LightGBM
(GPU, MPI support)
H2O
CatBoost
Intel MKL
OpenCV
Dlib
Docker (as Moby)
NCCL
Rattle
ONNX Runtime

Store, retrieve, and manipulate data

Tool | Windows DSVM | Linux DSVM | Windows Server 2019 DSVM | Ubuntu 18.04 DSVM | Usage notes
Relational databases
SQL Server 2017 Developer Edition
SQL Server 2017 Developer Edition (Ubuntu)
SQL Server 2019 Developer Edition
SQL Server 2019 Developer Edition
SQL Server on the DSVM
Database tools
SQL Server Management Studio, SQL Server Integration Services, bcp, sqlcmd, ODBC/JDBC drivers
SQuirreL SQL (querying tool), bcp, sqlcmd, ODBC/JDBC drivers
SQL Server Management Studio (18.x), SQL Server Integration Services, bcp, sqlcmd
SQuirreL SQL (querying tool), bcp, sqlcmd, ODBC/JDBC drivers
Scalable in-database analytics with SQL Server Machine Learning Services (R, Python)
Azure Storage Explorer
(1.10.1)

(0.7.20160129.1)


Azure CLI
(2.0.56)

(2.0.58)


AzCopy
(8.1.0)

AzCopy on the DSVM
Blob FUSE driver
(1.0.2)

blobfuse on the DSVM
Azure Cosmos DB Data Migration Tool
Cosmos DB on the DSVM
Unix/Linux command-line tools
Apache Drill for data exploration
(1.14.0)

Apache Drill on the DSVM
Apache Spark (standalone)

Program in Python, R, Julia, and Node.js

Tool | Windows DSVM | Linux DSVM | Windows Server 2019 DSVM | Ubuntu 18.04 DSVM | Usage notes
CRAN-R with popular packages pre-installed
Microsoft R Open with popular packages pre-installed
(3.4.3)

(3.4.3)
R on the DSVM
Microsoft Machine Learning Server (R, Python) Developer Edition includes:
RevoScaleR/revoscalepy parallel and distributed high-performance framework (R and Python)
MicrosoftML, new state-of-the-art machine learning algorithms from Microsoft
R and Python operationalization
Anaconda Python with popular packages pre-installed
(4.2)

(Miniconda)

(Miniconda)
Julia (Julialang)

JuliaPro with popular packages for the Julia language pre-installed
(0.6.4)

(0.6.2)
Julia on the DSVM
JupyterHub (multiuser notebook server)
JupyterLab (multiuser notebook server)
Node.js
Jupyter Notebook Server with the following kernels:
(5.5.0)

Jupyter Notebook samples
     R
(3.4.3)

(3.4.3)
R Jupyter Samples
     Python
(3)
Python Jupyter Samples
     Julia
(0.6.4)

(0.6.2)
Julia Jupyter Samples
     PySpark
PySpark Jupyter Samples
     Sparkmagic
(Ubuntu only)
     SparkR

The Ubuntu 18.04 DSVM and the Windows Server 2019 DSVM have the following Jupyter kernels:

  • Python 3.7 - default
  • Python 3.7 - PyTorch
  • Python 3.7 - TensorFlow
  • Python 3.6 - AzureML - TensorFlow
  • Python 3.6 - AzureML - PyTorch
  • Python 3.6 - AzureML - AutoML
  • R
  • Python 3.7 - Spark (local)
  • Julia 1.2.0
  • R Spark - HDInsight
  • Scala Spark - HDInsight
  • Python 3 Spark - HDInsight

The Ubuntu 18.04 DSVM and the Windows Server 2019 DSVM have the following conda environments:

  • py37_default
  • py37_tensorflow
  • py37_pytorch
  • azureml_py36_tensorflow
  • azureml_py36_pytorch
  • azureml_py36_automl

The Ubuntu 16.04 DSVM has the following conda environments:

  • base
  • py37
  • azureml_py36

The Windows Server 2016 DSVM has the following conda environments:

  • base
  • AzureML
  • python2
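To check which of the environments listed above are actually present on a given machine, one option is to parse the JSON output of `conda env list --json`. The sketch below is illustrative only: the function names are hypothetical, and it assumes that named environments live under an `envs` directory while any other path is the root ("base") installation.

```python
import json
import subprocess
from typing import List


def env_names_from_conda_json(conda_json: str) -> List[str]:
    """Extract environment names from the output of `conda env list --json`."""
    names = []
    for raw_path in json.loads(conda_json).get("envs", []):
        # Normalize Windows separators so the same logic works on both DSVM flavors.
        parts = raw_path.replace("\\", "/").rstrip("/").split("/")
        if len(parts) >= 2 and parts[-2] == "envs":
            names.append(parts[-1])
        else:
            # Assumption: a path outside an "envs" directory is the root install.
            names.append("base")
    return names


def installed_env_names() -> List[str]:
    """Run conda and return the environment names (requires conda on PATH)."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return env_names_from_conda_json(result.stdout)


# Example with a hand-written sample instead of a live conda call.
sample = '{"envs": ["/anaconda", "/anaconda/envs/py37_default", "/anaconda/envs/azureml_py36_automl"]}'
print(env_names_from_conda_json(sample))
```

On a DSVM with conda available, calling `installed_env_names()` should return a list that can be compared against the names above.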

How to choose a conda environment

Note

There are breaking changes when switching from the old version of the Data Science Virtual Machine to the new version.

Switching from Windows Server 2016 to Windows Server 2019

  • The default conda environment is py37_default. Python 2 is not supported. If you were using TensorFlow or PyTorch in your Python conda environment, use py37_tensorflow or py37_pytorch respectively. If you were using only AzureML, use azureml_py36_automl.

  • If you were using TensorFlow or PyTorch in your AzureML conda environment, use azureml_py36_tensorflow or azureml_py36_pytorch respectively.

Switching from Ubuntu 16.04 to Ubuntu 18.04

  • If you were using TensorFlow or PyTorch in your azureml_py36 conda environment, use azureml_py36_tensorflow or azureml_py36_pytorch respectively.

  • If you were using TensorFlow or PyTorch in your py37 conda environment, use py37_tensorflow or py37_pytorch respectively.

  • Otherwise, use py37_default, or, if you were using only azureml_py36, use azureml_py36_automl.
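The migration rules above can be condensed into a small lookup. This is a minimal sketch rather than an official tool: `suggest_env` is a hypothetical helper, it treats any old environment whose name starts with "azureml" (case-insensitive) as an AzureML environment, and it uses `py37_default` as the default name because that is how the environment appears in the conda environment list for the new images.

```python
from typing import Optional

SUPPORTED_FRAMEWORKS = ("tensorflow", "pytorch")


def suggest_env(old_env: str, framework: Optional[str] = None) -> str:
    """Suggest a conda environment on the Windows Server 2019 / Ubuntu 18.04 DSVM.

    old_env   -- environment used on the old image, e.g. "base", "python2",
                 "py37", "AzureML", or "azureml_py36"
    framework -- "tensorflow", "pytorch", or None if neither was used
    """
    if framework is not None:
        framework = framework.lower()
        if framework not in SUPPORTED_FRAMEWORKS:
            raise ValueError(f"unsupported framework: {framework!r}")

    # Assumption: AzureML environments are recognizable by their name prefix.
    is_azureml = old_env.lower().startswith("azureml")
    prefix = "azureml_py36" if is_azureml else "py37"

    if framework is not None:
        # Framework users move to the matching framework-specific environment.
        return f"{prefix}_{framework}"
    # AzureML-only users move to the AutoML environment; everyone else to the default.
    return "azureml_py36_automl" if is_azureml else "py37_default"


print(suggest_env("AzureML", "tensorflow"))
print(suggest_env("py37", "pytorch"))
print(suggest_env("python2"))
```

The suggested name can then be passed to `conda activate` on the new image.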

Use your preferred editor or IDE

Tool | Windows DSVM | Linux DSVM | Windows Server 2019 DSVM | Ubuntu 18.04 DSVM | Usage notes
Notepad++
(1.31.1)

(1.31)


Nano
(1.31.1)

(1.31)


Visual Studio (Community Edition) with Git plug-in, Azure HDInsight (Hadoop), Azure Data Lake, SQL Server Data Tools, Node.js, Python, and R Tools for Visual Studio (RTVS)

(Visual Studio 2017)

(Visual Studio 2019)
Visual Studio on the DSVM
Visual Studio Code
(1.31.1)

(1.31)


Visual Studio Code on the DSVM
RStudio Desktop
(1.2.50xx)

(1.1.456)


RStudio Desktop on the DSVM
RStudio Server
RStudio Server on the DSVM
PyCharm Community Edition
(19.2.3)

(2018.2.3)


PyCharm on the DSVM
IntelliJ IDEA
Atom
(1.26.1)
Juno (Julia IDE)
Juno on the DSVM
Vim
(8.1.5)

(7.4.1689)

Emacs
(24.5.1)

Git and Git Bash
(2.20.1)

(0.6.2)


OpenJDK
(1.8.0_201)

(1.8.0_222)


.NET Framework
(4.7.2)

Azure SDK

Organize and present results

Tool | Windows DSVM | Linux DSVM | Windows Server 2019 DSVM | Ubuntu 18.04 DSVM | Usage notes
Microsoft Office ProPlus with shared activation: Excel, Word, and PowerPoint
Power BI Desktop
(2.73.55xx)

Microsoft Edge Browser