What is the Azure Data Science Virtual Machine for Linux and Windows?

The Data Science Virtual Machine (DSVM) is a customized VM image on the Azure cloud platform, built specifically for data science. It comes with many popular data science tools preinstalled and preconfigured, so you can quickly start building intelligent applications for advanced analytics.

The DSVM is available on:

  • Windows Server 2019
  • Ubuntu 18.04 LTS

Comparison with Azure Machine Learning

The DSVM is a customized VM image for data science, whereas Azure Machine Learning (AzureML) is an end-to-end platform that encompasses:

  • Fully managed compute
    • Compute instances
    • Compute clusters for distributed ML tasks
    • Inference clusters for real-time scoring
  • Datastores (for example, Blob, ADLS Gen2, SQL DB)
  • Experiment tracking
  • Model management
  • Notebooks
  • Environments (manage conda and R dependencies)
  • Labeling
  • Pipelines (automate end-to-end data science workflows)

Comparison with AzureML Compute Instances

Azure Machine Learning Compute Instances are fully configured, managed VM images, whereas the DSVM is an unmanaged VM.

The key differences between these two product offerings are detailed below:

Feature | Data Science VM | AzureML Compute Instance
--- | --- | ---
Fully Managed | No | Yes
Language Support | Python, R, Julia, SQL, C#, Java, Node.js, F# | Python and R
Operating System | Ubuntu, Windows | Ubuntu
Pre-Configured GPU Option | Yes | Yes
Scale up option | Yes | Yes
SSH Access | Yes | Yes
RDP Access | Yes | No
Built-in Hosted Notebooks | No (requires additional configuration) | Yes
Built-in SSO | No (requires additional configuration) | Yes
Built-in Collaboration | No | Yes
Pre-installed Tools | Jupyter(lab), RStudio Server, VSCode, Visual Studio, PyCharm, Juno, Power BI Desktop, SSMS, Microsoft Office 365, Apache Drill | Jupyter(lab), RStudio Server

Sample Use Cases

The following are some common use cases for DSVM customers.

Short-term experimentation and evaluation

You can use the DSVM to evaluate or learn new data science tools, especially by going through some of our published samples and walkthroughs.

Deep learning with GPUs

In the DSVM, you can train models with deep learning algorithms on hardware that's based on graphics processing units (GPUs). By taking advantage of the VM scaling capabilities of the Azure platform, the DSVM helps you use GPU-based hardware in the cloud according to your needs. You can switch to a GPU-based VM when you're training large models or need high-speed computations, while keeping the same OS disk. You can choose any N-series GPU-enabled virtual machine SKU with the DSVM. Note that GPU-enabled virtual machine SKUs are not supported on Azure free accounts.
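As a rough illustration of switching an existing DSVM to a GPU-based size, the sketch below assembles an Azure CLI `az vm resize` command in Python. The resource group, VM name, and size shown are hypothetical placeholders that do not come from this article; the target size must be offered in your region and fit within your quota, and the VM may need to be deallocated before it can be resized.

```python
from typing import List


def build_resize_command(resource_group: str, vm_name: str, gpu_size: str) -> List[str]:
    """Return the argument list for an `az vm resize` call.

    Nothing is executed here; the function only assembles the command so it
    can be reviewed before being run with the Azure CLI.
    """
    return [
        "az", "vm", "resize",
        "--resource-group", resource_group,
        "--name", vm_name,
        "--size", gpu_size,
    ]


if __name__ == "__main__":
    # Hypothetical names and size, used purely for illustration.
    cmd = build_resize_command("my-resource-group", "my-dsvm", "Standard_NC6s_v3")
    print(" ".join(cmd))
```

To run the command, pass the list to `subprocess.run(cmd, check=True)` on a machine where the Azure CLI is installed and signed in.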

The Windows editions of the DSVM come preinstalled with GPU drivers, frameworks, and GPU versions of deep learning frameworks. On the Linux editions, deep learning on GPUs is enabled on the Ubuntu DSVMs.

You can also deploy the Ubuntu or Windows editions of the DSVM to an Azure virtual machine that isn't based on GPUs. In this case, all the deep learning frameworks fall back to CPU mode.
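To see which mode a given VM is running in, you can check from Python whether a GPU is visible. The following is a minimal sketch that assumes PyTorch as the framework (this article does not name a specific one); `pick_device` is a hypothetical helper, and it reports "cpu" if PyTorch is not installed at all.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is the framework in use
    except ImportError:
        # Without PyTorch there is nothing to query, so report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Training would run on: {pick_device()}")
```

On a DSVM deployed to a non-GPU size, this check is expected to report "cpu", matching the fallback behavior described above.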

Learn more about available deep learning and AI frameworks.

Data science training and education

Enterprise trainers and educators who teach data science classes usually provide a virtual machine image. The image ensures that students have a consistent setup and that the samples work predictably.

The DSVM creates an on-demand environment with a consistent setup, which eases support and incompatibility challenges. The benefit is greatest when these environments need to be built frequently, especially for shorter training classes.

What's included on the DSVM?

See a full list of tools on both the Windows and Linux DSVMs here.

Next steps

Learn more with these articles: