Azure Data Science Virtual Machines (DSVMs) have a rich set of tools and libraries for machine learning. These resources are available in popular languages, such as Python, R, and Julia.
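The DSVM supports these machine-learning tools and libraries: the Azure Machine Learning SDK for Python, H2O, LightGBM, Rattle, Vowpal Wabbit, Weka, and XGBoost. The tables that follow describe each one in that order.
For a full reference on the SDK, visit Azure Machine Learning SDK for Python.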
Category | Value |
---|---|
What is it? | You can use the Azure Machine Learning cloud service to develop and deploy machine-learning models. You can use the Python SDK to track your models as you build, train, scale, and manage them. Deploy models as containers, and run them in the cloud, on-premises, or on Azure IoT Edge. |
Supported editions | Windows (conda environment: AzureML), Linux (conda environment: py36) |
Typical uses | General machine-learning platform |
How is it configured or installed? | Installed with GPU support |
How to use or run it | As a Python SDK and in the Azure CLI. Activate the AzureML conda environment on the Windows edition, or the py36 conda environment on the Linux edition. |
Link to samples | Find sample Jupyter notebooks in the AzureML directory, under notebooks. |
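
As a rough illustration of the environment choice described above, the following sketch picks the conda environment name for the current edition and checks whether the SDK is visible to the running interpreter. The helper names `dsvm_conda_env` and `sdk_is_importable` are hypothetical, and the check assumes the SDK is exposed as the top-level `azureml` package.

```python
import importlib.util
import platform

# Environment names taken from the table above; other platforms are not covered.
CONDA_ENVS = {"Windows": "AzureML", "Linux": "py36"}


def dsvm_conda_env(system=None):
    """Return the conda environment listed for this DSVM edition (hypothetical helper)."""
    system = system or platform.system()
    if system not in CONDA_ENVS:
        raise ValueError(f"No DSVM edition listed for platform {system!r}")
    return CONDA_ENVS[system]


def sdk_is_importable():
    """True if a top-level 'azureml' package is visible to the current interpreter."""
    return importlib.util.find_spec("azureml") is not None


if __name__ == "__main__":
    system = platform.system()
    if system in CONDA_ENVS:
        print(f"Expected conda environment: {dsvm_conda_env(system)}")
    print(f"azureml importable here: {sdk_is_importable()}")
```

If the second line prints `False`, the interpreter is probably running outside the environment named on the first line.
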
Category | Value |
---|---|
What is it? | H2O is an open-source AI platform that supports distributed, fast, in-memory, scalable machine learning. |
Supported editions | Linux |
Typical uses | General-purpose distributed, scalable machine learning |
How is it configured or installed? | H2O is installed in `/dsvm/tools/h2o`. |
How to use or run it | Connect to the VM with X2Go. Start a new terminal, and run `java -jar /dsvm/tools/h2o/current/h2o.jar`. Then, start a web browser and connect to `http://localhost:54321`. |
Link to samples | Find samples on the VM in Jupyter, under the h2o directory. |
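
Before opening the browser, it can help to confirm that something is answering on the H2O port. The sketch below uses only the Python standard library. The function name `h2o_is_reachable` is hypothetical, and the check only shows that an HTTP server responds at the address, not that the server is H2O.

```python
import http.client
import urllib.error
import urllib.request

H2O_URL = "http://localhost:54321"  # address given in the table above

# Skip any configured HTTP proxy, because the check targets a local address.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def h2o_is_reachable(url=H2O_URL, timeout=2.0):
    """Return True if an HTTP server answers at `url`, False otherwise."""
    try:
        with _OPENER.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # A server answered, just not with a success status.
        return True
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, malformed reply, and similar failures.
        return False


if __name__ == "__main__":
    print("H2O web UI reachable:", h2o_is_reachable())
```
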
DSVMs include several other machine-learning libraries, for example the popular scikit-learn package, which is part of the Anaconda Python distribution for DSVMs. For a list of the packages available in Python, R, and Julia, run the respective package managers.
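
For the Python side only, the same information can be read from inside the interpreter. This is a minimal sketch that uses the standard-library `importlib.metadata` module. The helper names are hypothetical, and the result reflects only the environment the interpreter is running in.

```python
from importlib import metadata


def installed_packages():
    """Map each installed distribution name (lowercased) to its version string."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with incomplete metadata
            packages[name.lower()] = dist.metadata.get("Version")
    return packages


def package_version(name):
    """Return the installed version of `name`, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("scikit-learn:", package_version("scikit-learn"))
    print(len(installed_packages()), "distributions visible to this interpreter")
```
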
Category | Value |
---|---|
What is it? | LightGBM is a fast, distributed, high-performance gradient-boosting (GBDT, GBRT, GBM, or MART) framework based on decision tree algorithms. It's used for ranking, classification, and many other machine-learning tasks. |
Supported editions | Windows, Linux |
Typical uses | General-purpose gradient-boosting framework |
How is it configured or installed? | On Windows, LightGBM is installed as a Python package. On Linux, the command-line executable is located at `/opt/LightGBM/lightgbm`, and the R and Python packages are also installed. |
Link to samples | LightGBM guide |
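
The Linux command-line executable reads its settings from a `key = value` configuration file passed as `config=<file>`. The sketch below writes such a file and builds the matching command. The helper names and the file names in the usage lines are hypothetical, and the parameter values are placeholders rather than recommendations.

```python
from pathlib import Path

LIGHTGBM_BIN = "/opt/LightGBM/lightgbm"  # Linux path from the table above


def write_train_config(path, data_file, model_file, num_trees=100):
    """Write a config file for a binary-classification training run and return its text."""
    params = {
        "task": "train",
        "objective": "binary",
        "data": data_file,
        "num_trees": num_trees,
        "learning_rate": 0.1,
        "output_model": model_file,
    }
    text = "\n".join(f"{key} = {value}" for key, value in params.items()) + "\n"
    Path(path).write_text(text)
    return text


def train_command(config_path, binary=LIGHTGBM_BIN):
    """Build the argument list for `lightgbm config=<file>`."""
    return [binary, f"config={config_path}"]


if __name__ == "__main__":
    # File names here are illustrative only.
    print(write_train_config("train.conf", "train.txt", "model.txt"))
    print(train_command("train.conf"))
```

On the VM, the returned list can be passed to `subprocess.run` to start training.
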
Category | Value |
---|---|
What is it? | Rattle is a graphical user interface for data mining that uses R. |
Supported editions | Windows, Linux |
Typical uses | General UI data-mining tool for R |
How to use or run it | As a UI tool. On Windows, start a command prompt, run R, and then inside R, run `rattle()`. On Linux, connect with X2Go, start a terminal, run R, and then inside R, run `rattle()`. |
Link to samples | Rattle |
Category | Value |
---|---|
What is it? | Vowpal Wabbit is a fast, open-source, out-of-core learning system library. |
Supported editions | Windows, Linux |
Typical uses | General machine-learning library |
How is it configured or installed? | On Windows, with an MSI installer. On Linux, with apt-get. |
How to use or run it | As an on-path command-line tool (`C:\Program Files\VowpalWabbit\vw.exe` on Windows, `/usr/bin/vw` on Linux) |
Link to samples | Vowpal Wabbit samples |
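
Because `vw` is a command-line tool, training data usually has to be written in its plain-text input format first, where each line has the form `label | name:value name:value`. The sketch below formats examples that way and builds a training command. The helper names and the file names are hypothetical.

```python
def to_vw_line(label, features):
    """Format one example as `label | name:value name:value ...`."""
    parts = []
    for name, value in features.items():
        # Whitespace, ':' and '|' are separators in the input format.
        if any(ch.isspace() or ch in ":|" for ch in name):
            raise ValueError(f"Invalid feature name: {name!r}")
        parts.append(f"{name}:{value}")
    return f"{label} | " + " ".join(parts)


def vw_train_command(data_file, model_file, vw="vw"):
    """Argument list for training: -d names the data file, -f the model file to write."""
    return [vw, "-d", data_file, "-f", model_file]


if __name__ == "__main__":
    print(to_vw_line(1, {"price": 0.23, "sqft": 0.25}))
    print(vw_train_command("train.vw", "model.vw"))  # illustrative file names
```
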
Category | Value |
---|---|
What is it? | A collection of machine-learning algorithms for data-mining tasks. You can either apply the algorithms directly, or call them from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. |
Supported editions | Windows, Linux |
Typical uses | General machine-learning tool |
How to use or run it | On Windows, search for Weka on the Start menu. On Linux, sign in with X2Go, and then go to Applications > Development > Weka. |
Link to samples | Weka samples |
Category | Value |
---|---|
What is it? | XGBoost is a fast, portable, and distributed gradient-boosting (GBDT, GBRT, or GBM) library for Python, R, Java, Scala, C++, and more. It runs on a single machine, and on Apache Hadoop and Spark. |
Supported editions | Windows, Linux |
Typical uses | General machine-learning library |
How is it configured or installed? | Installed with GPU support |
How to use or run it | As a Python library (2.7 and 3.6+), R package, and on-path command-line tool (`C:\dsvm\tools\xgboost\bin\xgboost.exe` on Windows and `/dsvm/tools/xgboost/xgboost` on Linux) |
Link to samples | Samples are included on the VM, in `/dsvm/tools/xgboost/demo` on Linux, and `C:\dsvm\tools\xgboost\demo` on Windows. |
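
To find the bundled samples from a script or notebook, the locations in the table can be looked up by platform. This is a small sketch with hypothetical helper names. It makes no assumption about what the demo directory contains and returns an empty list when the directory is absent.

```python
import platform
from pathlib import Path

# Locations copied from the table above.
XGBOOST_PATHS = {
    "Windows": {
        "cli": r"C:\dsvm\tools\xgboost\bin\xgboost.exe",
        "demo": r"C:\dsvm\tools\xgboost\demo",
    },
    "Linux": {
        "cli": "/dsvm/tools/xgboost/xgboost",
        "demo": "/dsvm/tools/xgboost/demo",
    },
}


def xgboost_locations(system=None):
    """Return the command-line tool and demo directory listed for this edition."""
    system = system or platform.system()
    if system not in XGBOOST_PATHS:
        raise ValueError(f"No DSVM edition listed for platform {system!r}")
    return XGBOOST_PATHS[system]


def list_demos(demo_dir):
    """Return the sorted entry names in `demo_dir`, or [] if it does not exist."""
    path = Path(demo_dir)
    if not path.is_dir():
        return []
    return sorted(entry.name for entry in path.iterdir())


if __name__ == "__main__":
    system = platform.system()
    if system in XGBOOST_PATHS:
        print(list_demos(xgboost_locations(system)["demo"]))
```
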