快速入门:使用 ARM 模板创建 Ubuntu Data Science Virtual Machine

本快速入门将介绍如何使用 Azure 资源管理器模板(ARM 模板)创建 Ubuntu Data Science Virtual Machine。 Data Science Virtual Machine 是基于云的虚拟机,预加载了一套数据科学和机器学习框架及工具。 当部署在 GPU 驱动的计算资源上时,所有工具和库都配置为使用 GPU。

资源管理器模板是定义项目基础结构和配置的 JavaScript 对象表示法 (JSON) 文件。 模板使用声明性语法。 在声明性语法中,你可以在不编写创建部署的编程命令序列的情况下,描述预期部署。

如果你的环境满足先决条件,并且你熟悉如何使用 ARM 模板,请选择“部署到 Azure”按钮。 Azure 门户中会打开模板。

部署到 Azure

先决条件

  • Azure 订阅。 如果没有 Azure 订阅,请在开始前创建一个试用版订阅

  • 若要从本地环境使用本文档中的 CLI 命令,需要使用 Azure CLI

查看模板

本快速入门中使用的模板来自 Azure 快速启动模板

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "metadata": {
    "_generator": {
      "name": "bicep",
      "version": "0.8.9.13224",
      "templateHash": "4895680407304578048"
    }
  },
  "parameters": {
    "adminUsername": {
      "type": "string",
      "metadata": {
        "description": "Username for Administrator Account"
      }
    },
    "vmName": {
      "type": "string",
      "defaultValue": "vmName",
      "metadata": {
        "description": "The name of you Virtual Machine."
      }
    },
    "location": {
      "type": "string",
      "defaultValue": "[resourceGroup().location]",
      "metadata": {
        "description": "Location for all resources."
      }
    },
    "cpu_gpu": {
      "type": "string",
      "defaultValue": "CPU-4GB",
      "allowedValues": [
        "CPU-4GB",
        "CPU-7GB",
        "CPU-8GB",
        "CPU-14GB",
        "CPU-16GB",
        "GPU-56GB"
      ],
      "metadata": {
        "description": "Choose between CPU or GPU processing"
      }
    },
    "virtualNetworkName": {
      "type": "string",
      "defaultValue": "vNet",
      "metadata": {
        "description": "Name of the VNET"
      }
    },
    "subnetName": {
      "type": "string",
      "defaultValue": "subnet",
      "metadata": {
        "description": "Name of the subnet in the virtual network"
      }
    },
    "networkSecurityGroupName": {
      "type": "string",
      "defaultValue": "SecGroupNet",
      "metadata": {
        "description": "Name of the Network Security Group"
      }
    },
    "authenticationType": {
      "type": "string",
      "defaultValue": "sshPublicKey",
      "allowedValues": [
        "sshPublicKey",
        "password"
      ],
      "metadata": {
        "description": "Type of authentication to use on the Virtual Machine. SSH key is recommended."
      }
    },
    "adminPasswordOrKey": {
      "type": "secureString",
      "metadata": {
        "description": "SSH Key or password for the Virtual Machine. SSH key is recommended."
      }
    }
  },
  "variables": {
    "networkInterfaceName": "[format('{0}NetInt', parameters('vmName'))]",
    "virtualMachineName": "[parameters('vmName')]",
    "publicIpAddressName": "[format('{0}PublicIP', parameters('vmName'))]",
    "subnetRef": "[resourceId('Microsoft.Network/virtualNetworks/subnets', parameters('virtualNetworkName'), parameters('subnetName'))]",
    "nsgId": "[resourceId('Microsoft.Network/networkSecurityGroups', parameters('networkSecurityGroupName'))]",
    "osDiskType": "StandardSSD_LRS",
    "storageAccountName": "[format('storage{0}', uniqueString(resourceGroup().id))]",
    "storageAccountType": "Standard_LRS",
    "storageAccountKind": "Storage",
    "vmSize": {
      "CPU-4GB": "Standard_B2s",
      "CPU-7GB": "Standard_D2s_v3",
      "CPU-8GB": "Standard_D2s_v3",
      "CPU-14GB": "Standard_D4s_v3",
      "CPU-16GB": "Standard_D4s_v3",
      "GPU-56GB": "Standard_NC6_Promo"
    },
    "linuxConfiguration": {
      "disablePasswordAuthentication": true,
      "ssh": {
        "publicKeys": [
          {
            "path": "[format('/home/{0}/.ssh/authorized_keys', parameters('adminUsername'))]",
            "keyData": "[parameters('adminPasswordOrKey')]"
          }
        ]
      }
    }
  },
  "resources": [
    {
      "type": "Microsoft.Network/networkInterfaces",
      "apiVersion": "2021-05-01",
      "name": "[variables('networkInterfaceName')]",
      "location": "[parameters('location')]",
      "properties": {
        "ipConfigurations": [
          {
            "name": "ipconfig1",
            "properties": {
              "subnet": {
                "id": "[variables('subnetRef')]"
              },
              "privateIPAllocationMethod": "Dynamic",
              "publicIPAddress": {
                "id": "[resourceId('Microsoft.Network/publicIPAddresses', variables('publicIpAddressName'))]"
              }
            }
          }
        ],
        "networkSecurityGroup": {
          "id": "[variables('nsgId')]"
        }
      },
      "dependsOn": [
        "[resourceId('Microsoft.Network/networkSecurityGroups', parameters('networkSecurityGroupName'))]",
        "[resourceId('Microsoft.Network/publicIPAddresses', variables('publicIpAddressName'))]",
        "[resourceId('Microsoft.Network/virtualNetworks', parameters('virtualNetworkName'))]"
      ]
    },
    {
      "type": "Microsoft.Network/networkSecurityGroups",
      "apiVersion": "2021-05-01",
      "name": "[parameters('networkSecurityGroupName')]",
      "location": "[parameters('location')]",
      "properties": {
        "securityRules": [
          {
            "name": "JupyterHub",
            "properties": {
              "priority": 1010,
              "protocol": "Tcp",
              "access": "Allow",
              "direction": "Inbound",
              "sourceAddressPrefix": "*",
              "sourcePortRange": "*",
              "destinationAddressPrefix": "*",
              "destinationPortRange": "8000"
            }
          },
          {
            "name": "RStudioServer",
            "properties": {
              "priority": 1020,
              "protocol": "Tcp",
              "access": "Allow",
              "direction": "Inbound",
              "sourceAddressPrefix": "*",
              "sourcePortRange": "*",
              "destinationAddressPrefix": "*",
              "destinationPortRange": "8787"
            }
          },
          {
            "name": "SSH",
            "properties": {
              "priority": 1030,
              "protocol": "Tcp",
              "access": "Allow",
              "direction": "Inbound",
              "sourceAddressPrefix": "*",
              "sourcePortRange": "*",
              "destinationAddressPrefix": "*",
              "destinationPortRange": "22"
            }
          }
        ]
      }
    },
    {
      "type": "Microsoft.Network/virtualNetworks",
      "apiVersion": "2021-05-01",
      "name": "[parameters('virtualNetworkName')]",
      "location": "[parameters('location')]",
      "properties": {
        "addressSpace": {
          "addressPrefixes": [
            "10.0.0.0/24"
          ]
        },
        "subnets": [
          {
            "name": "[parameters('subnetName')]",
            "properties": {
              "addressPrefix": "10.0.0.0/24",
              "privateEndpointNetworkPolicies": "Enabled",
              "privateLinkServiceNetworkPolicies": "Enabled"
            }
          }
        ]
      }
    },
    {
      "type": "Microsoft.Network/publicIPAddresses",
      "apiVersion": "2021-05-01",
      "name": "[variables('publicIpAddressName')]",
      "location": "[parameters('location')]",
      "sku": {
        "name": "Basic",
        "tier": "Regional"
      },
      "properties": {
        "publicIPAllocationMethod": "Dynamic"
      }
    },
    {
      "type": "Microsoft.Storage/storageAccounts",
      "apiVersion": "2021-08-01",
      "name": "[variables('storageAccountName')]",
      "location": "[parameters('location')]",
      "sku": {
        "name": "[variables('storageAccountType')]"
      },
      "kind": "[variables('storageAccountKind')]"
    },
    {
      "type": "Microsoft.Compute/virtualMachines",
      "apiVersion": "2021-11-01",
      "name": "[format('{0}-{1}', variables('virtualMachineName'), parameters('cpu_gpu'))]",
      "location": "[parameters('location')]",
      "properties": {
        "hardwareProfile": {
          "vmSize": "[variables('vmSize')[parameters('cpu_gpu')]]"
        },
        "storageProfile": {
          "osDisk": {
            "createOption": "FromImage",
            "managedDisk": {
              "storageAccountType": "[variables('osDiskType')]"
            }
          },
          "imageReference": {
            "publisher": "microsoft-dsvm",
            "offer": "ubuntu-1804",
            "sku": "1804-gen2",
            "version": "latest"
          }
        },
        "networkProfile": {
          "networkInterfaces": [
            {
              "id": "[resourceId('Microsoft.Network/networkInterfaces', variables('networkInterfaceName'))]"
            }
          ]
        },
        "osProfile": {
          "computerName": "[variables('virtualMachineName')]",
          "adminUsername": "[parameters('adminUsername')]",
          "adminPassword": "[parameters('adminPasswordOrKey')]",
          "linuxConfiguration": "[if(equals(parameters('authenticationType'), 'password'), json('null'), variables('linuxConfiguration'))]"
        }
      },
      "dependsOn": [
        "[resourceId('Microsoft.Network/networkInterfaces', variables('networkInterfaceName'))]"
      ]
    }
  ],
  "outputs": {
    "adminUsername": {
      "type": "string",
      "value": "[parameters('adminUsername')]"
    }
  }
}

该模板中定义了以下资源:

部署模板

若要通过 Azure CLI 使用该模板,请登录并选择你的订阅(请参阅使用 Azure CLI 登录)。 然后运行:

read -p "Enter the name of the resource group to create:" resourceGroupName &&
read -p "Enter the Azure location (e.g., chinaeast2):" location &&
read -p "Enter the authentication type (must be 'password' or 'sshPublicKey') :" authenticationType &&
read -p "Enter the login name for the administrator account (may not be 'admin'):" adminUsername &&
read -p "Enter administrator account secure string (value of password or ssh public key):" adminPasswordOrKey &&
templateUri="https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/application-workloads/datascience/vm-ubuntu-DSVM-GPU-or-CPU/azuredeploy.json" &&
az group create --name $resourceGroupName --location "$location" &&
az deployment group create --resource-group $resourceGroupName --template-uri $templateUri --parameters adminUsername=$adminUsername authenticationType=$authenticationType adminPasswordOrKey=$adminPasswordOrKey &&
echo "Press [ENTER] to continue ..." &&
read

运行上述命令时,请输入:

  1. 要创建的包含 DSVM 和关联资源的资源组的名称。
  2. 要在其中进行部署的 Azure 位置。
  3. 要使用的身份验证类型(输入字符串 passwordsshPublicKey)。
  4. 管理员帐户的登录名(此值不能为 admin)。
  5. 帐户的密码或 SSH 公钥的值。

查看已部署的资源

若要查看 Data Science Virtual Machine:

  1. 转到 Azure 门户
  2. 登录。
  3. 选择你刚才创建的资源组。

你将看到资源组的信息:

包含 DSVM 的基本资源组的屏幕截图

单击虚拟机资源转到其信息页面。 可在此处找到 VM 的相关信息,包括连接详细信息。

清理资源

如果不想使用此虚拟机,请删除它。 由于 DSVM 与其他资源(例如存储帐户)关联,因此可能需要删除所创建的整个资源组。 可以使用门户删除资源组,方法是单击“删除”按钮并进行确认。 也可以从 CLI 中使用以下命令删除资源组:

echo "Enter the Resource Group name:" &&
read resourceGroupName &&
az group delete --name $resourceGroupName &&
echo "Press [ENTER] to continue ..."

后续步骤

在本快速入门中,你已通过 ARM 模板创建了 Data Science Virtual Machine。