Create HDInsight clusters using the Azure CLI

The steps in this document walk through creating an HDInsight 3.6 cluster using the Azure CLI.

Warning

Billing for HDInsight clusters is prorated per minute, whether you use them or not. Be sure to delete your cluster after you finish using it. See how to delete an HDInsight cluster.

If you don't have an Azure subscription, create a trial account before you begin.

Prerequisites

Azure CLI. If you haven't installed the Azure CLI, see Install the Azure CLI for steps.
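
Before starting, it can help to confirm that the CLI is actually on your PATH. The following is a minimal sketch, not part of the official steps; the function names require_cmd and check_azure_cli are hypothetical helpers introduced here for illustration.

```shell
# Sketch only: require_cmd and check_azure_cli are hypothetical helper names.

# Succeeds only if the named command can be found by the shell.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print the first line of the CLI version banner, or explain what is missing.
check_azure_cli() {
    if require_cmd az; then
        az --version | head -n 1
    else
        echo "Azure CLI (az) not found; install it before continuing." >&2
        return 1
    fi
}
```

Run check_azure_cli once in the shell where you plan to enter the commands below; a non-zero return status means the CLI still needs to be installed.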

Create a cluster

  1. Log in to your Azure subscription.

    az login
    
    # If you have multiple subscriptions, set the one to use
    # az account set --subscription "SUBSCRIPTIONID"
    
  2. Set environment variables. The use of variables in this article is based on Bash; other environments need slight variations. See az-hdinsight-create for a complete list of possible parameters for cluster creation.

    Parameter            Description
    --size               The number of worker nodes in the cluster. This article uses the variable clusterSizeInNodes as the value passed to --size.
    --version            The HDInsight cluster version. This article uses the variable clusterVersion as the value passed to --version. See also: Supported HDInsight versions.
    --type               Type of HDInsight cluster, such as hadoop, interactivehive, hbase, kafka, storm, or spark. This article uses the variable clusterType as the value passed to --type. See also: Cluster types and configuration.
    --component-version  The versions of various Hadoop components, as a space-separated list in 'component=version' format. This article uses the variable componentVersion as the value passed to --component-version. See also: Hadoop components.

    Replace RESOURCEGROUPNAME, LOCATION, CLUSTERNAME, STORAGEACCOUNTNAME, and PASSWORD with the desired values. Change the values of the other variables as desired. Then enter the CLI commands.

    export resourceGroupName=RESOURCEGROUPNAME
    export location=LOCATION
    export clusterName=CLUSTERNAME
    export AZURE_STORAGE_ACCOUNT=STORAGEACCOUNTNAME
    export httpCredential='PASSWORD'
    export sshCredentials='PASSWORD'
    
    export AZURE_STORAGE_CONTAINER=$clusterName
    export clusterSizeInNodes=1
    export clusterVersion=3.6
    export clusterType=hadoop
    export componentVersion=Hadoop=2.7
    
  3. Create the resource group by entering the command below:

    az group create \
        --location $location \
        --name $resourceGroupName
    

    For a list of valid locations, run the az account list-locations command and use one of the values from the name field.

  4. Create an Azure storage account by entering the command below:

    # Note: kind BlobStorage is not available as the default storage account.
    az storage account create \
        --name $AZURE_STORAGE_ACCOUNT \
        --resource-group $resourceGroupName \
        --https-only true \
        --kind StorageV2 \
        --location $location \
        --sku Standard_LRS
    
  5. Extract the primary key from the Azure storage account and store it in a variable by entering the command below:

    export AZURE_STORAGE_KEY=$(az storage account keys list \
        --account-name $AZURE_STORAGE_ACCOUNT \
        --resource-group $resourceGroupName \
        --query '[0].value' -o tsv)
    
  6. Create an Azure storage container by entering the command below:

    az storage container create \
        --name $AZURE_STORAGE_CONTAINER \
        --account-key $AZURE_STORAGE_KEY \
        --account-name $AZURE_STORAGE_ACCOUNT
    
  7. Create the HDInsight cluster by entering the following command:

    az hdinsight create \
        --name $clusterName \
        --resource-group $resourceGroupName \
        --type $clusterType \
        --component-version $componentVersion \
        --http-password "$httpCredential" \
        --http-user admin \
        --location $location \
        --size $clusterSizeInNodes \
        --ssh-password "$sshCredentials" \
        --ssh-user sshuser \
        --storage-account $AZURE_STORAGE_ACCOUNT \
        --storage-account-key $AZURE_STORAGE_KEY \
        --storage-default-container $AZURE_STORAGE_CONTAINER \
        --version $clusterVersion
    

    Important

    HDInsight clusters come in various types, which correspond to the workload or technology that the cluster is tuned for. There is no supported method to create a cluster that combines multiple types, such as Storm and HBase, on one cluster.

    It may take several minutes for the cluster creation process to finish, usually around 15.
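
Step 2 reuses the cluster name as the storage container name, so a value that is acceptable as a cluster name can still be rejected when the container is created. The sketch below checks candidate values before any az command runs. The function names are hypothetical, and the rules encoded here reflect the Azure Storage naming rules as commonly documented (account names: 3 to 24 lowercase letters and digits; container names: 3 to 63 lowercase letters, digits, and single hyphens, starting and ending with a letter or digit). Treat them as assumptions and confirm against the current Azure documentation.

```shell
# Sketch only: hypothetical pre-flight checks for the values set in step 2.
# The naming rules below are assumptions; verify them against current Azure docs.

# Storage account names: 3-24 characters, lowercase letters and digits only.
valid_storage_account_name() {
    [ "${#1}" -ge 3 ] && [ "${#1}" -le 24 ] || return 1
    case $1 in
        *[!abcdefghijklmnopqrstuvwxyz0123456789]*) return 1 ;;
    esac
}

# Container names: 3-63 characters; lowercase letters, digits, and hyphens;
# must start and end with a letter or digit; no consecutive hyphens.
valid_container_name() {
    [ "${#1}" -ge 3 ] && [ "${#1}" -le 63 ] || return 1
    case $1 in
        *[!abcdefghijklmnopqrstuvwxyz0123456789-]*) return 1 ;;
        -*|*-|*--*) return 1 ;;
    esac
}

# Derive a container name from a cluster name by lowercasing it.
to_container_name() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}
```

For example, if clusterName contains uppercase letters, setting AZURE_STORAGE_CONTAINER to the output of to_container_name "$clusterName" and then checking it with valid_container_name is one way to catch a bad value early.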

Clean up resources

After you complete the article, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it is not in use. You are also charged for an HDInsight cluster even when it is not in use. Because the charges for the cluster are many times higher than the charges for storage, it makes economic sense to delete clusters when they are not in use.

Enter all or some of the following commands to remove resources:

# Remove cluster
az hdinsight delete \
    --name $clusterName \
    --resource-group $resourceGroupName

# Remove storage container
az storage container delete \
    --account-name $AZURE_STORAGE_ACCOUNT \
    --name $AZURE_STORAGE_CONTAINER

# Remove storage account
az storage account delete \
    --name $AZURE_STORAGE_ACCOUNT \
    --resource-group $resourceGroupName

# Remove resource group
az group delete \
    --name $resourceGroupName
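
If everything created in this article lives in the one resource group, deleting that group removes the cluster, the storage account, and the container together. The following sketch wraps that in a hypothetical helper, delete_group_if_exists, which checks for the group first. It assumes az group exists prints true or false, and it uses the --yes and --no-wait options of az group delete to skip the confirmation prompt and return without waiting.

```shell
# Sketch only: delete_group_if_exists is a hypothetical helper name.
# Assumes "az group exists" prints the literal string true or false.
delete_group_if_exists() {
    group=$1
    if [ "$(az group exists --name "$group")" = "true" ]; then
        # --yes skips the confirmation prompt; --no-wait returns immediately.
        az group delete --name "$group" --yes --no-wait
    else
        echo "Resource group '$group' not found; nothing to delete."
    fi
}
```

Calling delete_group_if_exists "$resourceGroupName" is irreversible for every resource in the group, so use it only when the group holds nothing you want to keep.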

Troubleshoot

If you run into issues with creating HDInsight clusters, see access control requirements.
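
A quick first check when creation seems stuck is to ask the service what state the cluster is in. The sketch below uses az hdinsight show with a JMESPath query. The helper names are hypothetical, and the property path properties.clusterState and the value Running are assumptions about the command's output shape; inspect the full output of az hdinsight show if the query returns nothing.

```shell
# Sketch only: cluster_state and cluster_is_running are hypothetical helpers.
# Assumes the cluster resource exposes properties.clusterState.

# Print the reported state of a cluster. Arguments: cluster name, resource group.
cluster_state() {
    az hdinsight show \
        --name "$1" \
        --resource-group "$2" \
        --query properties.clusterState \
        --output tsv
}

# Succeeds only when the reported state is Running (assumed ready state).
cluster_is_running() {
    [ "$(cluster_state "$1" "$2")" = "Running" ]
}
```

For example, cluster_state "$clusterName" "$resourceGroupName" prints the current state, which you can compare against the access control requirements if the cluster never becomes ready.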

Next steps

Now that you have successfully created an HDInsight cluster using the Azure CLI, use the following to learn how to work with your cluster:

Apache Hadoop clusters

Apache HBase clusters

Apache Storm clusters