Create HDInsight clusters using the Azure CLI
This article walks through creating an HDInsight 3.6 cluster by using the Azure CLI.
Warning
Billing for HDInsight clusters is prorated per minute, whether you use them or not. Be sure to delete your cluster after you finish using it. See how to delete an HDInsight cluster.
If you don't have an Azure trial subscription, create one before you begin.
Prerequisites
- If you prefer, install the Azure CLI to run CLI reference commands.
- If you're using a local install, sign in with the Azure CLI by using the az login command. To finish the authentication process, follow the steps displayed in your terminal. See Sign in with Azure CLI for additional sign-in options.
- When you're prompted, install Azure CLI extensions on first use. For more information about extensions, see Use extensions with Azure CLI.
- Run az version to find the version and dependent libraries that are installed. To upgrade to the latest version, run az upgrade.
Create a cluster
Sign in to your Azure subscription.
az login
# If you have multiple subscriptions, set the one to use
# az account set --subscription "SUBSCRIPTIONID"
Set environment variables. The variables in this article use Bash syntax; other environments need slight variations. See az-hdinsight-create for a complete list of possible parameters for cluster creation.
| Parameter | Description |
|---|---|
| --workernode-count | The number of worker nodes in the cluster. This article uses the variable clusterSizeInNodes as the value passed to --workernode-count. |
| --version | The HDInsight cluster version. This article uses the variable clusterVersion as the value passed to --version. See also: Supported HDInsight versions. |
| --type | The type of HDInsight cluster, such as hadoop, interactivehive, hbase, kafka, storm, spark, rserver, or mlservices. This article uses the variable clusterType as the value passed to --type. See also: Cluster types and configuration. |
| --component-version | The versions of various Hadoop components, as a space-separated list in 'component=version' format. This article uses the variable componentVersion as the value passed to --component-version. See also: Hadoop components. |
Replace RESOURCEGROUPNAME, LOCATION, CLUSTERNAME, STORAGEACCOUNTNAME, and PASSWORD with the desired values. Change the values of the other variables as needed. Then enter the CLI commands.
export resourceGroupName=RESOURCEGROUPNAME
export location=LOCATION
export clusterName=CLUSTERNAME
export AZURE_STORAGE_ACCOUNT=STORAGEACCOUNTNAME
export httpCredential='PASSWORD'
export sshCredentials='PASSWORD'
export AZURE_STORAGE_CONTAINER=$clusterName
export clusterSizeInNodes=1
export clusterVersion=3.6
export clusterType=hadoop
export componentVersion=Hadoop=2.7
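Because a mistyped value only surfaces when a later command fails, it can help to check a few values before continuing. The following is a minimal sketch, not part of the Azure CLI: the function names are hypothetical, the storage account check reflects the Azure naming rule (3 to 24 characters, lowercase letters and digits only), and the 'component=version' pattern is an assumption based on the format described above. It checks a single component=version pair, not a space-separated list.

```shell
# Hypothetical pre-flight helpers; none of these functions ship with the Azure CLI.

# Storage account names must be 3-24 characters: lowercase letters and digits only.
is_valid_storage_account_name() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{3,24}$'
}

# Assumed shape of one --component-version entry: letters, '=', dotted version number.
is_valid_component_version() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z]+=[0-9]+(\.[0-9]+)*$'
}

# Returns non-zero if the value is empty or still equals its placeholder text.
is_placeholder_replaced() {
    value=$1
    placeholder=$2
    [ -n "$value" ] && [ "$value" != "$placeholder" ]
}
```

For example, is_valid_storage_account_name "$AZURE_STORAGE_ACCOUNT" returns a non-zero status for a name that contains uppercase letters or hyphens, and is_placeholder_replaced "$clusterName" CLUSTERNAME returns a non-zero status when the placeholder was left unchanged.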
Create the resource group by entering the following command:
az group create \
    --location $location \
    --name $resourceGroupName
For a list of valid locations, run the az account list-locations command, and then use one of the locations from the name value.
Create an Azure storage account by entering the following command:
# Note: kind BlobStorage is not available as the default storage account.
az storage account create \
    --name $AZURE_STORAGE_ACCOUNT \
    --resource-group $resourceGroupName \
    --https-only true \
    --kind StorageV2 \
    --location $location \
    --sku Standard_LRS
Extract the primary key from the Azure storage account and store it in a variable by entering the following command:
# The query is quoted so the shell doesn't treat [0] as a glob pattern.
export AZURE_STORAGE_KEY=$(az storage account keys list \
    --account-name $AZURE_STORAGE_ACCOUNT \
    --resource-group $resourceGroupName \
    --query "[0].value" -o tsv)
Create an Azure storage container by entering the following command:
az storage container create \
    --name $AZURE_STORAGE_CONTAINER \
    --account-key $AZURE_STORAGE_KEY \
    --account-name $AZURE_STORAGE_ACCOUNT
Create the HDInsight cluster by entering the following command:
# The password variables are quoted so special characters reach the CLI unchanged.
az hdinsight create \
    --name $clusterName \
    --resource-group $resourceGroupName \
    --type $clusterType \
    --component-version $componentVersion \
    --http-password "$httpCredential" \
    --http-user admin \
    --location $location \
    --workernode-count $clusterSizeInNodes \
    --ssh-password "$sshCredentials" \
    --ssh-user sshuser \
    --storage-account $AZURE_STORAGE_ACCOUNT \
    --storage-account-key $AZURE_STORAGE_KEY \
    --storage-container $AZURE_STORAGE_CONTAINER \
    --version $clusterVersion
Important
HDInsight clusters come in various types, which correspond to the workload or technology that the cluster is tuned for. There's no supported way to create a single cluster that combines multiple types, such as Storm and HBase.
Cluster creation can take several minutes to complete, usually around 15.
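If you check on the cluster from a second terminal, or from a script, a small polling loop can report when it becomes usable. The following is a minimal sketch under stated assumptions: wait_for_output is a hypothetical helper, not an Azure CLI command, and the commented example assumes that az hdinsight show exposes the state at properties.clusterState and reports Running for a ready cluster.

```shell
# Hypothetical helper: wait_for_output EXPECTED MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until its output equals EXPECTED or MAX_TRIES is reached.
wait_for_output() {
    expected=$1
    max_tries=$2
    delay=$3
    shift 3
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # A failing command is treated as "no state yet" rather than as a fatal error.
        actual=$("$@" 2>/dev/null) || actual=""
        if [ "$actual" = "$expected" ]; then
            return 0
        fi
        echo "Attempt $try/$max_tries: got '${actual:-unknown}', retrying in ${delay}s" >&2
        sleep "$delay"
        try=$((try + 1))
    done
    return 1
}

# Example call (assumes az is installed and the article's variables are set):
# wait_for_output Running 40 30 \
#     az hdinsight show --name "$clusterName" --resource-group "$resourceGroupName" \
#     --query properties.clusterState --output tsv
```

With 40 tries at 30-second intervals, the example call gives up after about 20 minutes, which leaves some margin over the typical creation time mentioned above.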
Clean up resources
After you complete the article, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're charged for an HDInsight cluster even when it isn't in use. Because the charges for the cluster are many times higher than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
Enter all or some of the following commands to remove resources:
# Remove cluster
az hdinsight delete \
--name $clusterName \
--resource-group $resourceGroupName
# Remove storage container
az storage container delete \
--account-name $AZURE_STORAGE_ACCOUNT \
--name $AZURE_STORAGE_CONTAINER
# Remove storage account
az storage account delete \
--name $AZURE_STORAGE_ACCOUNT \
--resource-group $resourceGroupName
# Remove resource group
az group delete \
--name $resourceGroupName
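Deleting the resource group removes every resource in it, so in a script the last command alone is often enough. The following is a minimal sketch of a scripted cleanup: run_or_print and cleanup_resource_group are hypothetical helpers, and DRY_RUN is a variable introduced here for illustration. The --yes flag skips the confirmation prompt and --no-wait returns without waiting for the deletion to finish.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run_or_print() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Deletes the named resource group and everything in it, without prompting.
cleanup_resource_group() {
    run_or_print az group delete --name "$1" --yes --no-wait
}
```

Setting DRY_RUN=1 before calling cleanup_resource_group "$resourceGroupName" prints the command instead of running it, which is a cheap way to confirm the target before anything is deleted.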
Troubleshoot
If you run into issues when creating HDInsight clusters, see access control requirements.
Next steps
Now that you've successfully created an HDInsight cluster by using the Azure CLI, use the following articles to learn how to work with your cluster:
Apache Hadoop clusters
- Use Apache Hive with HDInsight
- Use MapReduce with HDInsight
Apache HBase clusters
- Get started with Apache HBase on HDInsight
- Develop Java applications for Apache HBase on HDInsight