Create HDInsight clusters using the Azure CLI
The steps in this document walk through creating an HDInsight 4.0 cluster using the Azure CLI.
Warning
Billing for HDInsight clusters is prorated per minute, whether you use them or not. Be sure to delete your cluster after you finish using it. See how to delete an HDInsight cluster.
If you don't have an Azure subscription, create an Azure trial subscription before you begin.
Prerequisites
If you prefer to run CLI reference commands locally, install the Azure CLI. If you're running on Windows or macOS, consider running Azure CLI in a Docker container. For more information, see How to run the Azure CLI in a Docker container.
If you're using a local installation, sign in to the Azure CLI by using the az login command. To finish the authentication process, follow the steps displayed in your terminal. For other sign-in options, see Sign in with the Azure CLI.
When you're prompted, install the Azure CLI extension on first use. For more information about extensions, see Use extensions with the Azure CLI.
Run az version to find the version and dependent libraries that are installed. To upgrade to the latest version, run az upgrade.
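As a quick pre-flight check for the prerequisites above, a script can confirm that the az command is on the PATH before it runs anything else. This is a minimal sketch; the helper name require_cli is hypothetical and is not part of the Azure CLI.

```shell
# Hypothetical helper: succeed only if the named command is on the PATH.
require_cli() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "Required command not found: $1" >&2
        return 1
    fi
}
```

For example, `require_cli az && az version` prints the installed version only when the CLI is available.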
Create a cluster
Log in to your Azure subscription.
az cloud set -n AzureChinaCloud
az login
# To return to public Azure, run: az cloud set -n AzureCloud

# If you have multiple subscriptions, set the one to use
# az account set --subscription "SUBSCRIPTIONID"
Set environment variables. The use of variables in this article is based on Bash. Slight variations are needed for other environments. See az-hdinsight-create for a complete list of possible parameters for cluster creation.
Parameter | Description
--workernode-count | The number of worker nodes in the cluster. This article uses the variable clusterSizeInNodes as the value passed to --workernode-count.
--version | The HDInsight cluster version. This article uses the variable clusterVersion as the value passed to --version. See also: Supported HDInsight versions.
--type | Type of HDInsight cluster, like: hadoop, interactivehive, hbase, kafka, spark, rserver, mlservices. This article uses the variable clusterType as the value passed to --type. See also: Cluster types and configuration.
--component-version | The versions of various Hadoop components, as space-separated entries in 'component=version' format. This article uses the variable componentVersion as the value passed to --component-version. See also: Hadoop components.
Replace RESOURCEGROUPNAME, LOCATION, CLUSTERNAME, STORAGEACCOUNTNAME, and PASSWORD with the desired values. Change values for the other variables as desired. Then enter the CLI commands.
export resourceGroupName=RESOURCEGROUPNAME
export location=LOCATION
export clusterName=CLUSTERNAME
export AZURE_STORAGE_ACCOUNT=STORAGEACCOUNTNAME
export httpCredential='PASSWORD'
export sshCredentials='PASSWORD'
export AZURE_STORAGE_CONTAINER=$clusterName
export clusterSizeInNodes=1
export clusterVersion=4.0
export clusterType=hadoop
export componentVersion=Hadoop=3.1
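Because a forgotten placeholder only surfaces later as a failed az command, it can help to check the variables before creating anything. The sketch below is illustrative: validate_settings is a hypothetical helper, and the only service rule it assumes is that Azure Storage account names are 3 to 24 lowercase letters and digits.

```shell
# Hypothetical helper: check the variables before any resources are created.
# Assumes the Azure Storage naming rule: 3-24 lowercase letters and digits.
validate_settings() {
    rc=0
    for name in resourceGroupName location clusterName AZURE_STORAGE_ACCOUNT httpCredential sshCredentials; do
        # Read the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        case "$value" in
            ''|RESOURCEGROUPNAME|LOCATION|CLUSTERNAME|STORAGEACCOUNTNAME|PASSWORD)
                echo "Set a real value for $name" >&2
                rc=1
                ;;
        esac
    done
    if ! printf '%s' "${AZURE_STORAGE_ACCOUNT:-}" | grep -Eq '^[a-z0-9]{3,24}$'; then
        echo "AZURE_STORAGE_ACCOUNT must be 3-24 lowercase letters or digits" >&2
        rc=1
    fi
    return "$rc"
}
```

Running `validate_settings || exit 1` right after the export lines stops a script before it creates any billable resources.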
Create the resource group by entering the following command:
az group create \
    --location $location \
    --name $resourceGroupName
For a list of valid locations, use the az account list-locations command, and then use one of the locations from the name value.
Create an Azure Storage account by entering the following command:
# Note: kind BlobStorage is not available as the default storage account.
az storage account create \
    --name $AZURE_STORAGE_ACCOUNT \
    --resource-group $resourceGroupName \
    --https-only true \
    --kind StorageV2 \
    --location $location \
    --sku Standard_LRS
Extract the primary key from the Azure Storage account and store it in a variable by entering the following command:
# The query is quoted so the shell does not treat [0] as a glob pattern.
export AZURE_STORAGE_KEY=$(az storage account keys list \
    --account-name $AZURE_STORAGE_ACCOUNT \
    --resource-group $resourceGroupName \
    --query "[0].value" -o tsv)
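If the key lookup fails, the variable is left empty and the later steps fail with less obvious errors. A small guard, sketched here with the hypothetical helper name require_storage_key, makes that failure explicit.

```shell
# Hypothetical helper: fail early if the key lookup returned nothing.
require_storage_key() {
    if [ -z "${AZURE_STORAGE_KEY:-}" ]; then
        echo "AZURE_STORAGE_KEY is empty; check the account name and resource group" >&2
        return 1
    fi
}
```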
Create an Azure Storage container by entering the following command:
az storage container create \
    --name $AZURE_STORAGE_CONTAINER \
    --account-key $AZURE_STORAGE_KEY \
    --account-name $AZURE_STORAGE_ACCOUNT
Create the HDInsight cluster by entering the following command:
# The credentials are quoted so that special characters in a password are passed through unchanged.
az hdinsight create \
    --name $clusterName \
    --resource-group $resourceGroupName \
    --type $clusterType \
    --component-version $componentVersion \
    --http-password "$httpCredential" \
    --http-user admin \
    --location $location \
    --workernode-count $clusterSizeInNodes \
    --ssh-password "$sshCredentials" \
    --ssh-user sshuser \
    --storage-account $AZURE_STORAGE_ACCOUNT \
    --storage-account-key $AZURE_STORAGE_KEY \
    --storage-container $AZURE_STORAGE_CONTAINER \
    --version $clusterVersion
Important
HDInsight clusters come in various types, which correspond to the workload or technology that the cluster is tuned for. There is no supported method to create a cluster that combines multiple types, such as HBase together with another type on one cluster.
Cluster creation may take several minutes to complete, usually around 15.
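When creation is started without blocking the terminal (for example with the --no-wait flag) or is monitored from a second terminal, the state can be polled until the cluster is ready. The following is a sketch under assumptions: wait_for_cluster is a hypothetical helper, and it assumes that az hdinsight show reports the state at properties.clusterState with the values Running and Error.

```shell
# Hypothetical helper: poll until the cluster reports "Running".
# Assumes `az hdinsight show` exposes the state at properties.clusterState.
wait_for_cluster() {
    attempts=${1:-40}    # number of polls before giving up
    interval=${2:-30}    # seconds between polls
    while [ "$attempts" -gt 0 ]; do
        state=$(az hdinsight show \
            --name "$clusterName" \
            --resource-group "$resourceGroupName" \
            --query properties.clusterState -o tsv)
        echo "Cluster state: $state"
        case "$state" in
            Running) return 0 ;;
            Error) return 1 ;;
        esac
        attempts=$((attempts - 1))
        sleep "$interval"
    done
    echo "Timed out waiting for the cluster" >&2
    return 1
}
```

With the defaults, `wait_for_cluster` polls every 30 seconds for up to 20 minutes, which leaves some margin over the typical creation time.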
Clean up resources
After you complete the article, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're also charged for an HDInsight cluster, even when it's not in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
Enter all or some of the following commands to remove resources:
# Remove cluster
az hdinsight delete \
    --name $clusterName \
    --resource-group $resourceGroupName

# Remove storage container
az storage container delete \
    --account-name $AZURE_STORAGE_ACCOUNT \
    --name $AZURE_STORAGE_CONTAINER

# Remove storage account
az storage account delete \
    --name $AZURE_STORAGE_ACCOUNT \
    --resource-group $resourceGroupName

# Remove resource group
az group delete \
    --name $resourceGroupName
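If everything created in this article lives in the one resource group, deleting that group removes the cluster, the storage account, and its containers in a single call. The sketch below wraps that call in a hypothetical helper, delete_resource_group, with a DRY_RUN switch (also hypothetical) that prints the command instead of running it.

```shell
# Hypothetical helper: delete the whole resource group in one call.
# Set DRY_RUN=1 to print the command instead of running it.
delete_resource_group() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "az group delete --name $resourceGroupName --yes --no-wait"
        return 0
    fi
    # --yes skips the confirmation prompt; --no-wait returns without waiting for completion.
    az group delete --name "$resourceGroupName" --yes --no-wait
}
```

Running `DRY_RUN=1 delete_resource_group` first is a cheap way to confirm which group would be removed.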
Troubleshoot
If you run into issues with creating HDInsight clusters, see access control requirements.
Next steps
Now that you've successfully created an HDInsight cluster using the Azure CLI, use the following to learn how to work with your cluster: