Quickstart: Create an Apache Hadoop cluster in Azure HDInsight using a Resource Manager template

In this quickstart, you learn how to create an Apache Hadoop cluster in Azure HDInsight using a Resource Manager template.

Similar templates can be viewed at Azure quickstart templates. The template reference can be found here. You can also create a cluster using the Azure portal.

HDInsight currently comes with seven different cluster types. Each cluster type supports a different set of components. All cluster types support Hive. For a list of supported components in HDInsight, see What's new in the Hadoop cluster versions provided by HDInsight?

If you don't have an Azure subscription, create a trial account before you begin.

Create a Hadoop cluster

  1. Select the Deploy to Azure button below to sign in to Azure and open the Resource Manager template in the Azure portal.

    Deploy to Azure

    Enter or select the following values:

    Property: Description
    Subscription: Select your Azure subscription.
    Resource group: Create a resource group or select an existing one. A resource group is a container of Azure components. In this case, the resource group contains the HDInsight cluster and the dependent Azure Storage account.
    Location: Select the Azure location where you want to create your cluster. Choose a location close to you for better performance.
    Cluster Name: Enter a name for the Hadoop cluster. Because all clusters in HDInsight share the same DNS namespace, this name must be unique. The name may contain only lowercase letters, numbers, and hyphens, and must begin with a letter. Each hyphen must be preceded and followed by a non-hyphen character. The name must also be between 3 and 59 characters long.
    Cluster Type: Select hadoop.
    Cluster login name and password: The default login name is admin. The password must be at least 10 characters long and must contain at least one digit, one uppercase letter, one lowercase letter, and one non-alphanumeric character (excluding the characters ' " `). Do not use a common password such as "Pass@word1".
    SSH username and password: The default username is sshuser. You can change the SSH username. The SSH user password has the same requirements as the cluster login password.

    Some properties are hardcoded in the template. You can change these values by editing the template. For more explanation of these properties, see Create Apache Hadoop clusters in HDInsight.

    Note

    The values you provide must be unique and should follow the naming guidelines. The template does not perform validation checks. If the values you provide are already in use, or do not follow the guidelines, you get an error after you submit the template.

    Screenshot: HDInsight Linux getting started Resource Manager template in the portal

  2. Select I agree to the terms and conditions stated above, and then select Purchase. You receive a notification that your deployment is in progress. It takes about 20 minutes to create a cluster.

  3. Once the cluster is created, you receive a Deployment succeeded notification with a Go to resource group link. The Resource group page lists your new HDInsight cluster and the default storage associated with the cluster. Each cluster has an Azure Storage account dependency, referred to as the default storage account. The HDInsight cluster and its default storage account must be colocated in the same Azure region. Deleting a cluster does not delete the storage account.
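
Because the template does not validate the values entered in step 1, it can help to check the cluster name and passwords locally before submitting. The following is a minimal sketch of the naming and password rules listed in the property table, not an official validator. The function names are hypothetical, the deny list contains only the one example password from this quickstart, and the sketch assumes the stricter reading of the password rule, rejecting any password that contains ' " or `. Uniqueness of the cluster name cannot be checked this way.

```python
import re
import string

# Cluster name rule from the property table: lowercase letters, digits, and
# hyphens; starts with a letter; every hyphen sits between non-hyphen characters.
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*")

# Assumption: these three characters are rejected outright (stricter reading).
_EXCLUDED_CHARS = set("'\"`")

# Hypothetical deny list; only "Pass@word1" comes from the quickstart text.
_COMMON_PASSWORDS = {"Pass@word1"}


def is_valid_cluster_name(name: str) -> bool:
    """Return True if the name is 3 to 59 characters and matches the pattern."""
    return 3 <= len(name) <= 59 and _NAME_PATTERN.fullmatch(name) is not None


def is_valid_password(password: str) -> bool:
    """Return True if the password meets the length and character-class rules."""
    if len(password) < 10 or password in _COMMON_PASSWORDS:
        return False
    if any(ch in _EXCLUDED_CHARS for ch in password):
        return False
    alphanumeric = string.ascii_letters + string.digits
    has_digit = any(ch in string.digits for ch in password)
    has_upper = any(ch in string.ascii_uppercase for ch in password)
    has_lower = any(ch in string.ascii_lowercase for ch in password)
    has_symbol = any(ch not in alphanumeric for ch in password)
    return has_digit and has_upper and has_lower and has_symbol
```

For example, `is_valid_cluster_name("my-hadoop-01")` passes, while `"a--b"`, `"1abc"`, and `"abc-"` are rejected. A password that passes this stricter check also satisfies the looser reading, in which the three excluded characters are allowed but do not count toward the non-alphanumeric requirement.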

Note

For other cluster creation methods and an explanation of the properties used in this quickstart, see Create HDInsight clusters.
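
When a template is deployed outside the portal form, the same property values are usually supplied in a Resource Manager parameters file. The sketch below shows the general shape of such a file. The parameter names (`clusterName`, `clusterType`, `clusterLoginUserName`, `sshUserName`) are assumptions and must be checked against the template you deploy, as should the schema URL. Passwords are deliberately left out so that no secret ends up in a file.

```python
import json

# Assumption: schema URL and contentVersion follow the general ARM
# parameters-file layout; confirm both against the template you deploy.
PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/"
    "deploymentParameters.json#"
)


def build_parameters(values: dict) -> dict:
    """Wrap plain key/value pairs in the {"value": ...} envelope."""
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {key: {"value": value} for key, value in values.items()},
    }


# Hypothetical parameter names; the values are the defaults named in this
# quickstart plus a placeholder cluster name.
example = build_parameters({
    "clusterName": "my-hadoop-01",
    "clusterType": "hadoop",
    "clusterLoginUserName": "admin",
    "sshUserName": "sshuser",
})

print(json.dumps(example, indent=2))
```

Keeping the values in a dictionary and wrapping them in one place makes it easy to validate them first and to supply passwords separately at deployment time.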

Clean up resources

After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it is not in use. You are charged for an HDInsight cluster even when it is not in use. Because the charges for the cluster are many times higher than the charges for storage, it makes economic sense to delete clusters when they are not in use.

Note

If you are proceeding immediately to the next tutorial to learn how to run ETL operations using Hadoop on HDInsight, you may want to keep the cluster running, because otherwise you have to create a Hadoop cluster again for that tutorial. If you are not going through the next tutorial right away, delete the cluster now.

To delete the cluster and/or the default storage account

  1. Go back to the browser tab where you have the Azure portal. You should be on the cluster overview page. If you only want to delete the cluster but retain the default storage account, select Delete.

    Screenshot: HDInsight delete cluster

  2. If you want to delete the cluster as well as the default storage account, select the resource group name (highlighted in the previous screenshot) to open the resource group page.

  3. Select Delete resource group to delete the resource group, which contains the cluster and the default storage account. Note that deleting the resource group also deletes the storage account. If you want to keep the storage account, delete only the cluster.

Next steps

In this quickstart, you learned how to create an Apache Hadoop cluster in HDInsight using a Resource Manager template. In the next article, you learn how to perform an extract, transform, and load (ETL) operation using Hadoop on HDInsight.