Create Linux-based clusters in HDInsight by using the Azure portal

The Azure portal is a web-based management tool for services and resources hosted in the Microsoft Azure cloud. In this article, you learn how to create Linux-based Azure HDInsight clusters by using the portal. Additional details are available in Create HDInsight clusters.

Warning

Billing for HDInsight clusters is prorated per minute, whether you use them or not. Be sure to delete your cluster after you finish using it. See how to delete an HDInsight cluster.
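
To make the per-minute proration concrete, here is a minimal sketch of the arithmetic. The hourly rate of 3.0 is a made-up figure for illustration, not an actual HDInsight price, and `prorated_cost` is a hypothetical helper, not part of any Azure SDK.

```python
def prorated_cost(hourly_rate: float, minutes_running: int) -> float:
    """Cost of a cluster billed per minute at a given hourly rate.

    The rate is a hypothetical input; look up real prices for your region.
    """
    if minutes_running < 0:
        raise ValueError("minutes_running must be non-negative")
    # Multiply before dividing so whole-number inputs stay exact.
    return hourly_rate * minutes_running / 60


# A cluster left running over a weekend (48 hours) accrues cost
# for every minute, whether or not any job ran on it.
weekend_cost = prorated_cost(hourly_rate=3.0, minutes_running=48 * 60)
print(f"{weekend_cost:.2f}")
```

The point of the sketch is that cost depends only on how long the cluster exists, which is why deleting an unused cluster matters.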

The Azure portal exposes most of the cluster properties. By using Azure Resource Manager templates, you can hide many details. For more information, see Create Apache Hadoop clusters in HDInsight by using Resource Manager templates.

If you don't have an Azure subscription, create a trial account before you begin.

Create clusters

Note

The secure transfer requirement on a storage account forces every request to the account to go through a secure connection. Only HDInsight cluster version 3.6 or newer supports this feature. For more information, see Create Apache Hadoop cluster with secure transfer storage accounts in Azure HDInsight.

  1. Sign in to the Azure portal.

  2. From the top menu, select + Create a resource.

    [Screenshot: Create a new cluster in the Azure portal]

  3. Select Analytics > Azure HDInsight to go to the Create HDInsight cluster page.

Basics

[Screenshot: HDInsight create cluster basics]

From the Basics tab, provide the following information:

| Property | Description |
|---|---|
| Subscription | From the drop-down list, select the Azure subscription that's used for the cluster. |
| Resource group | From the drop-down list, select your existing resource group, or select Create new. |
| Cluster name | Enter a globally unique name. |
| Region | From the drop-down list, select the region where the cluster is created. |
| Cluster type | Choose Select cluster type to open a list, and then select the cluster type that you want. HDInsight clusters come in different types. They correspond to the workload or technology that the cluster is tuned for. There's no supported method to create a cluster that combines multiple types. |
| Version | From the drop-down list, select a version. Use the default version if you don't know what to choose. For more information, see HDInsight cluster versions. |
| Cluster login username | Provide the username. The default is admin. |
| Cluster login password | Provide the password. |
| Confirm cluster login password | Reenter the password. |
| Secure Shell (SSH) username | Provide the username. The default is sshuser. |
| Use cluster login password for SSH | If you want the same SSH password as the admin password you specified earlier, select the Use cluster login password for SSH check box. If not, provide either a PASSWORD or PUBLIC KEY to authenticate the SSH user. A public key is the approach we recommend. Choose Select at the bottom to save the credentials configuration. For more information, see Connect to HDInsight (Apache Hadoop) by using SSH. |
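
As a planning aid, the Basics fields above can be collected in a small record before you open the portal. The following is a minimal sketch, not an Azure API: the `BasicsTab` class, its field names, and the two checks are assumptions that simply mirror the portal labels and the rules stated in the table (the confirmation must match the password, and SSH needs either the shared login password, a separate password, or a public key). All values are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BasicsTab:
    # Illustrative field names that mirror the portal labels.
    subscription: str
    resource_group: str
    cluster_name: str
    region: str
    cluster_type: str
    login_password: str
    confirm_login_password: str
    version: Optional[str] = None  # None stands for "use the default version"
    login_username: str = "admin"  # portal default
    ssh_username: str = "sshuser"  # portal default
    use_login_password_for_ssh: bool = False
    ssh_password: Optional[str] = None
    ssh_public_key: Optional[str] = None

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means none were found."""
        issues = []
        if self.login_password != self.confirm_login_password:
            issues.append("Cluster login password and confirmation differ.")
        has_ssh_credential = bool(self.ssh_password or self.ssh_public_key)
        if not self.use_login_password_for_ssh and not has_ssh_credential:
            issues.append("Provide an SSH password or a public key.")
        return issues


good = BasicsTab(
    subscription="example-subscription",
    resource_group="example-rg",
    cluster_name="example-cluster",
    region="East US",
    cluster_type="Hadoop",
    login_password="placeholder-password",
    confirm_login_password="placeholder-password",
    ssh_public_key="ssh-rsa PLACEHOLDER",
)
print(good.problems())
```

The sketch does not check global uniqueness of the cluster name or password complexity, because the portal performs those validations itself.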

Select Next: Storage >> to advance to the next tab.

Storage

Warning

Starting June 15, 2020, customers can't create new service principals by using HDInsight. See Create Service Principal and Certificates using Azure Active Directory.

[Screenshot: HDInsight create cluster storage]

Primary storage

From the drop-down list for Primary storage type, select your default storage type. The remaining fields vary based on your selection. For Azure Storage:

  1. For Selection method, choose either Select from list or Use access key.

    • For Select from list, select your Primary storage account from the drop-down list, or select Create new.
    • For Use access key, enter your Storage account name. Then provide the Access key.
  2. For Container, accept the default value, or enter a new one.

Additional Azure Storage

Optional: Select Add Azure Storage for additional cluster storage. Using an additional storage account in a different region than the HDInsight cluster isn't supported.
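
The same-region rule for additional storage is easy to check ahead of time. This is a minimal sketch under one assumption: that region names may be written either as display names ("East US") or as compact names ("eastus"), so the hypothetical helper normalizes both before comparing.

```python
def additional_storage_supported(cluster_region: str, storage_region: str) -> bool:
    """True when the additional storage account is in the cluster's region."""

    def normalize(region: str) -> str:
        # Assumption: "East US" and "eastus" name the same region.
        return region.replace(" ", "").lower()

    return normalize(cluster_region) == normalize(storage_region)


print(additional_storage_supported("East US", "eastus"))
print(additional_storage_supported("East US", "West US"))
```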

Metastore Settings

Optional: Specify an existing SQL Database to save Apache Hive, Apache Oozie, and/or Apache Ambari metadata outside of the cluster. The Azure SQL Database that's used for the metastore must allow connectivity to other Azure services, including Azure HDInsight. When you create a metastore, don't name a database with dashes or hyphens. These characters can cause the cluster creation process to fail.
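
Because a dash or hyphen in the metastore database name can make cluster creation fail, it can help to check candidate names up front. The following is a minimal sketch with a hypothetical helper; the exact set of characters treated as dashes is an assumption (the ASCII hyphen plus a few common Unicode hyphens and dashes).

```python
# Assumed set: ASCII hyphen-minus, Unicode hyphen, non-breaking hyphen,
# figure dash, en dash, and em dash.
DASH_CHARACTERS = "-\u2010\u2011\u2012\u2013\u2014"


def metastore_name_is_safe(database_name: str) -> bool:
    """True when the name contains none of the assumed dash characters."""
    return not any(ch in DASH_CHARACTERS for ch in database_name)


for name in ("hivemetastore01", "hive-metastore"):
    print(name, metastore_name_is_safe(name))
```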

Important

For cluster shapes that support metastores, the default metastore provides an Azure SQL Database with a basic tier 5 DTU limit (not upgradeable). It's suitable for basic testing purposes. For large or production workloads, we recommend migrating to an external metastore.

Select Next: Security + networking >> to advance to the next tab.

Security + networking

[Screenshot: HDInsight create cluster security networking]

From the Security + networking tab, provide the following information:

| Property | Description |
|---|---|
| Enterprise security package | Optional: Select the check box to use Enterprise Security Package. For more information, see Configure a HDInsight cluster with Enterprise Security Package by using Azure Active Directory Domain Services. |
| TLS | Optional: Select a TLS version from the drop-down list. For more information, see Transport Layer Security. |
| Virtual network | Optional: Select an existing virtual network and subnet from the drop-down list. For information, see Plan a virtual network deployment for Azure HDInsight clusters. That article includes specific configuration requirements for the virtual network. |
| Disk encryption settings | Optional: Select the check box to use encryption. For more information, see Customer-managed key disk encryption. |
| Kafka REST proxy | This setting is available only for the Kafka cluster type. For more information, see Using a REST proxy. |
| Identity | Optional: Select an existing user-assigned service identity from the drop-down list. For more information, see Managed identities in Azure HDInsight. |

Select Next: Configuration + pricing >> to advance to the next tab.

Configuration + pricing

[Screenshot: HDInsight create cluster configuration]

From the Configuration + pricing tab, provide the following information:

| Property | Description |
|---|---|
| + Add application | Optional: Select any applications that you want. These applications can be developed by Microsoft, by independent software vendors (ISVs), or by you. For more information, see Install applications during cluster creation. |
| Node size | Optional: Select a different-sized node. |
| Number of nodes | Optional: Enter the number of nodes for the specified node type. If you plan on more than 32 worker nodes, select a head node size with at least eight cores and 14 GB of RAM. Plan the nodes either at cluster creation or by scaling the cluster after creation. |
| Enable autoscale | Optional: Select the check box to enable the feature. For more information, see Automatically scale Azure HDInsight clusters. |
| + Add script action | Optional: Use this option to customize the cluster with a custom script while the cluster is being created. For more information about script actions, see Customize Linux-based HDInsight clusters by using script actions. |
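
The head node guidance for large clusters can be expressed as a simple check. This is a minimal sketch with a hypothetical helper that encodes only the rule stated above (more than 32 worker nodes calls for a head node with at least eight cores and 14 GB of RAM); it doesn't look up real VM sizes.

```python
def head_node_size_ok(worker_nodes: int, head_cores: int, head_ram_gb: float) -> bool:
    """Check a planned head node against the guidance for large clusters."""
    if worker_nodes <= 32:
        # The guidance applies only above 32 worker nodes.
        return True
    return head_cores >= 8 and head_ram_gb >= 14


print(head_node_size_ok(worker_nodes=40, head_cores=4, head_ram_gb=28))
print(head_node_size_ok(worker_nodes=40, head_cores=8, head_ram_gb=14))
```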

Select Review + create >> to validate the cluster configuration and advance to the final tab.

Review + create

[Screenshot: HDInsight create cluster summary]

Review the settings. Select Create to create the cluster.

Creating the cluster takes some time, usually around 20 minutes. Monitor Notifications to check on the provisioning process.
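
If you track provisioning from a script instead of watching Notifications, the wait is typically a polling loop with a timeout. The following is a generic sketch, not an Azure API: `get_state` is a hypothetical callable that you supply, the state names "Succeeded" and "Failed" are assumptions, and the 40-minute default timeout is simply twice the typical creation time mentioned above.

```python
import time
from typing import Callable


def wait_for_cluster(
    get_state: Callable[[], str],
    timeout_s: float = 40 * 60,
    interval_s: float = 60,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_state until it reports a final state or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state in ("Succeeded", "Failed"):  # assumed final state names
            return state
        if clock() >= deadline:
            raise TimeoutError(f"cluster still in state {state!r} after {timeout_s} s")
        sleep(interval_s)


# Demonstration with a fake state source and a no-op sleep,
# so the example finishes immediately.
fake_states = iter(["InProgress", "InProgress", "Succeeded"])
result = wait_for_cluster(lambda: next(fake_states), sleep=lambda seconds: None)
print(result)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays.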

Post creation

After the creation process finishes, select Go to Resource from the Deployment succeeded notification. The cluster window provides the following information.

[Screenshot: HDI Azure portal cluster overview]

Some of the icons in the window are explained as follows:

| Property | Description |
|---|---|
| Overview | Provides all the essential information about the cluster. Examples are the name, the resource group it belongs to, the location, the operating system, and the URL for the cluster dashboard. |
| Cluster dashboards | Directs you to the Ambari portal associated with the cluster. |
| SSH + Cluster login | Provides the information needed to access the cluster by using SSH. |
| Delete | Deletes the HDInsight cluster. |

Delete the cluster

See Delete an HDInsight cluster using your browser, PowerShell, or the Azure CLI.

Troubleshoot

If you run into issues with creating HDInsight clusters, see access control requirements.

Next steps

You've successfully created an HDInsight cluster. Now learn how to work with your cluster.