Manage Apache Hadoop clusters in HDInsight by using the Azure portal

Using the Azure portal, you can manage Apache Hadoop clusters in Azure HDInsight.

Prerequisites

An existing Apache Hadoop cluster in HDInsight. See Create Linux-based clusters in HDInsight using the Azure portal.

Getting started

Sign in to https://portal.azure.cn.

List and show clusters

The HDInsight clusters page lists your existing clusters. From the portal:

  1. Select All services from the left menu.
  2. Select HDInsight clusters under ANALYTICS.
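The same listing can also be produced from a terminal. The following is a minimal sketch, assuming the Azure CLI is installed and `az login` has already succeeded; the function name `list_clusters` and the resource group argument are illustrative, not part of the portal workflow. Because the sign-in URL above is the Azure China portal, the CLI would likely need to target that cloud first (`az cloud set --name AzureChinaCloud`).

```shell
#!/usr/bin/env bash
# Sketch: list HDInsight clusters with the Azure CLI instead of the portal.
# Assumption: the Azure CLI is installed and already signed in.

list_clusters() {
    # Optional first argument: a resource group name used as a filter
    # (hypothetical value supplied by the caller).
    local resource_group="${1:-}"
    if [ -n "$resource_group" ]; then
        az hdinsight list --resource-group "$resource_group" --output table
    else
        az hdinsight list --output table
    fi
}
```

Calling `list_clusters` with no argument lists every cluster in the current subscription; `list_clusters my-rg` restricts the list to one resource group.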

Cluster home page

Select your cluster name from the HDInsight clusters page. This opens the Overview view, which looks similar to the following image:

[Image: HDInsight cluster essentials in the Azure portal]

Top menu:

Item | Description
Move | Moves the cluster to another resource group or to another subscription.
Delete | Deletes the cluster.
Refresh | Refreshes the view.

Left menu:

  • Top-left menu

    Item | Description
    Overview | Provides general information for your cluster.
    Activity log | Shows and queries activity logs.
    Access control (IAM) | Use role assignments. See Use role assignments to manage access to your Azure subscription resources.
    Tags | Lets you set key/value pairs to define a custom taxonomy of your cloud services. For example, you can create a key named project, and then use a common value for all services associated with a specific project.
    Diagnose and solve problems | Displays troubleshooting information.
    Quickstart | Displays information that helps you get started using HDInsight.
    Tools | Help information for HDInsight-related tools.
  • Settings menu

    Item | Description
    Cluster size | Check, increase, and decrease the number of cluster worker nodes. See Scale clusters.
    Quota limits | Displays the used and available cores for your subscription.
    SSH + Cluster login | Shows the instructions for connecting to the cluster over Secure Shell (SSH). For more information, see Use SSH with HDInsight.
    Storage accounts | View the storage accounts and the keys. The storage accounts are configured during the cluster creation process.
    Applications | Add or remove HDInsight applications. See Install custom HDInsight applications.
    Script actions | Run Bash scripts on the cluster. See Customize Linux-based HDInsight clusters using Script Action.
    External metastores | View the Apache Hive and Apache Oozie metastores. The metastores can only be configured during the cluster creation process.
    HDInsight partner | Add or remove the current HDInsight partner.
    Properties | View the cluster properties.
    Locks | Add a lock to prevent the cluster from being modified or deleted.
    Export template | Display and export the Azure Resource Manager template for the cluster. Currently, you can only export the dependent Azure storage account. See Create Linux-based Apache Hadoop clusters in HDInsight using Azure Resource Manager templates.
  • Monitoring menu

    Item | Description
    Alerts | Manage the alerts and actions.
    Metrics | Monitor the cluster metrics in Azure Monitor logs.
    Diagnostic settings | Settings for where to store the diagnostic metrics.
    Azure Monitor | Monitor your cluster in Azure Monitor.
  • Support + troubleshooting menu

    Item | Description
    Resource health | See Azure resource health overview.
    New support request | Lets you create a support ticket with Microsoft support.

Cluster properties

From the cluster home page, under Settings, select Properties.

Item | Description
Hostname | Cluster name.
Cluster URL | The URL for the Ambari web interface.
Private Endpoint | The private endpoint for the cluster.
Secure shell (SSH) | The username and host name to use when accessing the cluster via SSH.
Status | One of: Aborted, Accepted, ClusterStorageProvisioned, AzureVMConfiguration, HDInsightConfiguration, Operational, Running, Error, Deleting, Deleted, Timedout, DeleteQueued, DeleteTimedout, DeleteError, PatchQueued, CertRolloverQueued, ResizeQueued, or ClusterCustomization.
Region | Azure location. For a list of supported Azure locations, see the Region drop-down list box on HDInsight pricing.
DATE CREATED | The date the cluster was deployed.
OPERATING SYSTEM | Either Windows or Linux.
TYPE | Hadoop, HBase, Storm, Spark.
Version | See HDInsight versions.
Minimum TLS version | The TLS version.
SUBSCRIPTION | Subscription name.
DEFAULT DATA SOURCE | The default cluster file system.
Worker nodes sizes | The selected VM size of the worker nodes.
Head node size | The selected VM size of the head nodes.
Virtual network | The name of the virtual network that the cluster is deployed to, if one was selected at deployment time.

Move clusters

You can move an HDInsight cluster to another Azure resource group or another subscription.

From the cluster home page:

  1. Select Move from the top menu.
  2. Select Move to another resource group or Move to another subscription.
  3. Follow the instructions on the new page.
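For scripted moves, the generic Azure CLI resource move command is one possible alternative to the portal steps. This is a sketch under assumptions: the Azure CLI is installed and signed in, the caller already has the cluster's full resource ID, and the function name `move_cluster` is hypothetical. Whether a particular cluster can be moved still depends on the same service-side checks the portal applies.

```shell
#!/usr/bin/env bash
# Sketch: move a cluster to another resource group, and optionally to another
# subscription, by using the generic `az resource move` command.

move_cluster() {
    local cluster_id="$1"             # full Azure resource ID of the cluster
    local dest_group="$2"             # destination resource group name
    local dest_subscription="${3:-}"  # optional destination subscription ID
    if [ -n "$dest_subscription" ]; then
        az resource move --ids "$cluster_id" \
            --destination-group "$dest_group" \
            --destination-subscription-id "$dest_subscription"
    else
        az resource move --ids "$cluster_id" --destination-group "$dest_group"
    fi
}
```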

Delete clusters

Deleting a cluster doesn't delete the default storage account or any linked storage accounts. You can re-create the cluster by using the same storage accounts and the same metastores. We recommend using a new default Blob container when you re-create the cluster.

From the cluster home page:

  1. Select Delete from the top menu.
  2. Follow the instructions on the new page.
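Deletion can also be scripted, which is useful when clusters are removed on a schedule. The sketch below assumes the Azure CLI is installed and signed in; `delete_cluster` and its argument order are illustrative choices. As with the portal, the storage accounts are left in place.

```shell
#!/usr/bin/env bash
# Sketch: delete an HDInsight cluster from a script with the Azure CLI.

delete_cluster() {
    local resource_group="${1:-}"
    local cluster="${2:-}"
    if [ -z "$resource_group" ] || [ -z "$cluster" ]; then
        echo "usage: delete_cluster <resource-group> <cluster-name>" >&2
        return 2
    fi
    # --yes skips the interactive confirmation prompt, so call this with care.
    az hdinsight delete --resource-group "$resource_group" --name "$cluster" --yes
}
```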

See also Pause/shut down clusters.

Add additional storage accounts

You can add additional Azure Storage accounts and Azure Data Lake Storage accounts after a cluster is created. For more information, see Add additional storage accounts to HDInsight.

Scale clusters

The cluster scaling feature lets you change the number of worker nodes used by an Azure HDInsight cluster without having to re-create the cluster.

See Scale HDInsight clusters for complete information.
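As an illustration of scaling outside the portal, the following sketch wraps the Azure CLI resize command. It assumes the Azure CLI is installed and signed in; the function name `resize_cluster` and the input check are illustrative additions rather than anything the service requires.

```shell
#!/usr/bin/env bash
# Sketch: change the number of worker nodes with the Azure CLI.

resize_cluster() {
    local resource_group="$1"
    local cluster="$2"
    local count="$3"
    # Reject empty or non-numeric counts before calling the service.
    case "$count" in
        ''|*[!0-9]*)
            echo "worker node count must be a non-negative integer" >&2
            return 1
            ;;
    esac
    az hdinsight resize --resource-group "$resource_group" \
        --name "$cluster" --workernode-count "$count"
}
```

For example, `resize_cluster my-rg my-cluster 5` would request five worker nodes for a hypothetical cluster named my-cluster.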

Pause/shut down clusters

Most Hadoop jobs are batch jobs that run only occasionally, so most Hadoop clusters spend long periods doing no processing. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You are charged for an HDInsight cluster even when it isn't in use. Because the charges for the cluster are many times higher than the charges for storage, it makes economic sense to delete clusters when they aren't in use.

You can automate this process in several ways.

For pricing information, see HDInsight pricing. To delete a cluster from the portal, see Delete clusters.

Upgrade clusters

See Upgrade HDInsight cluster to a newer version.

Open the Apache Ambari web UI

Ambari provides an intuitive, easy-to-use Hadoop management web UI backed by its RESTful APIs. Ambari enables system administrators to manage and monitor Hadoop clusters.

From the cluster home page:

  1. Select Cluster dashboards.

    [Image: HDInsight Hadoop cluster menu]

  2. Select Ambari home from the new page.

  3. Enter the cluster username and password. The default cluster username is admin.

For more information, see Manage HDInsight clusters by using the Apache Ambari Web UI.
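Because the web UI is backed by REST APIs, the same cluster login can be used from a script. The sketch below builds the Ambari REST base URL for a cluster and issues a request with curl. The host pattern `CLUSTERNAME.azurehdinsight.net` is an assumption that applies to global Azure; clusters created through portal.azure.cn are likely under a different domain such as `azurehdinsight.cn`, so the domain is a parameter. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: query the Ambari REST API that backs the Ambari web UI.

ambari_api_url() {
    local cluster="$1"
    # Assumed default domain; pass another one for sovereign clouds.
    local domain="${2:-azurehdinsight.net}"
    printf 'https://%s.%s/api/v1/clusters/%s\n' "$cluster" "$domain" "$cluster"
}

ambari_get_cluster() {
    local cluster="$1"
    local user="${2:-admin}"   # default cluster login name
    local domain="${3:-azurehdinsight.net}"
    # With only a user name given to -u, curl prompts for the password,
    # which keeps the password out of the shell history.
    curl -sS -u "$user" "$(ambari_api_url "$cluster" "$domain")"
}
```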

Change passwords

An HDInsight cluster can have two user accounts: the HDInsight cluster user account (HTTP user account) and the SSH user account. Both are created during cluster creation. You can use the portal to change the cluster user account password, and a script action to change the SSH user account password.

Change the cluster user password

Note

Changing the cluster user (admin) password may cause script actions run against this cluster to fail. If you have any persisted script actions that target worker nodes, these scripts may fail when you add nodes to the cluster through resize operations. For more information on script actions, see Customize HDInsight clusters using script actions.

From the cluster home page:

  1. Select SSH + Cluster login under Settings.
  2. Select Reset credential.
  3. Enter and confirm the new password in the text boxes.
  4. Select OK.

The password is changed on all nodes in the cluster.

Change the SSH user password

  1. Using a text editor, save the following text as a file named changepassword.sh.

    Important

    You must use an editor that uses LF as the line ending. If the editor uses CRLF, the script doesn't work.

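    #!/bin/bash
    # Usage: changepassword.sh <ssh-user> <new-password>
    USER="$1"
    PASS="$2"
    # Hash the new password with openssl and set it for the given user.
    usermod --password "$(printf '%s\n' "$PASS" | openssl passwd -1 -stdin)" "$USER"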
    
  2. Upload the file to a storage location that can be accessed from HDInsight using an HTTP or HTTPS address, for example a public file store such as OneDrive or Azure Blob storage. Note the URI (HTTP or HTTPS address) of the file, because it's needed in a later step.

  3. From the cluster home page, select Script actions under Settings.

  4. From the Script actions page, select Submit new.

  5. From the Submit script action page, enter the following information:

    Field | Value
    Script type | Select - Custom from the drop-down list.
    Name | "Change ssh password"
    Bash script URI | The URI of the changepassword.sh file
    Node type(s) (Head, Worker, Nimbus, Supervisor, or ZooKeeper) | Select (✓) every node type listed
    Parameters | Enter the SSH user name and then the new password. There should be one space between the user name and the password.
    Persist this script action ... | Leave this field unchecked.
  6. Select Create to apply the script. Once the script finishes, you can connect to the cluster using SSH with the new password.
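The line-ending requirement for changepassword.sh can be checked before the file is uploaded. The following sketch uses only standard tools (grep and tr); the function names `has_crlf` and `to_lf` are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: detect and remove CRLF line endings before uploading a script.

has_crlf() {
    # Succeeds (exit status 0) if the file contains a carriage return.
    local cr
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

to_lf() {
    # Writes a copy of the file in $1 to $2 with carriage returns removed.
    tr -d '\r' < "$1" > "$2"
}
```

A possible use is `has_crlf changepassword.sh && to_lf changepassword.sh changepassword.lf.sh`, followed by uploading the converted copy.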

Find the subscription ID

Each cluster is tied to an Azure subscription. The Azure subscription ID is visible from the cluster home page.

Find the resource group

In Azure Resource Manager mode, each HDInsight cluster is created in an Azure resource group. The resource group is visible from the cluster home page.
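Both values are also embedded in the cluster's Azure resource ID, which has the form `/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.HDInsight/clusters/<cluster-name>`. The sketch below extracts them from such an ID; the function names are hypothetical. With the Azure CLI installed and signed in, the ID itself could be fetched with `az hdinsight show --name <cluster-name> --resource-group <resource-group> --query id --output tsv`.

```shell
#!/usr/bin/env bash
# Sketch: pull the subscription ID and resource group out of a resource ID.
# Splitting on "/" yields: field 1 is empty, field 2 is "subscriptions",
# field 3 is the subscription ID, field 4 is "resourceGroups",
# and field 5 is the resource group name.

subscription_from_id() {
    printf '%s\n' "$1" | cut -d/ -f3
}

resource_group_from_id() {
    printf '%s\n' "$1" | cut -d/ -f5
}
```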

Find the storage accounts

HDInsight clusters use either an Azure Storage account or Azure Data Lake Storage to store data. Each HDInsight cluster can have one default storage account and a number of linked storage accounts. To list the storage accounts, from the cluster home page under Settings, select Storage accounts.

Monitor jobs

See Manage HDInsight clusters by using the Apache Ambari Web UI.

Cluster size

The Cluster size tile on the cluster home page displays the number of cores allocated to this cluster and how they are allocated across the nodes within the cluster.

Important

To monitor the services provided by the HDInsight cluster, you must use Ambari Web or the Ambari REST API. For more information on using Ambari, see Manage HDInsight clusters using Apache Ambari.

Connect to a cluster

Next steps

In this article, you learned some basic administrative functions. To learn more, see the following articles: