Automatically scale Azure HDInsight clusters

Azure HDInsight's free Autoscale feature can automatically increase or decrease the number of worker nodes in your cluster based on previously set criteria. You set a minimum and maximum number of nodes during cluster creation, establish the scaling criteria using a day-and-time schedule or specific performance metrics, and the HDInsight platform does the rest.

How it works

The Autoscale feature uses two types of conditions to trigger scaling events: thresholds for various cluster performance metrics (called load-based scaling) and time-based triggers (called schedule-based scaling). Load-based scaling changes the number of nodes in your cluster, within a range that you set, to ensure optimal CPU usage and minimize running cost. Schedule-based scaling changes the number of nodes in your cluster based on operations that you associate with specific dates and times.

Choosing load-based or schedule-based scaling

Consider the following factors when choosing a scaling type:

  • Load variance: Does the cluster's load follow a consistent pattern at specific times on specific days? If not, load-based scaling is the better option.
  • SLA requirements: Autoscale is reactive rather than predictive. Will there be enough lead time between when the load starts to increase and when the cluster needs to be at its target size? If you have strict SLA requirements and the load is a fixed, known pattern, schedule-based scaling is the better option.

Cluster metrics

Autoscale continuously monitors the cluster and collects the following metrics:

| Metric | Description |
| --- | --- |
| Total Pending CPU | The total number of cores required to start execution of all pending containers. |
| Total Pending Memory | The total memory (in MB) required to start execution of all pending containers. |
| Total Free CPU | The sum of all unused cores on the active worker nodes. |
| Total Free Memory | The sum of unused memory (in MB) on the active worker nodes. |
| Used Memory per Node | The load on a worker node. A worker node on which 10 GB of memory is used is considered under more load than a worker with 2 GB of used memory. |
| Number of Application Masters per Node | The number of Application Master (AM) containers running on a worker node. A worker node hosting two AM containers is considered more important than a worker node hosting zero AM containers. |

The above metrics are checked every 60 seconds. You can set up scaling operations for your cluster using any of these metrics.

Load-based scale conditions

When the following conditions are detected, Autoscale will issue a scale request:

| Scale-up | Scale-down |
| --- | --- |
| Total pending CPU is greater than total free CPU for more than 3 minutes. | Total pending CPU is less than total free CPU for more than 10 minutes. |
| Total pending memory is greater than total free memory for more than 3 minutes. | Total pending memory is less than total free memory for more than 10 minutes. |

For scale-up, Autoscale issues a scale-up request to add the required number of nodes. The scale-up is based on how many new worker nodes are needed to meet the current CPU and memory requirements.

For scale-down, Autoscale issues a request to remove a certain number of nodes. The scale-down is based on the number of AM containers per node and the current CPU and memory requirements. The service also detects which nodes are candidates for removal based on current job execution. The scale-down operation first decommissions the nodes, and then removes them from the cluster.
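To make those rules concrete, here is a minimal, illustrative Python sketch of how they could be evaluated against the metrics collected every 60 seconds. It is not the service's actual implementation; the thresholds simply restate the table above, and how the CPU and memory conditions are combined is an assumption.

from dataclasses import dataclass
from typing import Callable, List, Optional

SAMPLE_INTERVAL_SEC = 60          # metrics are checked every 60 seconds
SCALE_UP_WINDOW_SEC = 3 * 60      # "for more than 3 minutes"
SCALE_DOWN_WINDOW_SEC = 10 * 60   # "for more than 10 minutes"

@dataclass
class MetricsSample:
    pending_cpu: int        # Total Pending CPU (cores)
    free_cpu: int           # Total Free CPU (cores)
    pending_memory_mb: int  # Total Pending Memory
    free_memory_mb: int     # Total Free Memory

def sustained(history: List[MetricsSample], window_sec: int,
              predicate: Callable[[MetricsSample], bool]) -> bool:
    """True if the predicate held for every sample in the trailing window."""
    needed = window_sec // SAMPLE_INTERVAL_SEC
    recent = history[-needed:]
    return len(recent) >= needed and all(predicate(s) for s in recent)

def evaluate(history: List[MetricsSample]) -> Optional[str]:
    """Return 'scale-up', 'scale-down', or None for samples ordered oldest to newest."""
    if (sustained(history, SCALE_UP_WINDOW_SEC, lambda s: s.pending_cpu > s.free_cpu)
            or sustained(history, SCALE_UP_WINDOW_SEC, lambda s: s.pending_memory_mb > s.free_memory_mb)):
        return "scale-up"
    if (sustained(history, SCALE_DOWN_WINDOW_SEC, lambda s: s.pending_cpu < s.free_cpu)
            and sustained(history, SCALE_DOWN_WINDOW_SEC, lambda s: s.pending_memory_mb < s.free_memory_mb)):
        return "scale-down"
    return None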

Cluster compatibility

Important

The Azure HDInsight Autoscale feature was released for general availability on November 7th, 2019 for Spark and Hadoop clusters and included improvements not available in the preview version of the feature. If you created a Spark cluster prior to November 7th, 2019 and want to use the Autoscale feature on your cluster, the recommended path is to create a new cluster, and enable Autoscale on the new cluster.

Autoscale for Interactive Query (LLAP) was released for general availability on August 27th, 2020. HBase clusters are still in preview. Autoscale is only available on Spark, Hadoop, Interactive Query, and HBase clusters.

The following table describes the cluster types and versions that are compatible with the Autoscale feature.

| Version | Spark | Hive | LLAP | HBase | Kafka | Storm | ML |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HDInsight 3.6 without ESP | Yes | Yes | Yes | Yes* | No | No | No |
| HDInsight 4.0 without ESP | Yes | Yes | Yes | Yes* | No | No | No |
| HDInsight 3.6 with ESP | Yes | Yes | Yes | Yes* | No | No | No |
| HDInsight 4.0 with ESP | Yes | Yes | Yes | Yes* | No | No | No |

* HBase clusters can only be configured for schedule-based scaling, not load-based.

Get started

Create a cluster with load-based Autoscaling

To enable the Autoscale feature with load-based scaling, complete the following steps as part of the normal cluster creation process:

  1. On the Configuration + pricing tab, select the Enable autoscale checkbox.

  2. Select Load-based under Autoscale type.

  3. Enter the intended values for the following properties:

    • Initial Number of nodes for Worker node.
    • Min number of worker nodes.
    • Max number of worker nodes.

    Screenshot: Enable worker node load-based autoscale.

The initial number of worker nodes must fall between the minimum and maximum, inclusive. This value defines the initial size of the cluster when it's created. The minimum number of worker nodes should be set to three or more. Scaling your cluster to fewer than three nodes can result in it getting stuck in safe mode because of insufficient file replication. For more information, see Getting stuck in safe mode.
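As a quick illustration of these constraints, a hypothetical pre-flight check (the parameter names are ours, not the portal's field names) might look like this:

def validate_load_based_autoscale(initial: int, minimum: int, maximum: int) -> None:
    """Sanity-check the worker node counts before creating the cluster."""
    if minimum < 3:
        raise ValueError("Set the minimum number of worker nodes to 3 or more "
                         "to avoid getting stuck in safe mode.")
    if not (minimum <= initial <= maximum):
        raise ValueError("The initial number of worker nodes must fall between "
                         "the minimum and maximum, inclusive.")

validate_load_based_autoscale(initial=4, minimum=3, maximum=10)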

Create a cluster with schedule-based Autoscaling

To enable the Autoscale feature with schedule-based scaling, complete the following steps as part of the normal cluster creation process:

  1. On the Configuration + pricing tab, check the Enable autoscale checkbox.

  2. Enter the Number of nodes for Worker node, which controls the limit for scaling up the cluster.

  3. Select the option Schedule-based under Autoscale type.

  4. Select Configure to open the Autoscale configuration window.

  5. Select your time zone, and then select + Add condition.

  6. Select the days of the week that the new condition should apply to.

  7. Edit the time the condition should take effect and the number of nodes that the cluster should be scaled to.

  8. Add more conditions if needed.

    Screenshot: Enable worker node schedule-based autoscale during creation.

The number of nodes must be between 3 and the maximum number of worker nodes that you entered before adding conditions.

Final creation steps

Select the VM type for worker nodes by selecting a VM from the drop-down list under Node size. After you choose the VM type for each node type, you can see the estimated cost range for the whole cluster. Adjust the VM types to fit your budget.

Screenshot: Worker node size selection with schedule-based autoscale enabled.

Your subscription has a capacity quota for each region. The total number of cores across your head nodes and your maximum number of worker nodes can't exceed the capacity quota. However, this quota is a soft limit; you can always create a support ticket to get it increased easily.

Note

If you exceed the total core quota limit, you will receive an error message saying 'the maximum node exceeded the available cores in this region, please choose another region or contact the support to increase the quota.'
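The sketch below shows the arithmetic behind that check; the node counts, core sizes, and quota are illustrative examples rather than values read from your subscription.

def cores_within_quota(head_node_count: int, head_node_cores: int,
                       max_worker_nodes: int, worker_node_cores: int,
                       regional_core_quota: int) -> bool:
    """Return True if the cluster can reach its maximum size without exceeding the regional core quota."""
    required = head_node_count * head_node_cores + max_worker_nodes * worker_node_cores
    return required <= regional_core_quota

# Example: 2 head nodes with 4 cores each, and up to 10 workers with 8 cores each (for instance, D13_v2).
print(cores_within_quota(2, 4, 10, 8, regional_core_quota=100))  # True: 88 cores needed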

For more information on HDInsight cluster creation using the Azure portal, see Create Linux-based clusters in HDInsight using the Azure portal.

Create a cluster with a Resource Manager template

Load-based autoscaling

You can create an HDInsight cluster with load-based Autoscaling using an Azure Resource Manager template, by adding an autoscale node to the computeProfile > workernode section with the properties minInstanceCount and maxInstanceCount, as shown in the JSON snippet below. For a complete Resource Manager template, see Quickstart template: Deploy Spark Cluster with Loadbased Autoscale Enabled.

{
  "name": "workernode",
  "targetInstanceCount": 4,
  "autoscale": {
      "capacity": {
          "minInstanceCount": 3,
          "maxInstanceCount": 10
      }
  },
  "hardwareProfile": {
      "vmSize": "Standard_D13_V2"
  },
  "osProfile": {
      "linuxOperatingSystemProfile": {
          "username": "[parameters('sshUserName')]",
          "password": "[parameters('sshPassword')]"
      }
  },
  "virtualNetworkProfile": null,
  "scriptActions": []
}
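If you build or patch templates programmatically, a small hedged sketch like the following could graft the same autoscale block onto a workernode role taken from an existing template. It is plain dictionary manipulation only and does not call any Azure SDK.

import json

def add_load_based_autoscale(workernode_role: dict,
                             min_instances: int, max_instances: int) -> dict:
    """Add (or overwrite) the autoscale capacity block on a workernode role definition."""
    workernode_role["autoscale"] = {
        "capacity": {
            "minInstanceCount": min_instances,
            "maxInstanceCount": max_instances,
        }
    }
    return workernode_role

role = {"name": "workernode", "targetInstanceCount": 4,
        "hardwareProfile": {"vmSize": "Standard_D13_V2"}}
print(json.dumps(add_load_based_autoscale(role, 3, 10), indent=2))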

Schedule-based autoscaling

You can create an HDInsight cluster with schedule-based Autoscaling using an Azure Resource Manager template, by adding an autoscale node to the computeProfile > workernode section. The autoscale node contains a recurrence that has a timezone and schedule that describes when the change will take place. For a complete Resource Manager template, see Deploy Spark Cluster with schedule-based Autoscale Enabled.

{
  "autoscale": {
    "recurrence": {
      "timeZone": "Pacific Standard Time",
      "schedule": [
        {
          "days": [
            "Monday",
            "Tuesday",
            "Wednesday",
            "Thursday",
            "Friday"
          ],
          "timeAndCapacity": {
            "time": "11:00",
            "minInstanceCount": 10,
            "maxInstanceCount": 10
          }
        }
      ]
    }
  },
  "name": "workernode",
  "targetInstanceCount": 4
}

Enable and disable Autoscale for a running cluster

Using the Azure portal

To enable Autoscale on a running cluster, select Cluster size under Settings. Then select Enable autoscale. Select the type of Autoscale that you want and enter the options for load-based or schedule-based scaling. Finally, select Save.

Screenshot: Enable schedule-based autoscale on a running cluster.

Using the REST API

To enable or disable Autoscale on a running cluster using the REST API, make a POST request to the Autoscale endpoint:

https://management.azure.com/subscriptions/{subscription Id}/resourceGroups/{resourceGroup Name}/providers/Microsoft.HDInsight/clusters/{CLUSTERNAME}/roles/workernode/autoscale?api-version=2018-06-01-preview

Use the appropriate parameters in the request payload. The JSON payload below could be used to enable Autoscale. Use the payload {autoscale: null} to disable Autoscale.

{ "autoscale": { "capacity": { "minInstanceCount": 3, "maxInstanceCount": 5 } } }

See the previous section on enabling load-based autoscale for a full description of all payload parameters.
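For illustration, the same request can be sent from Python with the requests library. The subscription, resource group, cluster name, and bearer token below are placeholders, and acquiring the Azure AD token (for example, through the Azure CLI or MSAL) is assumed to be handled elsewhere.

import requests

SUBSCRIPTION_ID = "<subscription-id>"   # placeholders -- substitute your own values
RESOURCE_GROUP = "<resource-group>"
CLUSTER_NAME = "<cluster-name>"
TOKEN = "<bearer-token>"                # an Azure AD access token for management.azure.com

url = (
    "https://management.azure.com"
    f"/subscriptions/{SUBSCRIPTION_ID}/resourceGroups/{RESOURCE_GROUP}"
    f"/providers/Microsoft.HDInsight/clusters/{CLUSTER_NAME}"
    "/roles/workernode/autoscale?api-version=2018-06-01-preview"
)
headers = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}

# Enable load-based Autoscale with a range of 3 to 5 worker nodes.
enable_payload = {"autoscale": {"capacity": {"minInstanceCount": 3, "maxInstanceCount": 5}}}
requests.post(url, headers=headers, json=enable_payload).raise_for_status()

# To disable Autoscale instead, send {"autoscale": null}:
# requests.post(url, headers=headers, json={"autoscale": None}).raise_for_status()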

Monitoring Autoscale activities

Cluster status

The cluster status listed in the Azure portal can help you monitor Autoscale activities.

Screenshot: Cluster status with load-based autoscale enabled.

All of the cluster status messages that you might see are explained in the table below.

| Cluster status | Description |
| --- | --- |
| Running | The cluster is operating normally. All of the previous Autoscale activities have completed successfully. |
| Updating | The cluster Autoscale configuration is being updated. |
| HDInsight configuration | A cluster scale-up or scale-down operation is in progress. |
| Updating Error | HDInsight encountered issues during the Autoscale configuration update. Customers can choose to either retry the update or disable Autoscale. |
| Error | Something is wrong with the cluster, and it isn't usable. Delete this cluster and create a new one. |

To view the current number of nodes in your cluster, go to the Cluster size chart on the Overview page for your cluster. Or select Cluster size under Settings.

Operation history

You can view the cluster scale-up and scale-down history as part of the cluster metrics. You can also list all scaling actions over the past day, week, or other period of time.

Select Metrics under Monitoring. Then select Add metric and Number of Active Workers from the Metric dropdown box. Select the button in the upper right to change the time range.

Screenshot: Worker node schedule-based autoscale metrics.
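You can also pull the same history programmatically through the Azure Monitor metrics REST API. The sketch below is illustrative; in particular, the metric name NumActiveWorkers is an assumption based on the portal's "Number of Active Workers" entry, so confirm the exact name in the Metrics dropdown for your cluster.

import requests

SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
CLUSTER_NAME = "<cluster-name>"
TOKEN = "<bearer-token>"

resource_id = (f"/subscriptions/{SUBSCRIPTION_ID}/resourceGroups/{RESOURCE_GROUP}"
               f"/providers/Microsoft.HDInsight/clusters/{CLUSTER_NAME}")
url = f"https://management.azure.com{resource_id}/providers/Microsoft.Insights/metrics"
params = {
    "api-version": "2018-01-01",
    "metricnames": "NumActiveWorkers",  # assumed metric name -- check the portal's Metrics dropdown
    "timespan": "2021-01-01T00:00:00Z/2021-01-02T00:00:00Z",
    "interval": "PT1H",
    "aggregation": "Maximum",
}
response = requests.get(url, headers={"Authorization": f"Bearer {TOKEN}"}, params=params)
response.raise_for_status()
for series in response.json()["value"][0]["timeseries"]:
    for point in series["data"]:
        print(point["timeStamp"], point.get("maximum"))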

Best practices

Consider the latency of scale up and scale down operations

It can take 10 to 20 minutes for a scaling operation to complete. When setting up a customized schedule, plan for this delay. For example, if you need the cluster size to be 20 at 9:00 AM, set the schedule trigger to an earlier time such as 8:30 AM so that the scaling operation has completed by 9:00 AM.
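As a trivial example of building that buffer into a schedule, assuming a 30-minute allowance:

from datetime import datetime, timedelta

def schedule_trigger_time(target_ready: str, buffer_minutes: int = 30) -> str:
    """Given the time the cluster must be at full size (HH:MM), return the earlier time to put in the Autoscale schedule."""
    ready = datetime.strptime(target_ready, "%H:%M")
    return (ready - timedelta(minutes=buffer_minutes)).strftime("%H:%M")

print(schedule_trigger_time("09:00"))  # -> 08:30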

Prepare for scaling down

During the cluster scale-down process, Autoscale decommissions nodes to meet the target size. If tasks are running on those nodes, Autoscale waits until the tasks are completed. Since each worker node also serves a role in HDFS, the temporary data is shifted to the remaining nodes. Make sure there's enough space on the remaining nodes to host all temporary data.

Running jobs will continue. Pending jobs will wait to be scheduled on the smaller pool of available worker nodes.

Be aware of the minimum cluster size

Don't scale your cluster down to fewer than three nodes. Scaling your cluster to fewer than three nodes can result in it getting stuck in safe mode because of insufficient file replication. For more information, see Getting stuck in safe mode.

Increase the number of mappers and reducers

Autoscale for Hadoop clusters also monitors HDFS usage. If HDFS is busy, Autoscale assumes the cluster still needs its current resources. When a query involves massive amounts of data, you can increase the number of mappers and reducers to increase parallelism and accelerate HDFS operations. That way, a proper scale-down is triggered when there are extra resources.

Set the Hive configuration Maximum Total Concurrent Queries for the peak usage scenario

Autoscale events don't change the Hive configuration Maximum Total Concurrent Queries in Ambari. This means that the Hive Server 2 Interactive Service can handle only the given number of concurrent queries at any point in time, even if the LLAP daemon count is scaled up and down based on load and schedule. The general recommendation is to set this configuration for the peak usage scenario to avoid manual intervention.

However, you may experience a Hive Server 2 restart failure if there are only a small number of worker nodes and the value for Maximum Total Concurrent Queries is configured too high. At a minimum, you need enough worker nodes to accommodate the given number of Tez AMs (equal to the Maximum Total Concurrent Queries configuration).
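A rough, back-of-the-envelope check for that minimum is sketched below. The Tez AM container size and usable YARN memory per node are purely illustrative; read the real values from your own Hive and YARN configuration.

import math

def min_worker_nodes_for_tez_ams(max_concurrent_queries: int,
                                 tez_am_container_mb: int,
                                 usable_mb_per_node: int) -> int:
    """Lower bound on worker nodes needed just to host one Tez AM container per concurrent query."""
    ams_per_node = max(1, usable_mb_per_node // tez_am_container_mb)
    return math.ceil(max_concurrent_queries / ams_per_node)

# Example: 20 concurrent queries, 4 GB Tez AM containers, ~24 GB usable YARN memory per node.
print(min_worker_nodes_for_tez_ams(20, 4096, 24576))  # -> 4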

Limitations

Node label file missing

HDInsight Autoscale uses a node label file to determine whether a node is ready to execute tasks. The node label file is stored on HDFS with three replicas. If the cluster size is dramatically scaled down and there is a large amount of temporary data, there is a small chance that all three replicas could be dropped. If this happens, the cluster enters an error state.

LLAP Daemons count

For Autoscale-enabled LLAP clusters, a scale-up or scale-down event also scales the number of LLAP daemons up or down to match the number of active worker nodes. The change in the number of daemons is not persisted in the num_llap_nodes configuration in Ambari. If Hive services are restarted manually, the number of LLAP daemons is reset as per the configuration in Ambari.

If the LLAP service is manually restarted, you need to manually change the num_llap_node configuration (the number of node(s) needed to run the Hive LLAP daemon) under Advanced hive-interactive-env to match the current active worker node count.

Next steps

Read about guidelines for scaling clusters manually in Scaling guidelines.