Fix an Apache Hive out of memory error in Azure HDInsight

Learn how to fix an Apache Hive out of memory (OOM) error when processing large tables by configuring Hive memory settings.

Run an Apache Hive query against large tables

A customer ran a Hive query:

SELECT
    COUNT (T1.COLUMN1) as DisplayColumn1,
    …
    …
    ….
FROM
    TABLE1 T1,
    TABLE2 T2,
    TABLE3 T3,
    TABLE5 T4,
    TABLE6 T5,
    TABLE7 T6
WHERE (T1.KEY1 = T2.KEY1….
    …
    …

Some nuances of this query:

  • T1 is an alias to a large table, TABLE1, which has many STRING column types.
  • The other tables aren't that large, but they do have many columns.
  • All tables join each other, in some cases on multiple columns in TABLE1 and others.

The Hive query took 26 minutes to finish on a 24-node A3 HDInsight cluster. The customer noticed the following warning messages:

Warning: Map Join MAPJOIN[428][bigTable=?] in task 'Stage-21:MAPRED' is a cross product
Warning: Shuffle Join JOIN[8][tables = [t1933775, t1932766]] in Stage 'Stage-4:MAPRED' is a cross product
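The cross-product warnings come from the comma-separated FROM list: any pair of tables with no join predicate in the WHERE clause is implicitly cross-joined. As a sketch (the table and key names below are illustrative placeholders, not the customer's actual schema), the same query written with explicit ANSI JOIN … ON syntax makes a missing predicate an obvious gap instead of a silent cross product:

```sql
-- Illustrative rewrite with explicit join predicates; table and
-- column names are placeholders, not the customer's real schema.
SELECT
    COUNT(T1.COLUMN1) AS DisplayColumn1
FROM TABLE1 T1
JOIN TABLE2 T2 ON T1.KEY1 = T2.KEY1
JOIN TABLE3 T3 ON T2.KEY2 = T3.KEY2
-- ...one ON clause per joined table. A table added here without an
-- ON clause would have to be an explicit CROSS JOIN, rather than an
-- accidental cross product hidden in the WHERE clause.
```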

With the Apache Tez execution engine, the same query ran for 15 minutes and then threw the following error:

Status: Failed
Vertex failed, vertexName=Map 5, vertexId=vertex_1443634917922_0008_1_05, diagnostics=[Task failed, taskId=task_1443634917922_0008_1_05_000006, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space

The error remained even when using a bigger virtual machine (for example, D12).

Debug the out of memory error

Our support and engineering teams together found that one of the issues causing the out of memory error was a known issue described in the Apache JIRA:

When hive.auto.convert.join.noconditionaltask = true we check noconditionaltask.size and if the sum  of tables sizes in the map join is less than noconditionaltask.size the plan would generate a Map join, the issue with this is that the calculation doesn't take into account the overhead introduced by different HashTable implementation as results if the sum of input sizes is smaller than the noconditionaltask size by a small margin queries will hit OOM.

The hive.auto.convert.join.noconditionaltask setting in the hive-site.xml file was set to true:

<property>
    <name>hive.auto.convert.join.noconditionaltask</name>
    <value>true</value>
    <description>
            Whether Hive enables the optimization about converting common join into mapjoin based on the input file size.
            If this parameter is on, and the sum of size for n-1 of the tables/partitions for a n-way join is smaller than the
            specified size, the join is directly converted to a mapjoin (there is no conditional task).
    </description>
</property>
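When the setting above is the trigger, one session-level workaround (the threshold value below is illustrative, not a recommendation from the original investigation) is to lower the map-join size threshold so borderline tables no longer qualify, or to disable the automatic conversion entirely, at the cost of slower shuffle joins:

```sql
-- Fall back to shuffle joins for this session (a workaround, not a fix):
SET hive.auto.convert.join.noconditionaltask = false;

-- Or keep automatic map joins but shrink the threshold (in bytes) so
-- that near-limit inputs no longer convert; ~10 MB is an illustrative value:
SET hive.auto.convert.join.noconditionaltask.size = 10000000;
```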

It's likely that the map join was the cause of the Java heap space out of memory error. As explained in the blog post Hadoop Yarn memory settings in HDInsight, when the Tez execution engine is used, the heap space in use actually belongs to the Tez container. See the following image describing the Tez container memory.

Diagram of Tez container memory: Hive out of memory error

As the blog post suggests, the following two memory settings define the container memory for the heap: hive.tez.container.size and hive.tez.java.opts. In our experience, the out of memory exception does not mean the container size is too small; it means the Java heap size (hive.tez.java.opts) is too small. So whenever you see an out of memory error, try increasing hive.tez.java.opts. If needed, you might also have to increase hive.tez.container.size. The java.opts setting should be around 80% of container.size.

Note

The hive.tez.java.opts setting must always be smaller than hive.tez.container.size.

Because a D12 machine has 28 GB of memory, we decided to use a container size of 10 GB (10240 MB) and assign 80% of it to java.opts:

SET hive.tez.container.size=10240;
SET hive.tez.java.opts=-Xmx8192m;
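These SET statements apply only to the current Hive session. To make the change cluster-wide, the same values can be set in hive-site.xml (for example, through the cluster's Hive configuration page in Ambari); the fragment below is a sketch of that configuration, mirroring the session settings above:

```xml
<!-- Tez container size in MB; 10240 MB = 10 GB on a 28 GB D12 node -->
<property>
    <name>hive.tez.container.size</name>
    <value>10240</value>
</property>
<!-- Java heap for the Tez container: roughly 80% of the container size -->
<property>
    <name>hive.tez.java.opts</name>
    <value>-Xmx8192m</value>
</property>
```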

With the new settings, the query ran successfully in under 10 minutes.

Next steps

Getting an OOM error doesn't necessarily mean the container size is too small. Instead, configure the memory settings so that the heap size is increased to at least 80% of the container size. For help optimizing Hive queries, see Optimize Apache Hive queries for Apache Hadoop in HDInsight.