Scenario: Apache Hive logs are filling up the disk space on the head nodes in Azure HDInsight

This article describes troubleshooting steps and possible resolutions for problems related to insufficient disk space on the head nodes in Azure HDInsight clusters.

Issue

On an Apache Hive/LLAP cluster, unwanted logs are taking up the entire disk space on the head nodes. This condition can cause the following problems:

  • SSH access fails because no space is left on the head node.
  • Ambari throws HTTP ERROR: 503 Service Unavailable.
  • HiveServer2 Interactive fails to restart.

The ambari-agent logs include the following entries when the problem occurs:

ambari_agent - Controller.py - [54697] - Controller - ERROR - Error:[Errno 28] No space left on device
ambari_agent - HostCheckReportFileHandler.py - [54697] - ambari_agent.HostCheckReportFileHandler - ERROR - Can't write host check file at /var/lib/ambari-agent/data/hostcheck.result
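
Before changing any configuration, it can help to confirm how much space is left and whether the log directory is the culprit. The following is a minimal Python sketch, assuming you still have script access to the head node (the path is illustrative):

```python
import shutil

def report_disk_usage(path="/"):
    """Print total, used, and free space for the filesystem containing path."""
    usage = shutil.disk_usage(path)
    pct_used = usage.used / usage.total * 100
    print(f"{path}: {usage.free // 2**30} GiB free ({pct_used:.1f}% used)")
    return usage

# Check the root filesystem, where /var/log typically lives.
report_disk_usage("/")
```

On HDInsight head nodes, the Hive logs are typically written under the directory named by hive.log.dir (often /var/log/hive, though that location is an assumption here); comparing that directory's size against the free space above shows whether the logs are responsible.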

Cause

In the advanced Hive log4j configuration, the current default deletion schedule is to delete files older than 30 days, based on the last-modified date.

Resolution

  1. Go to the Hive component summary in the Ambari portal and select the Configs tab.

  2. Go to the Advanced hive-log4j section in Advanced settings.

  3. Set the appender.RFA.strategy.action.condition.age parameter to an age of your choice. This example sets the age to 14 days: appender.RFA.strategy.action.condition.age = 14D

  4. If you don't see any related settings, append these settings:

    # automatically delete hive log
    appender.RFA.strategy.action.type = Delete
    appender.RFA.strategy.action.basePath = ${sys:hive.log.dir}
    appender.RFA.strategy.action.condition.type = IfLastModified
    appender.RFA.strategy.action.condition.age = 30D
    appender.RFA.strategy.action.PathConditions.type = IfFileName
    appender.RFA.strategy.action.PathConditions.regex = hive*.*log.*
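
The Delete action above removes files whose last-modified time exceeds the configured age and whose names match the regex. Its effect can be sketched in Python (a simplified illustration of what log4j's IfLastModified and IfFileName conditions do, not how log4j implements them; the directory argument is an assumption):

```python
import os
import re
import time

def prune_old_logs(log_dir, max_age_days=30, pattern=r"hive*.*log.*"):
    """Delete files in log_dir that match pattern and whose last-modified
    time is older than max_age_days, mirroring the log4j Delete action."""
    cutoff = time.time() - max_age_days * 86400
    deleted = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if (os.path.isfile(path)
                and re.fullmatch(pattern, name)
                and os.path.getmtime(path) < cutoff):
            os.remove(path)
            deleted.append(name)
    return deleted

# For example: prune_old_logs("/var/log/hive", max_age_days=14)
```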
    
  5. Set hive.root.logger to INFO,RFA, as shown in the following example. The default setting is DEBUG, which makes the logs large.

    # Define some default values that can be overridden by system properties
    hive.log.threshold=ALL
    hive.root.logger=INFO,RFA
    hive.log.dir=${java.io.tmpdir}/${user.name}
    hive.log.file=hive.log
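
Raising the root logger from DEBUG to INFO suppresses the highest-volume messages while keeping operational ones. The behavior is the same as in any log4j-style framework; a quick analogy using Python's logging module:

```python
import logging

records = []

class ListHandler(logging.Handler):
    """Collect emitted log messages so the level filtering is visible."""
    def emit(self, record):
        records.append(record.getMessage())

logger = logging.getLogger("hive.analogy")
logger.addHandler(ListHandler())

logger.setLevel(logging.DEBUG)
logger.debug("verbose internal state")   # emitted at DEBUG level

logger.setLevel(logging.INFO)
logger.debug("verbose internal state")   # dropped at INFO level
logger.info("operational message")       # still emitted
```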
    
  6. Save the configurations and restart the required components.

Next steps

If you didn't see your problem or are unable to solve your issue, visit the following channel for more support:

  • If you need more help, you can submit a support request from the Azure portal.