Manage disk space in Azure HDInsight

This article describes troubleshooting steps and possible resolutions for disk space issues on Azure HDInsight clusters.

Hive log configurations

  1. From a web browser, navigate to https://CLUSTERNAME.azurehdinsight.cn, where CLUSTERNAME is the name of your cluster.

  2. Navigate to Hive > Configs > Advanced > Advanced hive-log4j. Review the following settings:

    • hive.root.logger=DEBUG,RFA. This is the default value. Change the log level to INFO to print fewer log entries.

    • log4jhive.log.maxfilesize=1024MB. This is the default value; modify as desired.

    • log4jhive.log.maxbackupindex=10. This is the default value; modify as desired. If this parameter is omitted, the number of generated log files is unbounded.
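To show how the two size settings above combine, the following Python sketch estimates the worst-case disk footprint of a single rolling Hive log. It assumes log4j 1.x RollingFileAppender behavior, where one active file plus up to maxbackupindex rolled-over backups are kept. The function names are hypothetical helpers for illustration, not part of Hive or HDInsight.

```python
def parse_size_mb(value):
    """Convert a size string such as '1024MB' or '1GB' to megabytes.

    Only MB and GB suffixes are handled in this sketch.
    """
    text = value.strip().upper()
    if text.endswith("GB"):
        return int(text[:-2]) * 1024
    if text.endswith("MB"):
        return int(text[:-2])
    raise ValueError("unsupported size value: " + value)


def max_log_footprint_mb(max_file_size, max_backup_index):
    """Worst-case space used by one rolling log, in megabytes.

    Assumption: one active file plus max_backup_index backups,
    each of which can grow to max_file_size before rolling over.
    """
    return parse_size_mb(max_file_size) * (max_backup_index + 1)


if __name__ == "__main__":
    # Default values listed above: 1024MB per file, 10 backups.
    print(max_log_footprint_mb("1024MB", 10))  # 11264
```

With the default values, one such log can therefore grow to about 11 GB on the node that writes it. Lowering either setting reduces that ceiling proportionally.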

Yarn log configurations

Review the following configurations:

  • Apache Ambari

    1. From a web browser, navigate to https://CLUSTERNAME.azurehdinsight.cn, where CLUSTERNAME is the name of your cluster.

    2. Navigate to YARN > Configs > Advanced > Resource Manager. Ensure Enable Log Aggregation is checked. If it is disabled, NodeManagers keep the logs locally and do not aggregate them in the remote store on application completion or termination.

  • Ensure that the cluster size is appropriate for the workload. The workload might have changed recently or the cluster might have been resized. Scale up the cluster to match a higher workload.

  • /mnt/resource might be filled with orphaned files (for example, after a Resource Manager restart). If necessary, manually clean /mnt/resource/hadoop/yarn/log and /mnt/resource/hadoop/yarn/local.
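Before cleaning anything manually, it can help to see how full the mount is and how much of it the YARN directories account for. The following Python sketch, run on a cluster node, is one possible way to do that using only the standard library. The function names are hypothetical, and the paths are the ones mentioned above. The script only reports sizes and deletes nothing.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (0 if it does not exist)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                # lstat does not follow symlinks, so link targets are not counted.
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # Files can vanish while a running application cleans up.
                pass
    return total


def disk_report(mount, subdirs):
    """Return percent used for the mount plus the byte size of each subdirectory."""
    usage = shutil.disk_usage(mount)
    percent = round(100 * usage.used / usage.total, 1) if usage.total else 0.0
    report = {"percent_used": percent}
    for sub in subdirs:
        report[sub] = dir_size_bytes(os.path.join(mount, sub))
    return report


if __name__ == "__main__":
    mount = "/mnt/resource"
    if os.path.isdir(mount):
        result = disk_report(mount, ("hadoop/yarn/log", "hadoop/yarn/local"))
        for key, value in result.items():
            print(key, value)
```

If the two YARN directories make up most of the used space while no applications are running, orphaned files are a likely cause.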

Next steps

If you didn't see your problem, or are unable to solve your issue, use the following channel for more support:

  • Submit a support request from the Azure portal.