Scenario: Poor performance in Apache Hive LLAP queries in Azure HDInsight
This article describes troubleshooting steps and possible resolutions for issues when using Interactive Query components in Azure HDInsight clusters.
The default cluster configurations are not sufficiently tuned for your workload. Queries in Hive LLAP are executing slower than expected.
This can happen due to a variety of reasons.
LLAP is optimized for queries that involve joins and aggregates. Point queries like the following, which filter a table on a single column value, don't perform well in an Interactive Hive cluster:
select * from table where column = "columnvalue"
To improve point query performance in Hive LLAP, set the following configurations:
hive.llap.io.enabled=false; (disable LLAP IO)
hive.optimize.index.filter=false; (disable ORC row index)
hive.exec.orc.split.strategy=BI; (to avoid recombining splits)
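As a minimal sketch, the settings above could be applied for a single session from Beeline or another Hive client before running the point query. This assumes the cluster permits session-level overrides of these properties; if it does not, they would need to be changed in the cluster's Hive configuration instead. The table name `mytable` and column name `mycolumn` are hypothetical placeholders.

```sql
-- Session-level overrides; assumes these properties may be set per session.

-- Disable LLAP IO.
SET hive.llap.io.enabled=false;

-- Disable the ORC row index filter.
SET hive.optimize.index.filter=false;

-- Use the BI split strategy to avoid recombining splits.
SET hive.exec.orc.split.strategy=BI;

-- Hypothetical point query; replace the table, column, and value with your own.
SELECT * FROM mytable WHERE mycolumn = 'columnvalue';
```

Session-level settings last only for the current connection, which makes it easier to compare query times with and without them before changing anything cluster-wide.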
You can also increase usage of the LLAP cache to improve performance with the following configuration change:
hive.fetch.task.conversion=none
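A hedged sketch of trying this setting in one session follows. In Hive, `SET` with only a property name prints that property's current value, which gives a simple way to confirm the change took effect. As an assumption, this again relies on session-level overrides being permitted, and `mytable` and `mycolumn` are hypothetical names.

```sql
-- Print the current value before changing it.
SET hive.fetch.task.conversion;

-- Turn off fetch task conversion for this session.
SET hive.fetch.task.conversion=none;

-- Print the value again to confirm the override.
SET hive.fetch.task.conversion;

-- Hypothetical point query; replace the table, column, and value with your own.
SELECT * FROM mytable WHERE mycolumn = 'columnvalue';
```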
If you didn't see your problem or are unable to solve your issue, use the following channel for more support:
- Submit a support request from the Azure portal. Select Support from the menu bar or open the Help + support hub.