This article lists guidelines for using Apache Spark on Azure HDInsight, grouped by task.
Option | Documents |
---|---|
Azure Toolkit for IntelliJ | Failure spark job debugging with Azure Toolkit for IntelliJ (preview) |
Azure Toolkit for IntelliJ through SSH | Debug Apache Spark applications locally or remotely on an HDInsight cluster with Azure Toolkit for IntelliJ through SSH |
Azure Toolkit for IntelliJ through VPN | Use Azure Toolkit for IntelliJ to debug Apache Spark applications remotely in HDInsight through VPN |
Job graph on Apache Spark History Server | Use extended Apache Spark History Server to debug and diagnose Apache Spark applications |
Option | Documents |
---|---|
IO Cache | Improve performance of Apache Spark workloads using Azure HDInsight IO Cache (Preview) |
Configuration options | Optimize Apache Spark jobs |
Option | Documents |
---|---|
Apache Hive on HDInsight | Integrate Apache Spark and Apache Hive with the Hive Warehouse Connector |
Apache HBase on HDInsight | Use Apache Spark to read and write Apache HBase data |
Apache Kafka on HDInsight | Tutorial: Use Apache Spark Structured Streaming with Apache Kafka on HDInsight |
Azure Cosmos DB | Azure Synapse Link for Azure Cosmos DB |
Option | Documents |
---|---|
Azure Data Lake Storage Gen2 | Use Azure Data Lake Storage Gen2 with Azure HDInsight clusters |
Azure Blob Storage | Use Azure storage with Azure HDInsight clusters |