Debug Apache Spark jobs running on Azure HDInsight

In this article, you learn how to track and debug Apache Spark jobs running on HDInsight clusters. Debug using the Apache Hadoop YARN UI, Spark UI, and the Spark History Server. You start a Spark job using a notebook available with the Spark cluster, Machine learning: Predictive analysis on food inspection data using MLLib. You can also use the following steps to track an application submitted through any other approach, for example, spark-submit.
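If you submit through spark-submit instead of a notebook, the application name you pass is what appears in the YARN UI. A minimal sketch, assembled in Python; the script and application names here are hypothetical:

```python
# Sketch: assemble a spark-submit invocation for a YARN-managed cluster.
# The script name and --name value are hypothetical examples; the --name
# value is what you would look for in the YARN UI instead of
# "remotesparkmagics".
import subprocess

cmd = [
    "spark-submit",
    "--master", "yarn",            # hand the job to the YARN Resource Manager
    "--deploy-mode", "cluster",    # run the driver inside the cluster
    "--name", "food-inspections",  # this name shows up in the YARN UI
    "food_inspections.py",         # hypothetical application script
]

# Uncomment on a cluster head node, where spark-submit is on PATH:
# subprocess.run(cmd, check=True)
print(" ".join(cmd))
```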

If you don't have an Azure subscription, create a Trial Subscription before you begin.

Prerequisites

Track an application in the YARN UI

  1. Launch the YARN UI. Select Yarn under Cluster dashboards.

    Launch the YARN UI from the Azure portal

    Tip

    Alternatively, you can launch the YARN UI from the Ambari UI. To launch the Ambari UI, select Ambari home under Cluster dashboards. From the Ambari UI, navigate to YARN > Quick Links > the active Resource Manager > Resource Manager UI.

  2. Because you started the Spark job using Jupyter notebooks, the application has the name remotesparkmagics (the name for all applications started from the notebooks). Select the application ID against the application name to get more information about the job. This action launches the application view.

    Spark History Server - find the Spark application ID

    For applications launched from the Jupyter notebooks, the status is always RUNNING until you exit the notebook.

  3. From the application view, you can drill down further to find the containers associated with the application and the logs (stdout/stderr). You can also launch the Spark UI by clicking the link corresponding to the Tracking URL, as shown below.

    Spark History Server - download container logs
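The application lookup in the steps above can also be done against the YARN ResourceManager REST API, whose `/ws/v1/cluster/apps` endpoint returns the same application list as the UI. A sketch of filtering that response by name; the sample payload is hypothetical, but the field layout follows the Hadoop ResourceManager API:

```python
# Sketch: find application IDs by name in a YARN ResourceManager
# /ws/v1/cluster/apps response. On an HDInsight cluster you would fetch
# the JSON with urllib from the cluster endpoint (and cluster credentials);
# the sample payload below is made up to keep this runnable anywhere.
def app_ids_by_name(apps_response, name):
    """Return (id, state) pairs for applications with a matching name."""
    apps = (apps_response.get("apps") or {}).get("app") or []
    return [(a["id"], a["state"]) for a in apps if a["name"] == name]

sample = {
    "apps": {
        "app": [
            {"id": "application_1000_0001", "name": "remotesparkmagics",
             "state": "RUNNING"},
            {"id": "application_1000_0002", "name": "other-job",
             "state": "FINISHED"},
        ]
    }
}

print(app_ids_by_name(sample, "remotesparkmagics"))
# → [('application_1000_0001', 'RUNNING')]
```

On the cluster head node, the container logs from step 3 can also be pulled from the command line with `yarn logs -applicationId <application-id>` once the application has finished.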

Track an application in the Spark UI

In the Spark UI, you can drill down into the Spark jobs that are spawned by the application you started earlier.

  1. To launch the Spark UI, from the application view, select the link against the Tracking URL, as shown in the screen capture above. You can see all the Spark jobs that are launched by the application running in the Jupyter notebook.

    Spark History Server jobs tab

  2. Select the Executors tab to see processing and storage information for each executor. You can also retrieve the call stack by selecting the Thread Dump link.

    Spark History Server executors tab

  3. Select the Stages tab to see the stages associated with the application.

    Spark History Server stages tab

    Each stage can have multiple tasks for which you can view execution statistics, as shown below.

    Spark History Server stages tab details

  4. From the stage details page, you can launch DAG Visualization. Expand the DAG Visualization link at the top of the page, as shown below.

    View Spark stages DAG visualization

    The DAG (Directed Acyclic Graph) represents the different stages in the application. Each blue box in the graph represents a Spark operation invoked from the application.

  5. From the stage details page, you can also launch the application timeline view. Expand the Event Timeline link at the top of the page, as shown below.

    View Spark stages event timeline

    This image displays the Spark events in the form of a timeline. The timeline view is available at three levels: across jobs, within a job, and within a stage. The image above captures the timeline view for a given stage.

    Tip

    If you select the Enable zooming check box, you can scroll left and right across the timeline view.

  6. Other tabs in the Spark UI also provide useful information about the Spark instance.

    • Storage tab - If your application creates RDDs, you can find information about them in the Storage tab.
    • Environment tab - This tab provides useful information about your Spark instance, such as the:
      • Scala version
      • Event log directory associated with the cluster
      • Number of executor cores for the application
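The per-executor figures in the Executors tab are also exposed as JSON by Spark's monitoring REST API, under `/api/v1/applications/<app-id>/executors`. A sketch of summarizing memory use from that response; the sample records are hypothetical, but the `memoryUsed` and `maxMemory` field names follow the Spark REST API:

```python
# Sketch: compute each executor's memory utilization from a Spark
# /api/v1/applications/<app-id>/executors response. The sample records
# below are made up so the snippet runs without a cluster.
def executor_memory_use(executors):
    """Map executor id -> fraction of max memory currently used."""
    return {
        e["id"]: round(e["memoryUsed"] / e["maxMemory"], 2)
        for e in executors
        if e.get("maxMemory")  # skip executors reporting no memory limit
    }

sample = [
    {"id": "driver", "memoryUsed": 100_000_000, "maxMemory": 500_000_000},
    {"id": "1",      "memoryUsed": 250_000_000, "maxMemory": 500_000_000},
]

print(executor_memory_use(sample))
# → {'driver': 0.2, '1': 0.5}
```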

Find information about completed jobs using the Spark History Server

Once a job is completed, the information about the job is persisted in the Spark History Server.

  1. To launch the Spark History Server, from the Overview page, select Spark history server under Cluster dashboards.

    Launch the Spark History Server from the Azure portal

    Tip

    Alternatively, you can launch the Spark History Server UI from the Ambari UI. To launch the Ambari UI, from the Overview blade, select Ambari home under Cluster dashboards. From the Ambari UI, navigate to Spark2 > Quick Links > Spark2 History Server UI.

  2. You see all the completed applications listed. Select an application ID to drill down into an application for more info.

    Spark History Server completed applications
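The History Server's completed-application list is also available as JSON from its REST endpoint, `/api/v1/applications?status=completed`. A sketch of reading that response; the sample payload is hypothetical, but the `attempts`/`completed` fields follow the Spark REST API:

```python
# Sketch: list fully completed applications from a Spark History Server
# /api/v1/applications response. The sample payload is made up so the
# snippet runs without a cluster; on a real History Server each entry
# also carries an "id" and per-attempt timing fields.
def completed_app_names(apps):
    """Return names of applications whose attempts have all completed."""
    return [
        a["name"]
        for a in apps
        if a.get("attempts")
        and all(att.get("completed") for att in a["attempts"])
    ]

sample = [
    {"name": "remotesparkmagics", "attempts": [{"completed": True}]},
    {"name": "still-running",     "attempts": [{"completed": False}]},
]

print(completed_app_names(sample))
# → ['remotesparkmagics']
```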

See also