Debug Apache Spark jobs running on Azure HDInsight

In this article, you learn how to track and debug Apache Spark jobs running on HDInsight clusters using the Apache Hadoop YARN UI, the Spark UI, and the Spark History Server. You start a Spark job using a notebook available with the Spark cluster, Machine learning: Predictive analysis on food inspection data using MLLib. You can use the same steps to track an application submitted by any other means as well, for example, spark-submit.
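For reference, a job submitted with spark-submit from the cluster head node shows up in the same UIs. A minimal sketch follows; the example class and jar path are assumptions and should be adjusted for your cluster. The command is composed as a string so it can be inspected before running:

```shell
# Sketch of a spark-submit invocation on YARN (hypothetical class and jar path).
SUBMIT_CMD='spark-submit --class org.apache.spark.examples.SparkPi'
SUBMIT_CMD="$SUBMIT_CMD --master yarn --deploy-mode cluster"
SUBMIT_CMD="$SUBMIT_CMD /usr/hdp/current/spark2-client/examples/jars/spark-examples.jar 100"
echo "$SUBMIT_CMD"
# On a cluster head node, run it with:  eval "$SUBMIT_CMD"
```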

Prerequisites

You must have the following:

Track an application in the YARN UI

  1. Launch the YARN UI. Under Cluster dashboards, click Yarn.

    Launch the YARN UI

    Tip

    Alternatively, you can also launch the YARN UI from the Ambari UI. To launch the Ambari UI, click Ambari home under Cluster dashboards. From the Ambari UI, click YARN, click Quick Links, click the active Resource Manager, and then click Resource Manager UI.

  2. Because you started the Spark job using Jupyter notebooks, the application is named remotesparkmagics (the name given to all applications started from the notebooks). Click the application ID next to the application name to get more information about the job. This launches the application view.

    Find the Spark application ID

    For applications launched from Jupyter notebooks, the status is always RUNNING until you exit the notebook.
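If you prefer the command line, you can find the same application ID with the YARN CLI from an SSH session on the cluster head node. A sketch, shown as a composed command string; the grep filter assumes a notebook-launched job:

```shell
# List running YARN applications (run on a cluster head node).
# Notebook-launched applications are named remotesparkmagics.
LIST_CMD="yarn application -list -appStates RUNNING"
echo "$LIST_CMD"
# Filter the output, e.g.:  $LIST_CMD | grep remotesparkmagics
# The first column of each matching row is the application ID.
```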

  3. From the application view, you can drill down further to find the containers associated with the application and their logs (stdout/stderr). You can also launch the Spark UI by clicking the link corresponding to the Tracking URL, as shown below.

    Download container logs
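The same container logs can be pulled from the command line with the YARN CLI once log aggregation has run. A sketch; the application ID below is hypothetical:

```shell
# Retrieve aggregated container logs (stdout/stderr) for an application.
APP_ID="application_1234567890123_0001"   # hypothetical; copy yours from the YARN UI
LOGS_CMD="yarn logs -applicationId $APP_ID"
echo "$LOGS_CMD"
# Run on a cluster head node after the application finishes;
# append '> app.log' to save the combined logs to a file.
```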

Track an application in the Spark UI

In the Spark UI, you can drill down into the Spark jobs that are spawned by the application you started earlier.

  1. To launch the Spark UI, from the application view, click the link against the Tracking URL, as shown in the screen capture above. You can see all the Spark jobs launched by the application running in the Jupyter notebook.

    View Spark jobs

  2. Click the Executors tab to see processing and storage information for each executor. You can also retrieve the call stack by clicking the Thread Dump link.

    View Spark executors
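The executor summary is also available as JSON through the Spark monitoring REST API. A sketch of the endpoint URL; the cluster name and application ID are hypothetical, and the `/sparkhistory` path assumes the default HDInsight gateway layout:

```shell
# Spark monitoring REST API: executor summary for an application.
CLUSTER="mycluster"                         # hypothetical cluster name
APP_ID="application_1234567890123_0001"     # hypothetical application ID
URL="https://${CLUSTER}.azurehdinsight.net/sparkhistory/api/v1/applications/${APP_ID}/executors"
echo "$URL"
# curl -u admin "$URL"    # prompts for the HDInsight cluster login password
```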

  3. Click the Stages tab to see the stages associated with the application.

    View Spark stages

    Each stage can have multiple tasks, for which you can view execution statistics, as shown below.

    View Spark stages

  4. From the stage details page, you can launch the DAG visualization. Expand the DAG Visualization link at the top of the page, as shown below.

    View Spark stages DAG visualization

    A DAG (Directed Acyclic Graph) represents the different stages in the application. Each blue box in the graph represents a Spark operation invoked from the application.

  5. From the stage details page, you can also launch the application timeline view. Expand the Event Timeline link at the top of the page, as shown below.

    View Spark stages event timeline

    This displays the Spark events in the form of a timeline. The timeline view is available at three levels: across jobs, within a job, and within a stage. The image above captures the timeline view for a given stage.

    Tip

    If you select the Enable zooming check box, you can scroll left and right across the timeline view.

  6. Other tabs in the Spark UI also provide useful information about the Spark instance.

    • Storage tab - If your application creates RDDs, you can find information about them in the Storage tab.
    • Environment tab - This tab provides a lot of useful information about your Spark instance, such as:
      • Scala version
      • Event log directory associated with the cluster
      • Number of executor cores for the application
      • Etc.
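The Environment tab's details can also be fetched over the Spark monitoring REST API in recent Spark releases (the `environment` endpoint is an assumption for your Spark version, and the cluster name and application ID below are hypothetical):

```shell
# Spark monitoring REST API: environment details for an application.
CLUSTER="mycluster"                         # hypothetical cluster name
APP_ID="application_1234567890123_0001"     # hypothetical application ID
URL="https://${CLUSTER}.azurehdinsight.net/sparkhistory/api/v1/applications/${APP_ID}/environment"
echo "$URL"
# curl -u admin "$URL"    # prompts for the HDInsight cluster login password
```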

Find information about completed jobs using the Spark History Server

Once a job is completed, information about the job is persisted in the Spark History Server.

  1. To launch the Spark History Server, from the Overview blade, click Spark history server under Cluster dashboards.

    Launch the Spark History Server

    Tip

    Alternatively, you can also launch the Spark History Server UI from the Ambari UI. To launch the Ambari UI, from the Overview blade, click Ambari home under Cluster dashboards. From the Ambari UI, click Spark, click Quick Links, and then click Spark History Server UI.

  2. You see all the completed applications listed. Click an application ID to drill down into an application for more info.

    Launch the Spark History Server
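The same list of completed applications is exposed as JSON by the Spark History Server REST API. A sketch of the endpoint URL; the cluster name is hypothetical, and the `/sparkhistory` path assumes the default HDInsight gateway layout:

```shell
# Spark History Server REST API: list completed applications as JSON.
CLUSTER="mycluster"    # hypothetical cluster name
URL="https://${CLUSTER}.azurehdinsight.net/sparkhistory/api/v1/applications?status=completed"
echo "$URL"
# curl -u admin "$URL"    # returns one JSON record per completed application
```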

See also

For data analysts

For Spark developers