Troubleshooting unresponsive Python notebooks or canceled commands

This article provides an overview of troubleshooting steps you can take if a notebook is unresponsive or cancels commands.

Check metastore connectivity

Problem

Simple commands in newly attached notebooks fail, but succeed in notebooks that were attached to the same cluster earlier.

Troubleshooting steps

  1. Check metastore connectivity. The inability to connect to the Hive metastore can cause REPL initialization to hang, making the cluster appear unresponsive.
  2. Determine whether you are using the Azure Databricks metastore or your own external metastore. If you are using an external metastore, have you changed anything recently? For example, did you upgrade the metastore version, rotate passwords or configurations, or change security group rules?
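As a first check for step 1, a plain TCP probe can show whether the metastore endpoint is reachable at all from the cluster. The following is a minimal sketch, not an official diagnostic: the function name `can_connect` is hypothetical, and the host and port you pass must come from your own metastore configuration. Because newly attached notebooks may hang, run it from a notebook that was attached earlier and still responds.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only tests network reachability (DNS, routing, security group
    rules). It does not validate credentials or the metastore schema version.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `can_connect("<your-metastore-host>", <your-metastore-port>)` returning `False` would point toward a network or security group change rather than a password or version problem. A result of `True` only rules out the network layer.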

See Metastore: tips and troubleshooting for more troubleshooting tips and solutions.

Check for conflicting libraries

Problem

Python library conflicts can result in canceled commands. The Azure Databricks support organization sees conflicts most often with versions of ipython, numpy, scipy, and pandas.
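To see which versions of these four packages are actually installed in the current Python environment, a short standard-library check can help. This is a minimal sketch assuming Python 3.8 or later (for `importlib.metadata`); the helper name `installed_versions` is hypothetical. It only reports versions and does not decide whether they conflict.

```python
from importlib import metadata


def installed_versions(packages=("ipython", "numpy", "scipy", "pandas")):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions
```

Calling `installed_versions()` in a notebook and comparing the result with the versions that ship with your cluster's runtime can show whether an installed library has replaced one of the defaults.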

Troubleshooting steps

See Cluster cancels Python command execution due to library conflict.

For more notebook troubleshooting information, see Notebooks: tips and troubleshooting.