Troubleshoot your IoT Edge device

If you experience issues running Azure IoT Edge in your environment, use this article as a guide for troubleshooting and diagnostics.

Run the 'check' command

Your first step when troubleshooting IoT Edge should be to run the check command, which performs a collection of configuration and connectivity tests for common issues. The check command is available in release 1.0.7 and later.

Note

The troubleshooting tool can't run connectivity checks if the IoT Edge device is behind a proxy server.

You can run the check command as follows, or include the --help flag to see a complete list of options:

On Linux:

sudo iotedge check

On Windows:

iotedge check

The troubleshooting tool runs many checks that are sorted into these three categories:

  • Configuration checks examine details that could prevent IoT Edge devices from connecting to the cloud, including issues with config.yaml and the container engine.
  • Connection checks verify that the IoT Edge runtime can access ports on the host device and that all the IoT Edge components can connect to the IoT Hub. This set of checks returns errors if the IoT Edge device is behind a proxy.
  • Production readiness checks look for recommended production best practices, such as the state of device certificate authority (CA) certificates and module log file configuration.

For information about each of the diagnostic checks this tool runs, including what to do if you get an error or warning, see IoT Edge troubleshoot checks.

Gather debug information with the 'support-bundle' command

When you need to gather logs from an IoT Edge device, the most convenient way is to use the support-bundle command. By default, this command collects module, IoT Edge security manager, and container engine logs, the iotedge check JSON output, and other useful debug information. It compresses them into a single file for easy sharing. The support-bundle command is available in release 1.0.9 and later.

Run the support-bundle command with the --since flag to specify how far back in time to collect logs. For example, 6h collects logs from the last six hours, 6d from the last six days, 6m from the last six minutes, and so on. Include the --help flag to see a complete list of options.

On Linux:

sudo iotedge support-bundle --since 6h

On Windows:

iotedge support-bundle --since 6h

Warning

Output from the support-bundle command can contain host, device, and module names, information logged by your modules, and so on. Keep this in mind if you share the output in a public forum.

Check your IoT Edge version

If you're running an older version of IoT Edge, upgrading may resolve your issue. The iotedge check tool checks that the IoT Edge security daemon is the latest version, but does not check the versions of the IoT Edge hub and agent modules. To check the version of the runtime modules on your device, use the commands iotedge logs edgeAgent and iotedge logs edgeHub. The version number is declared in the logs when the module starts up.
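Because module logs can be long, it may help to filter them for the startup line that declares the version. The following is a minimal sketch, not an official tool: the helper name first_version_line is hypothetical, and it assumes only that the startup line contains the word "version" in some letter case. The exact wording of that line is not specified in this article.

```shell
# Hypothetical helper: print the first line on stdin that mentions
# "version" (case-insensitive). The exact log wording is an assumption.
first_version_line() {
  grep -i -m 1 'version'
}

# Usage sketch on a Linux device (not executed here):
#   sudo iotedge logs edgeAgent 2>&1 | first_version_line
#   sudo iotedge logs edgeHub 2>&1 | first_version_line
```

If nothing is printed, the module may have been running long enough for the startup lines to fall outside the retained logs; restarting the module writes them again.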

For instructions on how to update your device, see Update the IoT Edge security daemon and runtime.

Check the status of the IoT Edge security manager and its logs

The IoT Edge security manager is responsible for operations like initializing the IoT Edge system at startup and provisioning devices. If IoT Edge isn't starting, the security manager logs may provide useful information.

On Linux:

  • View the status of the IoT Edge security manager:

    sudo systemctl status iotedge
    
  • View the logs of the IoT Edge security manager:

    sudo journalctl -u iotedge -f
    
  • View more detailed logs of the IoT Edge security manager:

    • Edit the IoT Edge daemon settings:

      sudo systemctl edit iotedge.service
      
    • Add or update the following lines:

      [Service]
      Environment=IOTEDGE_LOG=edgelet=debug
      
    • Confirm that the override was saved, then reload and restart the IoT Edge security daemon:

      sudo systemctl cat iotedge.service
      sudo systemctl daemon-reload
      sudo systemctl restart iotedge
      

On Windows:

  • View the status of the IoT Edge security manager:

    Get-Service iotedge
    
  • View the logs of the IoT Edge security manager:

    . {Invoke-WebRequest -useb aka.ms/iotedge-win} | Invoke-Expression; Get-IoTEdgeLog
    
  • View only the last 5 minutes of the IoT Edge security manager logs:

    . {Invoke-WebRequest -useb aka.ms/iotedge-win} | Invoke-Expression; Get-IoTEdgeLog -StartTime ([datetime]::Now.AddMinutes(-5))
    
  • View more detailed logs of the IoT Edge security manager:

    • Add a system-level environment variable:

      [Environment]::SetEnvironmentVariable("IOTEDGE_LOG", "debug", [EnvironmentVariableTarget]::Machine)
      
    • Restart the IoT Edge security daemon:

      Restart-Service iotedge
      

If the IoT Edge security manager is not running, verify your yaml configuration file

Warning

YAML files cannot contain tabs as indentation. Use 2 spaces instead. Top-level elements should have no leading spaces.

On Linux:

sudo nano /etc/iotedge/config.yaml

On Windows:

notepad C:\ProgramData\iotedge\config.yaml
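Tab characters are hard to spot in an editor, so a quick scan can help on Linux. The following is a minimal sketch under stated assumptions: the function name find_tab_indent is hypothetical, and it only detects tabs in leading whitespace. It does not validate the YAML in any other way.

```shell
# Hypothetical helper: list lines whose leading whitespace contains a tab.
# Prints each offending line with its line number and returns 1 if any
# are found; returns 0 when the indentation is tab-free.
find_tab_indent() {
  file="$1"
  tab=$(printf '\t')
  # Match lines that start with zero or more spaces followed by a tab.
  if grep -n "^ *${tab}" "$file"; then
    return 1
  fi
  return 0
}

# Usage sketch (not executed here):
#   find_tab_indent /etc/iotedge/config.yaml && echo "no tab indentation found"
```

Reading the file may require sudo, depending on its permissions.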

Restart the IoT Edge security manager

If the issue persists, try restarting the IoT Edge security manager.

On Linux:

sudo systemctl restart iotedge

On Windows:

Stop-Service iotedge -NoWait
sleep 5
Start-Service iotedge

Check container logs for issues

Once the IoT Edge security daemon is running, look at the logs of the containers to detect issues. Start with your deployed containers, then look at the containers that make up the IoT Edge runtime: edgeAgent and edgeHub. The IoT Edge agent logs typically provide information on the lifecycle of each container. The IoT Edge hub logs provide information on messaging and routing.

iotedge logs <container name>
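When several modules are deployed, collecting the logs one at a time gets tedious. The following is a rough sketch that saves one log file per module. It assumes that iotedge list prints a header row followed by one row per module with the module name in the first column; the helper names module_names and save_all_logs are hypothetical.

```shell
# Hypothetical helper: read `iotedge list` output on stdin and print the
# first column of every non-empty row after the header (assumed layout).
module_names() {
  awk 'NR > 1 && NF > 0 { print $1 }'
}

# Hypothetical helper: write each module's logs to <module>.log in the
# given directory (defaults to the current directory).
save_all_logs() {
  outdir="${1:-.}"
  for m in $(sudo iotedge list | module_names); do
    sudo iotedge logs "$m" > "${outdir}/${m}.log" 2>&1
  done
}

# Usage sketch on a Linux device (not executed here):
#   save_all_logs /tmp
```

For sharing logs with others, the support-bundle command described earlier is usually the more convenient option.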

View the messages going through the IoT Edge hub

You can view the messages going through the IoT Edge hub and gather insights from the verbose logs of the runtime containers. To turn on verbose logs on these containers, set RuntimeLogLevel in your yaml configuration file. To open the file:

On Linux:

sudo nano /etc/iotedge/config.yaml

On Windows:

notepad C:\ProgramData\iotedge\config.yaml

By default, the agent element looks like the following example:

agent:
  name: edgeAgent
  type: docker
  env: {}
  config:
    image: mcr.microsoft.com/azureiotedge-agent:1.0
    auth: {}

Replace env: {} with:

env:
  RuntimeLogLevel: debug

Warning

YAML files cannot contain tabs as indentation. Use 2 spaces instead. Top-level items cannot have leading whitespace.

Save the file and restart the IoT Edge security manager.

You can also check the messages being sent between IoT Hub and the IoT Edge devices. View these messages by using the Azure IoT Hub extension for Visual Studio Code. For more information, see Handy tool when you develop with Azure IoT.

Restart containers

After investigating the logs and messages for information, you can try restarting containers:

iotedge restart <container name>

Restart the IoT Edge runtime containers:

iotedge restart edgeAgent && iotedge restart edgeHub

Check your firewall and port configuration rules

Azure IoT Edge allows communication from an on-premises server to the Azure cloud using supported IoT Hub protocols; see choosing a communication protocol. For enhanced security, communication channels between Azure IoT Edge and Azure IoT Hub are always configured to be Outbound. This configuration is based on the Services Assisted Communication pattern, which minimizes the attack surface for a malicious entity to explore. Inbound communication is only required for specific scenarios where Azure IoT Hub needs to push messages to the Azure IoT Edge device. Cloud-to-device messages are protected using secure TLS channels and can be further secured using X.509 certificates and TPM device modules. The Azure IoT Edge Security Manager governs how this communication can be established; see IoT Edge Security Manager.

While IoT Edge provides enhanced configuration for securing the Azure IoT Edge runtime and deployed modules, it still depends on the underlying machine and network configuration. Hence, it is imperative to ensure that proper network and firewall rules are set up for secure edge-to-cloud communication. Use the following table as a guideline when configuring firewall rules for the underlying servers where the Azure IoT Edge runtime is hosted:

Protocol Port Incoming Outgoing Guidance
MQTT 8883 BLOCKED (Default) BLOCKED (Default)
  • Configure Outgoing (Outbound) to be Open when using MQTT as the communication protocol.
  • Port 1883 for MQTT is not supported by IoT Edge.
  • Incoming (Inbound) connections should be blocked.
AMQP 5671 BLOCKED (Default) OPEN (Default)
  • Default communication protocol for IoT Edge.
  • Must be configured to be Open if Azure IoT Edge is not configured for other supported protocols or AMQP is the desired communication protocol.
  • Port 5672 for AMQP is not supported by IoT Edge.
  • Block this port when Azure IoT Edge uses a different IoT Hub supported protocol.
  • Incoming (Inbound) connections should be blocked.
HTTPS 443 BLOCKED (Default) OPEN (Default)
  • Configure Outgoing (Outbound) to be Open on 443 for IoT Edge provisioning. This configuration is required when using manual scripts or Azure IoT Device Provisioning Service (DPS).
  • Incoming (Inbound) connections should be Open only for specific scenarios:
    • If you have a transparent gateway with leaf devices that may send method requests. In this case, port 443 does not need to be open to external networks to connect to IoT Hub or provide IoT Hub services through Azure IoT Edge. Thus the incoming rule can be restricted to open Incoming (Inbound) only from the internal network.
    • For cloud-to-device (C2D) scenarios.
  • Port 80 for HTTP is not supported by IoT Edge.
  • If non-HTTP protocols (for example, AMQP or MQTT) cannot be configured in the enterprise, the messages can be sent over WebSockets. Port 443 is used for WebSocket communication in that case.
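
To see whether the outbound ports in the table are reachable from the device, a simple TCP probe can serve as a first check. The following is a minimal sketch under stated assumptions: the function name check_port is hypothetical, it relies on bash's /dev/tcp feature and the coreutils timeout command, and the hub hostname in the usage comment is a placeholder that does not come from this article. It only tests that a TCP connection opens. It does not verify TLS or authentication, and it does not work through a proxy.

```shell
# Hypothetical helper: return 0 if a TCP connection to host:port opens
# within 5 seconds, non-zero otherwise. Requires bash and timeout.
check_port() {
  host="$1"
  port="$2"
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Usage sketch (placeholder hostname, not executed here):
#   for p in 8883 5671 443; do
#     if check_port "my-hub.azure-devices.net" "$p"; then
#       echo "port $p reachable"
#     else
#       echo "port $p blocked or unreachable"
#     fi
#   done
```

A failed probe on 8883 or 5671 with a successful probe on 443 suggests that the WebSocket fallback described in the table is the only usable path.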

Next steps

Do you think that you found a bug in the IoT Edge platform? Submit an issue so that we can continue to improve.

If you have more questions, create a support request for help.