Troubleshoot HTTP "502 Bad Gateway" and "503 Service Unavailable" errors in Azure App Service

"502 Bad Gateway" and "503 Service Unavailable" are common errors for apps hosted in Azure App Service. This article helps you troubleshoot these errors.

If you need more help at any point in this article, you can contact the Azure experts on the MSDN Azure and Stack Overflow forums. Alternatively, you can file an Azure support incident: go to the Azure Support site and click Get Support.

Symptom

When you browse to the app, it returns an HTTP "502 Bad Gateway" error or an HTTP "503 Service Unavailable" error.
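
To confirm the symptom outside a browser, a minimal Python sketch such as the following could request the app once and report whether the response is one of the two errors covered here. The function names `probe` and `describe_status` are illustrative, not part of any Azure tooling, and the URL you pass is your own app's address.

```python
import sys
import urllib.error
import urllib.request

# The two status codes this article deals with.
GATEWAY_ERRORS = {502: "Bad Gateway", 503: "Service Unavailable"}


def describe_status(status):
    """Return a label for a 502 or 503 status, or None for anything else."""
    return GATEWAY_ERRORS.get(status)


def probe(url, timeout=10.0):
    """Request the URL once and return the HTTP status code.

    Connection-level failures (DNS, refused connection, timeout) are not
    handled here and propagate to the caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is on the exception.
        return err.code


if __name__ == "__main__" and len(sys.argv) > 1:
    status = probe(sys.argv[1])
    print(status, describe_status(status) or "not a 502/503 response")
```

Run it with the app URL as the only argument. Repeating the probe over a few minutes helps distinguish a one-time failure from a persistent one.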

Cause

This problem is often caused by application-level issues, such as:

  • Requests taking a long time
  • The application using high memory or CPU
  • The application crashing due to an exception

Troubleshooting steps to solve "502 Bad Gateway" and "503 Service Unavailable" errors

Troubleshooting can be divided into three distinct tasks, in sequential order:

  1. Observe and monitor application behavior
  2. Collect data
  3. Mitigate the issue

App Service gives you various options at each step.

1. Observe and monitor application behavior

Track service health

Azure publishes a notice each time there is a service interruption or performance degradation. You can track the health of the service in the Azure portal.

Monitor your app

This option enables you to find out if your application is having any issues. In your app's blade, click the Requests and errors tile. The Metric blade then shows all the metrics you can add.

Some of the metrics that you might want to monitor for your app are:

  • Average memory working set
  • Average response time
  • CPU time
  • Memory working set
  • Requests
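
As a rough illustration of how these metrics relate to the errors, the sketch below summarizes a set of request samples into a request count, an average response time, a slow-request count, and a 502/503 error rate. The sample format (response time in seconds paired with an HTTP status) and the 5-second threshold are assumptions for illustration. This is not the Azure metrics API, and the portal computes its own values.

```python
from statistics import mean


def summarize(samples, slow_threshold_s=5.0):
    """Summarize (response_time_seconds, http_status) samples.

    The threshold for a "slow" request is an assumed value; tune it to
    what is normal for your app.
    """
    times = [t for t, _ in samples]
    gateway_errors = [s for _, s in samples if s in (502, 503)]
    return {
        "requests": len(samples),
        "average_response_time_s": mean(times) if times else 0.0,
        "slow_requests": sum(1 for t in times if t >= slow_threshold_s),
        "gateway_error_rate": len(gateway_errors) / len(samples) if samples else 0.0,
    }


if __name__ == "__main__":
    demo = [(0.5, 200), (6.0, 502), (0.5, 200), (7.0, 503)]
    print(summarize(demo))
```

A rising average response time together with a growing share of 502/503 responses points toward the long-running-request and resource-pressure causes listed earlier.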

For more information, see:

2. Collect data

Use the Kudu Debug Console

App Service comes with a debug console that you can use for debugging, exploring, and uploading files, as well as JSON endpoints for getting information about your environment. This is called the Kudu Console or the SCM Dashboard for your app.

You can access this dashboard at https://<Your app name>.scm.chinacloudsites.cn/.
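
The dashboard address follows directly from the app name, so it can be built in a script. The helper below is a small sketch: `kudu_url` and its optional `path` argument are hypothetical names, and the only fact it relies on is the host pattern given above.

```python
def kudu_url(app_name, path=""):
    """Build the Kudu (SCM) dashboard URL for an app.

    `path` is an optional relative path under the dashboard; it is left
    to the caller because this article does not list specific endpoints.
    """
    name = app_name.strip().lower()
    if not name:
        raise ValueError("app_name must not be empty")
    return f"https://{name}.scm.chinacloudsites.cn/{path.lstrip('/')}"


if __name__ == "__main__":
    print(kudu_url("myapp"))
```

The dashboard requires you to be signed in with credentials that have access to the app, so opening the URL in a browser is usually the simplest route.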

Some of the things that Kudu provides are:

  • Environment settings for your application
  • Log stream
  • Diagnostic dump
  • Debug console in which you can run PowerShell cmdlets and basic DOS commands

Another useful feature of Kudu is that, if your application is throwing first-chance exceptions, you can use Kudu and the Sysinternals tool ProcDump to create memory dumps. These memory dumps are snapshots of the process and can often help you troubleshoot more complicated issues with your app.

For more information on features available in Kudu, see Azure Websites online tools you should know about.

3. Mitigate the issue

Scale the app

In Azure App Service, you can adjust the scale at which you run your application to increase performance and throughput. Scaling up an app involves two related actions: changing your App Service plan to a higher pricing tier, and configuring certain settings after you have switched to the higher pricing tier.

For more information on scaling, see Scale an app in Azure App Service.

Additionally, you can choose to run your application on more than one instance. This not only provides more processing capability, but also gives you some amount of fault tolerance. If the process goes down on one instance, the other instances continue serving requests.

You can set the scaling to be Manual or Automatic.

Use AutoHeal

AutoHeal recycles the worker process for your app based on settings you choose (such as configuration changes, request counts, memory-based limits, or the time needed to execute a request). Most of the time, recycling the process is the fastest way to recover from a problem. Although you can always restart the app directly from the Azure portal, AutoHeal does it automatically for you. All you need to do is add some triggers to the root web.config for your app. These settings work the same way even if your application is not a .NET application.
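
To give a sense of what such triggers look like, the Python sketch below generates a small XML fragment with a slow-request trigger, a memory trigger, and a recycle action. The element and attribute names (`monitoring`, `triggers`, `slowRequests`, `memory`, `privateBytesInKB`, `actions`) are recalled from the Auto-Healing article referenced in this section and should be treated as assumptions to verify there; the threshold values are placeholders, and the fragment is assumed to belong inside `system.webServer` in web.config.

```python
import xml.etree.ElementTree as ET


def hhmmss(seconds):
    """Format a number of seconds as hh:mm:ss."""
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


def build_autoheal_monitoring(slow_seconds=60, slow_count=20,
                              window_seconds=120, private_bytes_kb=800000):
    """Build an assumed AutoHeal <monitoring> element with two triggers."""
    monitoring = ET.Element("monitoring")
    triggers = ET.SubElement(monitoring, "triggers")
    # Recycle when `slow_count` requests take longer than `slow_seconds`
    # within a window of `window_seconds`.
    ET.SubElement(triggers, "slowRequests", {
        "timeTaken": hhmmss(slow_seconds),
        "count": str(slow_count),
        "timeInterval": hhmmss(window_seconds),
    })
    # Recycle when the process exceeds this many KB of private bytes.
    ET.SubElement(triggers, "memory", {"privateBytesInKB": str(private_bytes_kb)})
    ET.SubElement(monitoring, "actions", {"value": "Recycle"})
    return monitoring


if __name__ == "__main__":
    print(ET.tostring(build_autoheal_monitoring(), encoding="unicode"))
```

Generating the fragment rather than hand-editing it makes it easier to keep thresholds consistent across several apps, but pasting an equivalent fragment into web.config by hand works just as well.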

For more information, see Auto-Healing Azure Web Sites.

Restart the app

This is often the simplest way to recover from one-time issues. In the Azure portal, your app's blade has options to stop or restart the app.

You can also manage your app using Azure PowerShell. For more information, see Using Azure PowerShell with Azure Resource Manager.