Troubleshooting guide for Azure Service Bus

This article provides troubleshooting tips and recommendations for a few issues that you may encounter when using Azure Service Bus.

Connectivity, certificate, or timeout issues

The following steps may help you troubleshoot connectivity, certificate, and timeout issues for all services under *.servicebus.chinacloudapi.cn.

  • Browse to https://<yournamespace>.servicebus.chinacloudapi.cn/ or fetch it with wget. This helps check for IP filtering, virtual network, or certificate chain issues, which are common when using the Java SDK.

    An example of a successful response:

    <feed xmlns="http://www.w3.org/2005/Atom"><title type="text">Publicly Listed Services</title><subtitle type="text">This is the list of publicly-listed services currently available.</subtitle><id>uuid:27fcd1e2-3a99-44b1-8f1e-3e92b52f0171;id=30</id><updated>2019-12-27T13:11:47Z</updated><generator>Service Bus 1.1</generator></feed>
    

    An example of a failure response:

    <Error>
        <Code>400</Code>
        <Detail>
            Bad Request. To know more visit https://aka.ms/sbResourceMgrExceptions. . TrackingId:b786d4d1-cbaf-47a8-a3d1-be689cda2a98_G22, SystemTracker:NoSystemTracker, Timestamp:2019-12-27T13:12:40
        </Detail>
    </Error>
    
  • Run the following command to check whether any port is blocked by the firewall. The ports used are 443 (HTTPS), 5671 (AMQP), and 9354 (Net Messaging/SBMP). Depending on the library you use, other ports may also be used. Here is a sample command that checks whether port 5671 is blocked. In PowerShell on Windows (tnc is an alias for Test-NetConnection):

    tnc <yournamespacename>.servicebus.chinacloudapi.cn -port 5671
    

    On Linux:

    telnet <yournamespacename>.servicebus.chinacloudapi.cn 5671
    
  • When there are intermittent connectivity issues, run the following command to check for dropped packets. This command tries to establish 25 TCP connections to the service, one every second. You can then check how many of them succeeded or failed and see the TCP connection latency. The psping tool is part of the Microsoft Sysinternals PsTools suite.

    .\psping.exe -n 25 -i 1 -q <yournamespace>.servicebus.chinacloudapi.cn:5671 -nobanner     
    

    You can use equivalent commands if you're using other tools such as tnc, ping, and so on.

  • If the previous steps don't help, obtain a network trace and analyze it with a tool such as Wireshark. Contact Azure Support if needed.
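
Where tnc, telnet, or psping aren't available, the port and latency checks above can be approximated with a short script. The following is a minimal sketch using only the Python standard library; the function names probe_tcp and probe_repeatedly are illustrative and not part of any Azure tooling, and the defaults simply mirror the psping invocation shown earlier (25 attempts, one per second).

```python
import socket
import time


def probe_tcp(host, port, timeout=3.0):
    """Return the TCP connect latency in seconds, or None if the connection fails."""
    start = time.perf_counter()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return None


def probe_repeatedly(host, port, attempts=25, interval=1.0, timeout=3.0):
    """Open `attempts` TCP connections, one per `interval` seconds, and summarize."""
    latencies = []
    failed = 0
    for i in range(attempts):
        latency = probe_tcp(host, port, timeout)
        if latency is None:
            failed += 1
        else:
            latencies.append(latency)
        if i < attempts - 1:
            time.sleep(interval)
    return {
        "succeeded": len(latencies),
        "failed": failed,
        "min_latency": min(latencies, default=None),
        "max_latency": max(latencies, default=None),
    }
```

For example, calling probe_repeatedly("<yournamespace>.servicebus.chinacloudapi.cn", 5671) roughly corresponds to the psping command, and calling probe_tcp for each of ports 443, 5671, and 9354 corresponds to the firewall check. A nonzero "failed" count on an otherwise reachable port suggests intermittent drops worth following up with a network trace.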

Issues that may occur with service upgrades/restarts

Symptoms

  • Requests may be momentarily throttled.
  • There may be a drop in incoming messages/requests.
  • The log file may contain error messages.
  • The applications may be disconnected from the service for a few seconds.

Cause

Backend service upgrades and restarts may cause these issues in your applications.

Resolution

If the application code uses the SDK, the retry policy is already built in and active. The application reconnects without significant impact to the application or workflow.
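
For code paths that don't go through the SDK's built-in retry policy, a similar effect can be approximated with a small retry wrapper. The sketch below illustrates generic retry with exponential backoff and jitter; it is not the SDK's actual retry policy. TransientError and call_with_retry are hypothetical names introduced here, and the attempt count and delays are arbitrary assumptions.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a short-lived failure such as a brief disconnect."""


def call_with_retry(operation, max_attempts=5, base_delay=0.5, max_delay=8.0,
                    sleep=time.sleep):
    """Call `operation`, retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            # Delay doubles on each attempt and is capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The sleep parameter is injectable so the wrapper can be exercised in tests without real waiting. Only errors that are known to be transient should be mapped to TransientError; authorization failures, for instance, won't succeed on retry.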

Unauthorized access: Send claims are required

Symptoms

You may see this error when attempting to access a Service Bus topic from Visual Studio on an on-premises computer using a user-assigned managed identity with send permissions.

Service Bus Error: Unauthorized access. 'Send' claim(s) are required to perform this operation.

Cause

The identity doesn't have permissions to access the Service Bus topic.

Resolution

To resolve this error, install the Microsoft.Azure.Services.AppAuthentication library. For more information, see Local development authentication.

To learn how to assign permissions to roles, see Authenticate a managed identity with Azure Active Directory to access Azure Service Bus resources.

Service Bus Exception: Put token failed

Symptoms

When you try to send more than 1000 messages using the same Service Bus connection, you receive the following error message:

Microsoft.Azure.ServiceBus.ServiceBusException: Put token failed. status-code: 403, status-description: The maximum number of '1000' tokens per connection has been reached.

Cause

There's a limit on the number of tokens that can be used to send and receive messages over a single connection to a Service Bus namespace. The limit is 1000.

Resolution

Open a new connection to the Service Bus namespace to send more messages.
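
One way to apply this resolution is to track token usage per connection and open a new connection before the limit is reached. The sketch below is library-agnostic and rests on stated assumptions: open_connection is a hypothetical factory that returns an object with send(entity, message) and close() methods, and each distinct target entity is assumed, for illustration only, to consume one token on the connection. It doesn't use or represent any Service Bus SDK API.

```python
TOKEN_LIMIT = 1000  # Per-connection limit quoted in the error message.


class RotatingSender:
    """Opens a fresh connection before the per-connection token budget is exhausted."""

    def __init__(self, open_connection, token_limit=TOKEN_LIMIT):
        self._open_connection = open_connection  # Hypothetical connection factory.
        self._token_limit = token_limit
        self._connection = None
        self._entities = set()  # Entities assumed to hold a token on this connection.
        self.connections_opened = 0

    def send(self, entity, message):
        needs_token = entity not in self._entities
        budget_spent = len(self._entities) >= self._token_limit
        if self._connection is None or (needs_token and budget_spent):
            self._rotate()
        self._entities.add(entity)
        self._connection.send(entity, message)

    def _rotate(self):
        # Close the exhausted connection (if any) and start with an empty budget.
        if self._connection is not None:
            self._connection.close()
        self._connection = self._open_connection()
        self._entities = set()
        self.connections_opened += 1
```

Repeated sends to an entity that already holds a token reuse the current connection, so a new connection is opened only when a new token would exceed the limit. A production version would also need to drain in-flight sends before closing the old connection.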

Next steps

See the following articles:

  • Azure Resource Manager exceptions. It lists the exceptions generated when interacting with Azure Service Bus using Azure Resource Manager (via templates or direct calls).
  • Messaging exceptions. It lists the exceptions generated by .NET Framework for Azure Service Bus.