Troubleshoot connectivity issues - Azure Event Hubs

If your client application can't connect to an event hub, use this article to diagnose and resolve the issue. Connectivity problems fall into two categories: permanent issues (the connection never succeeds) and transient issues (intermittent failures).

For permanent issues, check these settings and other options mentioned in the Troubleshoot permanent connectivity issues section:

  • Connection string
  • Your organization's firewall settings
  • IP firewall settings
  • Network security settings (service endpoints, private endpoints, and more)

For transient issues, try the following options. For more information, see Troubleshoot transient connectivity problems.

  • Upgrade to the latest version of the SDK
  • Run commands to check for dropped packets
  • Obtain network traces

Troubleshoot permanent connectivity issues

If the application can't connect to the event hub at all, follow the steps in this section to troubleshoot the issue.

Check if there's a service outage

Check the Azure service status site for an Azure Event Hubs service outage.

Verify the connection string

Verify that the connection string you're using is correct. See Get connection string to get the connection string by using the Azure portal, CLI, or PowerShell.
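As a quick local sanity check before contacting the service, the structure of a connection string can be inspected in code. The following is a minimal sketch, assuming the common shared access key form (Endpoint, SharedAccessKeyName, SharedAccessKey, and an optional EntityPath). The function names parse_connection_string and find_problems are hypothetical helpers, not part of any SDK, and the check doesn't prove that the key itself is valid.

```python
# Assumption: the connection string uses the shared access key form.
REQUIRED_KEYS = {"Endpoint", "SharedAccessKeyName", "SharedAccessKey"}


def parse_connection_string(conn_str: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs into a dictionary."""
    parts = {}
    for segment in conn_str.strip().split(";"):
        if not segment:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only: base64 keys often end with '='.
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"Segment without '=': {segment!r}")
        parts[key.strip()] = value.strip()
    return parts


def find_problems(conn_str: str) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    parts = parse_connection_string(conn_str)
    problems = [f"missing {key}" for key in sorted(REQUIRED_KEYS - parts.keys())]
    endpoint = parts.get("Endpoint", "")
    if endpoint and not endpoint.startswith("sb://"):
        problems.append("Endpoint should start with sb://")
    return problems
```

An empty list from find_problems only means that the string is well formed. A wrong namespace name or a rotated key still has to be verified against the values shown in the Azure portal.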

For Kafka clients, verify that the producer.config and consumer.config files are configured properly. For more information, see Send and receive messages with Kafka in Event Hubs.

What protocols can I use to send and receive events?

Producers or senders can use Advanced Message Queuing Protocol (AMQP), Kafka, or HTTPS protocols to send events to an event hub.

Consumers or receivers use AMQP or Kafka to receive events from an event hub. Event Hubs supports only the pull model for consumers to receive events from it. Even when you use event handlers to handle events from an event hub, the event processor internally uses the pull model to receive events from the event hub.

AMQP

You can use the AMQP 1.0 protocol to send events to and receive events from Azure Event Hubs. AMQP provides reliable, performant, and secure communication for both sending and receiving events. It's suited to high-performance and real-time streaming, and it's supported by most Azure Event Hubs SDKs.

HTTPS/REST API

Over HTTPS, you can only send events to Event Hubs, by using HTTP POST requests. Event Hubs doesn't support receiving events over HTTPS. This option is suitable for lightweight clients where a direct TCP connection isn't feasible.

Apache Kafka

Azure Event Hubs has a built-in Kafka endpoint that supports Kafka producers and consumers. Applications that are built using Kafka can use the Kafka protocol (version 1.0 or later) to send events to and receive events from Event Hubs without any code changes.

Azure SDKs abstract the underlying communication protocols and provide a simplified way to send events to and receive events from Event Hubs in languages such as C#, Java, Python, and JavaScript.

What ports do I need to open on the firewall?

You can use the following protocols with Azure Event Hubs to send and receive events:

  • Advanced Message Queuing Protocol 1.0 (AMQP)
  • Hypertext Transfer Protocol 1.1 with Transport Layer Security (HTTPS)
  • Apache Kafka

See the following table for the outbound ports you need to open to use these protocols to communicate with Azure Event Hubs.

Protocol   Ports           Details
AMQP       5671 and 5672   See AMQP protocol guide
HTTPS      443             This port is used for the HTTP/REST API and for AMQP-over-WebSockets.
Kafka      9093            See Use Event Hubs from Kafka applications

Port 443 (HTTPS) must also be open for outbound communication when AMQP is used over port 5671, because several management operations performed by the client SDKs and the acquisition of tokens from Microsoft Entra ID (when used) run over HTTPS.
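To see which of these outbound ports a machine can actually reach, a plain TCP connection attempt is often enough. The following is a minimal sketch that uses only the Python standard library. The names PORTS, can_connect, and probe_ports are hypothetical, and a successful TCP connection shows only that the firewall lets the traffic through, not that authentication will succeed.

```python
import socket

# Outbound ports taken from the table above.
PORTS = {
    "AMQP (5671)": 5671,
    "AMQP (5672)": 5672,
    "HTTPS / AMQP-over-WebSockets": 443,
    "Kafka": 9093,
}


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def probe_ports(host: str, ports: dict = PORTS) -> dict:
    """Map each protocol label to True (reachable) or False (blocked)."""
    return {label: can_connect(host, port) for label, port in ports.items()}
```

For example, probe_ports("<yournamespacename>.servicebus.chinacloudapi.cn") returns one True or False value per protocol. A False value for a protocol that you rely on points at a firewall or proxy rule to review.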

The official Azure SDKs generally use the AMQP protocol for sending and receiving events from Event Hubs. The AMQP-over-WebSockets protocol option runs over TCP port 443 just like the HTTP API, but is otherwise functionally identical to plain AMQP. This option has higher initial connection latency because of extra handshake round trips, and slightly more overhead, as a tradeoff for sharing the HTTPS port. If this mode is selected, TCP port 443 is sufficient for communication. The following options allow selecting the plain AMQP or AMQP WebSockets mode:

Language   Option
.NET       EventHubConnectionOptions.TransportType property with EventHubsTransportType.AmqpTcp or EventHubsTransportType.AmqpWebSockets
Java       com.azure.messaging.eventhubs.EventProcessorClientBuilder.transportType with AmqpTransportType.AMQP or AmqpTransportType.AMQP_WEB_SOCKETS
Node       EventHubConsumerClientOptions has a webSocketOptions property.
Python     EventHubConsumerClient.transport_type with TransportType.Amqp or TransportType.AmqpOverWebsocket

What IP addresses do I need to allow?

When you're working with Azure, you sometimes have to allow specific IP address ranges or URLs in your corporate firewall or proxy to access the Azure services that you use. Verify that traffic is allowed on the IP addresses used by Event Hubs. For these IP addresses, see Azure IP Ranges and Service Tags - Public Cloud.

Also, verify that the IP address for your namespace is allowed. To find the right IP addresses to allow for your connections, follow these steps:

  1. Run the following command from a command prompt:

    nslookup <YourNamespaceName>.servicebus.chinacloudapi.cn
    
  2. Note down the IP address returned in Non-authoritative answer.

If you use zone redundancy for your namespace, you need a few extra steps:

  1. Run nslookup on the namespace:

    nslookup <yournamespace>.servicebus.chinacloudapi.cn
    
  2. Note down the name in the non-authoritative answer section, which is in one of the following formats:

    <name>-s1.chinacloudapp.cn
    <name>-s2.chinacloudapp.cn
    <name>-s3.chinacloudapp.cn
    
  3. Run nslookup for each name with the suffixes s1, s2, and s3 to get the IP addresses of all three instances running in the three availability zones.

    Note

    The IP address returned by the nslookup command isn't a static IP address. However, it remains constant until the underlying deployment is deleted or moved to a different cluster.
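The lookups above can also be scripted. The following is a minimal sketch that uses the Python standard library resolver instead of nslookup. The helper names resolve_ips and zone_hostnames are hypothetical, and the sketch assumes that the zone-redundant name ends in -s1, -s2, or -s3 exactly as shown in the formats above.

```python
import socket


def resolve_ips(hostname: str) -> list:
    """Return the sorted, de-duplicated IP addresses that DNS reports."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the IP address is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


def zone_hostnames(cname: str) -> list:
    """Given '<name>-s1.<suffix>', return the s1, s2, and s3 variants."""
    label, _, suffix = cname.partition(".")
    base, sep, zone = label.rpartition("-")
    if not sep or zone not in ("s1", "s2", "s3"):
        raise ValueError(f"Unexpected zone-redundant name: {cname!r}")
    return [f"{base}-s{index}.{suffix}" for index in (1, 2, 3)]
```

Calling resolve_ips on each name returned by zone_hostnames gives the addresses to allow. Because the addresses aren't static, rerun the lookup after a namespace is moved or recreated.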

What client IPs are sending events to or receiving events from my namespace?

First, enable IP filtering on the namespace.

Then, enable diagnostic logs for Event Hubs virtual network connection events by following the instructions in Enable diagnostic logs. The logs show the IP addresses for which connections are denied, as in the following example:

{
    "SubscriptionId": "0000000-0000-0000-0000-000000000000",
    "NamespaceName": "namespace-name",
    "IPAddress": "1.2.3.4",
    "Action": "Deny Connection",
    "Reason": "IPAddress doesn't belong to a subnet with Service Endpoint enabled.",
    "Count": "65",
    "ResourceId": "/subscriptions/0000000-0000-0000-0000-000000000000/resourcegroups/testrg/providers/microsoft.eventhub/namespaces/namespace-name",
    "Category": "EventHubVNetConnectionEvent"
}
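Once such records are exported, they can be summarized per client IP address. The following is a minimal sketch, assuming each record has already been loaded into a dictionary with the same fields as the example above. How the records are exported and loaded depends on your diagnostic settings and isn't shown, and denied_connections is a hypothetical helper name.

```python
import json
from collections import Counter


def denied_connections(records) -> Counter:
    """Sum the 'Count' of denied connections per client IP address."""
    totals = Counter()
    for record in records:
        if record.get("Category") != "EventHubVNetConnectionEvent":
            continue  # ignore other log categories
        if record.get("Action") == "Deny Connection":
            # 'Count' is a string in the sample record, so convert it.
            totals[record["IPAddress"]] += int(record.get("Count", 1))
    return totals


# Example with one record in the shape shown above.
sample = json.loads(
    '{"IPAddress": "1.2.3.4", "Action": "Deny Connection", '
    '"Count": "65", "Category": "EventHubVNetConnectionEvent"}'
)
summary = denied_connections([sample])
```

Sorting the result with summary.most_common() puts the most frequently denied clients first, which helps to spot a missing firewall rule.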

Important

Virtual network logs are generated only if the namespace allows access from specific IP addresses (IP filter rules). If you don't want to restrict access to your namespace using these features and still want to get virtual network logs to track IP addresses of clients connecting to the Event Hubs namespace, you could use the following workaround: Enable IP filtering, and add the total addressable IPv4 range (0.0.0.0/1 - 128.0.0.0/1) and IPv6 range (::/1 - 8000::/1).
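The two /1 ranges in this workaround together span the entire address space, which can be confirmed with the Python standard library ipaddress module. The following is a small sketch, and covers_everything is a hypothetical helper name.

```python
import ipaddress


def covers_everything(cidrs) -> bool:
    """Return True if the CIDR ranges merge into a single /0 network."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    # collapse_addresses merges adjacent ranges; all inputs must share
    # one IP version, so check IPv4 and IPv6 separately.
    merged = list(ipaddress.collapse_addresses(networks))
    return len(merged) == 1 and merged[0].prefixlen == 0


ipv4_ok = covers_everything(["0.0.0.0/1", "128.0.0.0/1"])
ipv6_ok = covers_everything(["::/1", "8000::/1"])
```

Both checks evaluate to True, so every client address matches one of the rules while IP filtering stays enabled and logs are generated.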

Note

Currently, it's not possible to determine the source IP of an individual message or event.

Verify that Event Hubs service tag is allowed in your network security groups

If your application runs inside a subnet that has an associated network security group, confirm that either internet outbound traffic or the Event Hubs service tag (EventHub) is allowed. See Virtual network service tags and search for EventHub.

Check if the application needs to be running in a specific subnet of a virtual network

Confirm that your application runs in a virtual network subnet that has access to the namespace. If it doesn't, run the application in a subnet that has access to the namespace, or add the IP address of the machine on which the application is running to the IP firewall.

When you create a virtual network service endpoint for an Event Hubs namespace, the namespace accepts traffic only from the subnet that's bound to the service endpoint. There's one exception to this behavior: you can add specific IP addresses in the IP firewall to enable access to the namespace's public endpoint. For more information, see Network service endpoints.

Check the IP firewall settings for your namespace

Check that the public IP address of the machine on which the application is running isn't blocked by the IP firewall.

By default, Event Hubs namespaces are accessible from the internet as long as the request comes with valid authentication and authorization. With the IP firewall, you can restrict access further to only a set of IPv4 addresses or IPv4 address ranges in CIDR (Classless Inter-Domain Routing) notation.

The IP firewall rules are applied at the Event Hubs namespace level. Therefore, the rules apply to all connections from clients using any supported protocol. Any connection attempt from an IP address that doesn't match an allowed IP rule on the Event Hubs namespace is rejected as unauthorized. The response doesn't mention the IP rule. IP filter rules are applied in order, and the first rule that matches the IP address determines the accept or reject action.
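The first-match behavior described above can be illustrated locally, for example to predict how a given client address would be treated by a list of rules. The following is an illustrative sketch only, not the service's implementation. The evaluate function, the (CIDR, action) rule format, and the "Accept" and "Reject" labels are all assumptions made for the example.

```python
import ipaddress


def evaluate(client_ip: str, rules) -> str:
    """Return the action of the first rule whose range contains client_ip.

    'rules' is an ordered list of (cidr, action) pairs. An address that
    matches no rule is rejected, mirroring the behavior described above.
    """
    address = ipaddress.ip_address(client_ip)
    for cidr, action in rules:
        if address in ipaddress.ip_network(cidr):
            return action  # first match wins; later rules are ignored
    return "Reject"


# Hypothetical rule list using documentation-only address ranges.
rules = [
    ("203.0.113.0/24", "Accept"),
    ("203.0.113.7", "Reject"),  # never reached: the /24 above matches first
]
first = evaluate("203.0.113.7", rules)
second = evaluate("198.51.100.1", rules)
```

Here first is "Accept" because the broader range is listed before the single address, and second is "Reject" because no rule matches. Rule order therefore matters when ranges overlap.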

For more information, see Configure IP firewall rules for an Azure Event Hubs namespace. To check whether you have IP filtering, virtual network, or certificate chain issues, see Troubleshoot network related issues.

To troubleshoot network-related problems with Event Hubs, browse to or use wget to access https://<yournamespacename>.servicebus.chinacloudapi.cn/. This step helps you check whether you have IP filtering, virtual network, or certificate chain problems (most common when using the Java SDK).

Example of a successful response:

<feed xmlns="http://www.w3.org/2005/Atom"><title type="text">Publicly Listed Services</title><subtitle type="text">This is the list of publicly-listed services currently available.</subtitle><id>uuid:27fcd1e2-3a99-44b1-8f1e-3e92b52f0171;id=30</id><updated>2019-12-27T13:11:47Z</updated><generator>Service Bus 1.1</generator></feed>

Example of a failure response:

<Error>
    <Code>400</Code>
    <Detail>
        Bad Request. To know more visit https://aka.ms/sbResourceMgrExceptions. . TrackingId:b786d4d1-cbaf-47a8-a3d1-be689cda2a98_G22, SystemTracker:NoSystemTracker, Timestamp:2019-12-27T13:12:40
    </Detail>
</Error>
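The same check can be scripted when a browser or wget isn't available. The following is a minimal sketch that uses only the Python standard library. The function names classify and check_namespace and the result labels are hypothetical, and the classification simply looks for the <feed and <Error> markers seen in the two example responses above.

```python
import urllib.error
import urllib.request


def classify(body: str) -> str:
    """Label a response body based on the two examples above."""
    if "<feed" in body:
        return "reachable"
    if "<Error>" in body:
        return "rejected"
    return "unknown"


def check_namespace(namespace: str, timeout: float = 10.0) -> str:
    """Fetch the namespace root URL and classify the outcome."""
    url = f"https://{namespace}.servicebus.chinacloudapi.cn/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses still carry a body worth inspecting.
        body = err.read().decode("utf-8", errors="replace")
    except OSError as err:
        # DNS failures, timeouts, and TLS certificate errors end up here.
        return f"connection failed: {err}"
    return classify(body)
```

A "connection failed" result that mentions certificate verification suggests a certificate chain problem, while "rejected" suggests that the request reached the service and was refused, for example by an IP filter or virtual network rule.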

Troubleshoot transient connectivity problems

If you're experiencing intermittent connectivity problems, go through the following sections for troubleshooting tips.

Use the latest version of the client SDK

Later versions of the SDK might fix some transient connectivity problems. Make sure you're using the latest version of client SDKs in your applications. SDKs are continuously improved with new or updated features and bug fixes, so always test with the latest package. Check the release notes for fixed problems and added or updated features.

For information about client SDKs, see the Azure Event Hubs - Client SDKs article.

Run the command to check dropped packets

When there are intermittent connectivity issues, run the following command to check for dropped packets. The command makes 25 TCP connection attempts to the service, one per second. You can then check how many of them succeeded or failed and see the TCP connection latency. The psping tool is available from the Windows Sysinternals site.

.\psping.exe -n 25 -i 1 -q <yournamespacename>.servicebus.chinacloudapi.cn:5671 -nobanner     

You can use equivalent commands if you're using other tools, such as tnc or ping.

If the previous steps don't help, obtain a network trace and analyze it by using a tool such as Wireshark. Contact Azure Support if needed.
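On machines where psping isn't available, a similar measurement can be approximated in a few lines of code. The following is a rough sketch that uses only the Python standard library. It isn't a replacement for psping, and the function name probe_connections and the shape of its result are hypothetical.

```python
import socket
import time


def probe_connections(host, port, attempts=25, interval=1.0, timeout=5.0):
    """Open 'attempts' TCP connections, one every 'interval' seconds.

    Returns the number of failed attempts and the average and maximum
    connection latency in milliseconds (None if nothing succeeded).
    """
    latencies = []
    failed = 0
    for attempt in range(attempts):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                latencies.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            failed += 1  # refused, timed out, or unreachable
        if attempt < attempts - 1:
            time.sleep(interval)
    return {
        "attempts": attempts,
        "failed": failed,
        "avg_ms": sum(latencies) / len(latencies) if latencies else None,
        "max_ms": max(latencies) if latencies else None,
    }
```

For example, probe_connections("<yournamespacename>.servicebus.chinacloudapi.cn", 5671) mirrors the psping command above. A nonzero failed count or a large gap between avg_ms and max_ms suggests packet loss or an unstable network path.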

Service upgrades and restarts

Transient connectivity problems can happen because of backend service upgrades and restarts. When these problems happen, you might see the following symptoms:

  • Incoming messages or requests drop.
  • The log file shows error messages.
  • The applications disconnect from the service for a few seconds.
  • Requests are momentarily throttled.

If your application code uses the SDK, a retry policy is already built in and active. The application reconnects without significant impact to the application or workflow. By catching these transient errors, backing off, and then retrying the call, you make sure your code can handle these transient problems.
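For code paths that don't go through the SDK's retry policy, the catch, back off, and retry pattern can be sketched as follows. This is a generic illustration using only the Python standard library, not the SDK's own retry implementation. The function name call_with_retries, its parameters, and the choice of exception types to retry are assumptions.

```python
import random
import time


def call_with_retries(operation, max_attempts=5, base_delay=0.5,
                      max_delay=30.0,
                      retry_on=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call operation(), retrying transient errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Delay doubles on each attempt, capped at max_delay, plus
            # random jitter so that many clients don't retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
```

The sleep parameter exists only so that the wait can be replaced in tests. In application code, the default time.sleep is used, and operation would wrap the send or receive call that occasionally fails during service upgrades and restarts.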
