Diagnose connection issues for Azure Arc-enabled Kubernetes clusters
If you are experiencing issues connecting a cluster to Azure Arc, it's probably due to one of the issues listed here. We provide two flowcharts with guided help: one if you're not using a proxy server, and one that applies if your network connection uses a proxy server.
Tip
The steps in this flowchart apply whether you're using Azure CLI or Azure PowerShell to connect your cluster. However, some of the steps require the use of Azure CLI. If you haven't already installed Azure CLI, be sure to do so before you begin.
Connections without a proxy
Review this flowchart in order to diagnose your issue when attempting to connect a cluster to Azure Arc without a proxy server. More details about each step are provided below.
Does the Azure identity have sufficient permissions?
Review the prerequisites for connecting a cluster and make sure that the identity you're using to connect the cluster has the necessary permissions.
Are you running the latest version of Azure CLI?
Make sure you have the latest version installed.
If you connected your cluster by using Azure PowerShell, make sure you are running the latest version.
Is the connectedk8s
extension the latest version?
Update the Azure CLI connectedk8s
extension to the latest version by running this command:
az extension update --name connectedk8s
If you haven't installed the extension yet, you can do so by running the following command:
az extension add --name connectedk8s
Is kubeconfig pointing to the right cluster?
Run kubectl config get-contexts
to confirm the target context name. Then set the default context to the right cluster by running kubectl config use-context <target-cluster-name>
.
Are all required resource providers registered?
Be sure that the Microsoft.Kubernetes, Microsoft.KubernetesConfiguration, and Microsoft.ExtendedLocation resource providers are registered.
Are all network requirements met?
Review the network requirements and ensure that no required endpoints are blocked.
Are all pods in the azure-arc
namespace running?
If everything is working correctly, your pods should all be in the Running
state. Run kubectl get pods -n azure-arc
to confirm whether any pod's state is not Running
.
Still having problems?
The steps above will resolve many common connection issues, but if you're still unable to connect successfully, generate a troubleshooting log file and then open a support request so we can investigate the problem further.
To generate the troubleshooting log file, run the following command:
az connectedk8s troubleshoot -g <myResourceGroup> -n <myK8sCluster>
When you create your support request, in the Additional details section, use the File upload option to upload the generated log file.
Connections with a proxy server
If you are using a proxy server on at least one machine, complete the first five steps of the non-proxy flowchart (through resource provider registration) for basic troubleshooting steps. Then, if you are still encountering issues, review the next flowchart for additional troubleshooting steps. More details about each step are provided below.
Is the machine executing commands behind a proxy server?
If the machine is executing commands behind a proxy server, you'll need to set all of the necessary environment variables. For more information, see Connect using an outbound proxy server.
For example:
export HTTP_PROXY="http://<proxyIP>:<proxyPort>"
export HTTPS_PROXY="https://<proxyIP>:<proxyPort>"
export NO_PROXY="<cluster-apiserver-ip-address>:<proxyPort>"
Does the proxy server only accept trusted certificates?
Be sure to include the certificate file path by including --proxy-cert <path-to-cert-file>
when running the az connectedk8s connect
command.
az connectedk8s connect --name <cluster-name> --resource-group <resource-group> --proxy-cert <path-to-cert-file>
Is the proxy server able to reach required network endpoints?
Review the network requirements and ensure that no required endpoints are blocked.
Is the proxy server only using HTTP?
If your proxy server only uses HTTP, you can use proxy-http
for both parameters.
If your proxy server is set up with both HTTP and HTTPS, run the az connectedk8s connect
command with the --proxy-https
and --proxy-http
parameters specified. Be sure you are using --proxy-http
for the HTTP proxy and --proxy-https
for the HTTPS proxy.
az connectedk8s connect --name <cluster-name> --resource-group <resource-group> --proxy-https https://<proxy-server-ip-address>:<port> --proxy-http http://<proxy-server-ip-address>:<port>
Does the proxy server require skip ranges for service-to-service communication?
If you require skip ranges, use --proxy-skip-range <excludedIP>,<excludedCIDR>
in your az connectedk8s connect
command.
az connectedk8s connect --name <cluster-name> --resource-group <resource-group> --proxy-https https://<proxy-server-ip-address>:<port> --proxy-http http://<proxy-server-ip-address>:<port> --proxy-skip-range <excludedIP>,<excludedCIDR>
Are all pods in the azure-arc
namespace running?
If everything is working correctly, your pods should all be in the Running
state. Run kubectl get pods -n azure-arc
to confirm whether any pod's state is not Running
.
Check whether the DNS resolution is successful for the endpoint
From within the pod, you can run a DNS lookup to the endpoint.
What if you can't run the kubectl exec command to connect to the pod and install the DNS Utils package? In this situation, you can start a test pod in the same namespace as the problematic pod, and then run the tests.
Note
If the DNS resolution or egress traffic doesn't let you install the necessary network packages, you can use the rishasi/ubuntu-netutil:1.0
docker image. In this image, the required packages are already installed.
Here's an example procedure for checking DNS resolution:
Start a test pod in the same namespace as the problematic pod:
kubectl run -it --rm test-pod --namespace <namespace> --image=debian:stable
After the test pod is running, you'll gain access to the pod.
Run the following
apt-get
commands to install other tool packages:apt-get update -y apt-get install dnsutils -y apt-get install curl -y apt-get install netcat -y
After the packages are installed, run the nslookup command to test the DNS resolution to the endpoint:
$ nslookup microsoft.com Server: 10.0.0.10 Address: 10.0.0.10#53 ... ... Name: microsoft.com Address: 20.53.203.50
Try the DNS resolution from the upstream DNS server directly. This example uses Azure DNS:
$ nslookup microsoft.com 168.63.129.16 Server: 168.63.129.16 Address: 168.63.129.16#53 ... ... Address: 20.81.111.85
Run the
host
command to check whether the DNS requests are routed to the upstream server:$ host -a microsoft.com Trying "microsoft.com.default.svc.cluster.local" Trying "microsoft.com.svc.cluster.local" Trying "microsoft.com.cluster.local" Trying "microsoft.com.00idcnmrrm4edot5s2or1onxsc.bx.internal.chinacloudapp.cn" Trying "microsoft.com" Trying "microsoft.com" ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62884 ;; flags: qr rd ra; QUERY: 1, ANSWER: 27, AUTHORITY: 0, ADDITIONAL: 5 ;; QUESTION SECTION: ;microsoft.com. IN ANY ;; ANSWER SECTION: microsoft.com. 30 IN NS ns1-39.azure-dns.com. ... ... ns4-39.azure-dns.info. 30 IN A 13.107.206.39 Received 2121 bytes from 10.0.0.10#53 in 232 ms
Run a test pod in the Windows node pool:
# For a Windows environment, use the Resolve-DnsName cmdlet. kubectl run dnsutil-win --image='mcr.microsoft.com/windows/servercore:1809' --overrides='{"spec": { "nodeSelector": {"kubernetes.io/os": "windows"}}}' -- powershell "Start-Sleep -s 3600"
Run the kubectl exec command to connect to the pod by using PowerShell:
kubectl exec -it dnsutil-win powershell
Run the Resolve-DnsName cmdlet in PowerShell to check whether the DNS resolution is working for the endpoint:
PS C:\> Resolve-DnsName www.microsoft.com Name Type TTL Section NameHost ---- ---- --- ------- -------- www.microsoft.com CNAME 20 Answer www.microsoft.com-c-3.edgekey.net www.microsoft.com-c-3.edgekey. CNAME 20 Answer www.microsoft.com-c-3.edgekey.net.globalredir.akadns.net net www.microsoft.com-c-3.edgekey. CNAME 20 Answer e13678.dscb.akamaiedge.net net.globalredir.akadns.net Name : e13678.dscb.akamaiedge.net QueryType : AAAA TTL : 20 Section : Answer IP6Address : 2600:1408:c400:484::356e Name : e13678.dscb.akamaiedge.net QueryType : AAAA TTL : 20 Section : Answer IP6Address : 2600:1408:c400:496::356e Name : e13678.dscb.akamaiedge.net QueryType : A TTL : 12 Section : Answer IP4Address : 23.200.197.152
If the DNS resolution is not successful, verify the DNS configuration for the cluster.
Still having problems?
The steps above will resolve many common connection issues, but if you're still unable to connect successfully, generate a troubleshooting log file and then open a support request so we can investigate the problem further.
To generate the troubleshooting log file, run the following command:
az connectedk8s troubleshoot -g <myResourceGroup> -n <myK8sCluster>
When you create your support request, in the Additional details section, use the File upload option to upload the generated log file.
Next steps
- View more troubleshooting tips for using Azure Arc-enabled Kubernetes.
- Review the process to connect an existing Kubernetes cluster to Azure Arc.