Curriculum 29: Troubleshooting Playbook

Networking Failures & DNS

Troubleshooting Networking Failures

Network issues in Kubernetes range from DNS resolution failures to service connectivity problems. A systematic debugging approach saves time.

DNS Debugging

DNS is one of the most common sources of networking failures. CoreDNS resolves service names within the cluster, so start by confirming that a pod can resolve both a built-in name and your own service:

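# Run a temporary debug pod with DNS tools
# (busybox is pinned to 1.28 because nslookup in later busybox images is known to misbehave)
kubectl run dns-debug --rm -it --restart=Never --image=busybox:1.28 -- /bin/sh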

# Inside the debug pod, test DNS resolution
nslookup kubernetes.default.svc.cluster.local
nslookup my-service.production.svc.cluster.local

# Back outside the debug pod, check if CoreDNS pods are running
kubectl get pods -n kube-system -l k8s-app=kube-dns

# View CoreDNS logs for errors
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50

# Check CoreDNS configmap
kubectl get configmap coredns -n kube-system -o yaml
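
When a lookup fails, it can help to try every form of the service name, from the short name to the fully qualified one. The sketch below only builds those names; `svc_fqdn` and `svc_name_forms` are hypothetical helper names, and the `cluster.local` default is an assumption taken from the examples above (the cluster domain is configurable).

```shell
# Cluster domain. cluster.local matches the examples above, but clusters
# can be configured with a different domain, so allow an override.
CLUSTER_DOMAIN="${CLUSTER_DOMAIN:-cluster.local}"

# Print the fully qualified DNS name for a service.
# Usage: svc_fqdn <service> <namespace>
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "$CLUSTER_DOMAIN"
}

# Print each resolvable form of the name, shortest first, one per line.
# Usage: svc_name_forms <service> <namespace>
svc_name_forms() {
  printf '%s\n' "$1" "$1.$2" "$1.$2.svc" "$(svc_fqdn "$1" "$2")"
}
```

Inside the debug pod, something like `for n in $(svc_name_forms my-service production); do nslookup "$n"; done` tries each form in turn. The bare short name is only expected to resolve from a pod in the same namespace as the service, so a failure on the short form alone usually points at the namespace or search path rather than at CoreDNS itself.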

Service Connectivity

# Verify the service exists and has endpoints
kubectl get svc my-service -n production
kubectl get endpoints my-service -n production

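# If endpoints are empty, compare the service selector with the pod labels
kubectl get svc my-service -n production -o jsonpath='{.spec.selector}'
# Replace app=myapp with the selector printed above; no matching pods means no endpoints
kubectl get pods -n production -l app=myapp --show-labels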

# Test connectivity from within the cluster
kubectl run curl-test --rm -it --image=curlimages/curl -- \
  curl -v http://my-service.production.svc:8080/health
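
The selector comparison above can be done without retyping labels by hand. This is a minimal sketch, assuming kubectl prints `.spec.selector` as compact JSON (recent versions do; older ones may print a `map[...]` form instead, which this does not handle). `selector_to_query` and `pods_for_service` are hypothetical helper names.

```shell
# Convert a compact JSON selector such as {"app":"myapp","tier":"web"}
# into the label query form app=myapp,tier=web.
# Label keys and values cannot contain quotes, braces, or colons,
# so stripping those characters is safe for this input.
selector_to_query() {
  printf '%s' "$1" | tr -d '{}"' | tr ':' '='
}

# List the pods that a service's selector actually matches.
# Usage: pods_for_service <service> <namespace>
pods_for_service() {
  sel=$(kubectl get svc "$1" -n "$2" -o jsonpath='{.spec.selector}')
  if [ -z "$sel" ]; then
    echo "service $1 has no selector" >&2
    return 1
  fi
  kubectl get pods -n "$2" -l "$(selector_to_query "$sel")" -o wide
}
```

For example, `pods_for_service my-service production` lists the pods behind the service. An empty list while the application pods are running suggests the selector and the pod labels have drifted apart.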

Pod-to-Pod Communication

# Get the pod IP
kubectl get pod myapp -n production -o jsonpath='{.status.podIP}'

# Test direct pod-to-pod connectivity (replace 10.244.1.15 with the IP printed above)
kubectl run net-test --rm -it --image=busybox -- \
  wget -qO- http://10.244.1.15:8080

# Check if network policies are blocking traffic
kubectl get networkpolicies -n production

# Verify the CNI plugin is healthy
kubectl get pods -n kube-system | grep -E "calico|flannel|cilium"
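
To avoid copying the pod IP by hand, the two pod-to-pod steps can be chained. The following is a sketch rather than a definitive tool: `probe_pod` is a hypothetical helper name, and the 5-second timeout is an arbitrary choice.

```shell
# Look up a pod's IP and fetch http://<ip>:<port> from a throwaway busybox pod.
# Usage: probe_pod <pod> <namespace> <port>
probe_pod() {
  ip=$(kubectl get pod "$1" -n "$2" -o jsonpath='{.status.podIP}')
  if [ -z "$ip" ]; then
    # A pod that has not been scheduled or started has no IP yet
    echo "pod $1 in namespace $2 has no IP" >&2
    return 1
  fi
  # -T 5 sets a 5 second timeout so a blocked connection fails quickly
  kubectl run net-test --rm -i --restart=Never --image=busybox -- \
    wget -qO- -T 5 "http://$ip:$3"
}
```

Calling `probe_pod myapp production 8080` mirrors the manual commands above. Note that the test pod starts in the current namespace, so a network policy that only admits traffic from certain namespaces can make this fail even when the target pod is healthy.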

Common Fixes

# Restart CoreDNS if DNS is broken
kubectl rollout restart deployment/coredns -n kube-system

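# Start node-to-node checks by listing node status and addresses
kubectl get nodes -o wide  # Confirm each node is Ready and note its INTERNAL-IP, then test reachability between those addresses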

# Verify kube-proxy is running
kubectl get pods -n kube-system -l k8s-app=kube-proxy

When debugging, work from the application layer down: DNS first, then service endpoints, then pod IPs, then node networking.
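
That ordering can be sketched as a small script that runs one check per layer and stops at the first failure. The check functions and `run_checks` are hypothetical names, the service and namespace come from the examples above, and treating a single Running pod as healthy is a simplifying assumption.

```shell
# Each check returns 0 on success.
check_dns_pods() {
  kubectl get pods -n kube-system -l k8s-app=kube-dns --no-headers | grep -q Running
}

check_endpoints() {
  # Non-empty output means the service has at least one ready backend address
  [ -n "$(kubectl get endpoints my-service -n production \
      -o jsonpath='{.subsets[*].addresses[*].ip}')" ]
}

check_kube_proxy() {
  kubectl get pods -n kube-system -l k8s-app=kube-proxy --no-headers | grep -q Running
}

# Run the named checks in order and stop at the first one that fails.
run_checks() {
  for check in "$@"; do
    if "$check"; then
      echo "ok: $check"
    else
      echo "FAILED: $check"
      return 1
    fi
  done
}
```

Running `run_checks check_dns_pods check_endpoints check_kube_proxy` reports the first layer that fails, which is where to focus the manual commands from the matching section.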