Troubleshooting Networking Failures
Network issues in Kubernetes range from DNS resolution failures to service connectivity problems. Checking each layer in a fixed order narrows down the cause faster than ad hoc probing.
DNS Debugging
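DNS is a frequent source of networking failures, so check it first. CoreDNS resolves service names within the cluster:
# Run a temporary debug pod with DNS tools
# (busybox:1.28 is pinned because nslookup in some later busybox images is known to misbehave)
kubectl run dns-debug --rm -it --restart=Never --image=busybox:1.28 -- /bin/sh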
# Inside the debug pod, test DNS resolution
nslookup kubernetes.default.svc.cluster.local
nslookup my-service.production.svc.cluster.local
# Check if CoreDNS pods are running
kubectl get pods -n kube-system -l k8s-app=kube-dns
# View CoreDNS logs for errors
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50
# Check CoreDNS configmap
kubectl get configmap coredns -n kube-system -o yaml
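The lookups above can be wrapped in a small helper so the same check runs against several services. This is a minimal sketch under stated assumptions: `svc_fqdn` and `check_dns` are hypothetical helper names, the cluster domain is assumed to be the default `cluster.local`, and the result depends on `kubectl run` passing through the exit status of `nslookup`, which is worth verifying on your kubectl and busybox versions.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of kubectl.

# Build the fully qualified in-cluster name: <service>.<namespace>.svc.<domain>.
# Assumes the default cluster domain "cluster.local" unless a third argument is given.
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Resolve a service name from a throwaway pod and report the result.
# The pod name includes the shell PID to avoid clashing with a leftover pod.
check_dns() {
  name=$(svc_fqdn "$1" "$2")
  if kubectl run "dns-check-$$" --rm -i --restart=Never --quiet \
       --image=busybox:1.28 -- nslookup "$name" >/dev/null 2>&1; then
    echo "OK   $name"
  else
    echo "FAIL $name"
    return 1
  fi
}
```

For example, `check_dns kubernetes default` followed by `check_dns my-service production` separates a cluster-wide DNS outage (both fail) from a problem with one service name (only the second fails).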
Service Connectivity
# Verify the service exists and has endpoints
kubectl get svc my-service -n production
kubectl get endpoints my-service -n production
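# If endpoints are empty, compare the service selector with the pod labels
kubectl get svc my-service -n production -o jsonpath='{.spec.selector}'
# Query pods with the selector printed above (app=myapp is an example value)
kubectl get pods -l app=myapp -n production --show-labels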
# Test connectivity from within the cluster
kubectl run curl-test --rm -it --image=curlimages/curl -- \
  curl -v http://my-service.production.svc:8080/health
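The endpoint check above can be turned into a pass/fail test for use in scripts. The following is a sketch, not a definitive implementation: `has_endpoints` is a hypothetical helper name, and it reads only the ready addresses from the Endpoints object, so pods that exist but fail their readiness probe count as "no endpoints".

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the service has at least one ready endpoint IP.
# Arguments: <service> <namespace>
has_endpoints() {
  ips=$(kubectl get endpoints "$1" -n "$2" \
        -o jsonpath='{.subsets[*].addresses[*].ip}' 2>/dev/null)
  if [ -n "$ips" ]; then
    echo "endpoints for $1: $ips"
  else
    echo "no ready endpoints for $1; compare the service selector with pod labels"
    return 1
  fi
}
```

Running `has_endpoints my-service production` before the curl test avoids chasing a connection failure that is really a selector mismatch or a failing readiness probe.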
Pod-to-Pod Communication
# Get the pod IP
kubectl get pod myapp -n production -o jsonpath='{.status.podIP}'
# Test direct pod-to-pod connectivity (substitute the IP printed above; 10.244.1.15 is an example)
kubectl run net-test --rm -it --image=busybox -- \
  wget -qO- http://10.244.1.15:8080
# Check if network policies are blocking traffic
kubectl get networkpolicies -n production
# Verify the CNI plugin is healthy
kubectl get pods -n kube-system | grep -E "calico|flannel|cilium"
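Pod IPs change whenever a pod is rescheduled, so it can help to look the address up at test time instead of copying it by hand. A minimal sketch, assuming a hypothetical helper named `pod_url` and an application that serves plain HTTP on the given port:

```shell
#!/bin/sh
# Hypothetical helper: print http://<podIP>:<port> for a pod,
# or fail if the pod has no IP assigned yet.
# Arguments: <pod> <namespace> <port>
pod_url() {
  ip=$(kubectl get pod "$1" -n "$2" -o jsonpath='{.status.podIP}' 2>/dev/null)
  if [ -z "$ip" ]; then
    echo "pod $1 has no IP yet" >&2
    return 1
  fi
  printf 'http://%s:%s\n' "$ip" "$3"
}
```

The output can be fed straight into the connectivity test, for example `kubectl run net-test --rm -it --image=busybox -- wget -qO- "$(pod_url myapp production 8080)"`.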
Common Fixes
# Restart CoreDNS if DNS is broken
kubectl rollout restart deployment/coredns -n kube-system
# List node IPs as a starting point for node-to-node checks
# (this only shows the addresses; test reachability by pinging one node's INTERNAL-IP from another node)
kubectl get nodes -o wide
# Verify kube-proxy is running
kubectl get pods -n kube-system -l k8s-app=kube-proxy
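A restart only helps once the new CoreDNS pods are actually ready, so it is worth waiting for the rollout to finish before retesting DNS. The sketch below assumes the deployment is named `coredns` in `kube-system`, as in the command above; `restart_coredns` is a hypothetical helper name and the 120s default timeout is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: restart CoreDNS and block until the rollout completes.
# Optional argument: timeout passed to "kubectl rollout status" (default 120s).
restart_coredns() {
  kubectl rollout restart deployment/coredns -n kube-system &&
  kubectl rollout status deployment/coredns -n kube-system --timeout="${1:-120s}"
}
```

If the restart command itself fails, the status check is skipped and the function returns non-zero, so a calling script can stop instead of retesting against the old pods.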
When debugging, work from the application layer down: DNS first, then service endpoints, then pod IPs, then node networking. The first check that fails tells you which layer to investigate.
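That ordering can be expressed as a small driver that runs one check per layer and stops at the first failure. This is a sketch only: `first_failing_layer` is a hypothetical name, and each argument is assumed to be a shell function or command you define yourself that returns non-zero when its layer is broken.

```shell
#!/bin/sh
# Hypothetical driver: run each named check in order and stop at the first failure.
# Each argument is the name of a function or command that returns non-zero on failure.
first_failing_layer() {
  for check in "$@"; do
    if "$check" >/dev/null 2>&1; then
      echo "pass: $check"
    else
      echo "fail: $check (start debugging here)"
      return 1
    fi
  done
  echo "all layers passed"
}
```

A call such as `first_failing_layer dns_layer endpoints_layer pod_layer node_layer` would then report the highest layer that is broken, where the four function names stand for checks built from the commands in the sections above.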