Troubleshooting CoreDNS Latency and Loop Errors in Kubernetes

There are few situations more frustrating in a distributed system than intermittent network failures. You check the application logs, and everything looks fine. You check the ingress controller, and you see "502 Bad Gateway." You check the pods, and they are running.

Then you dig deeper. You find that your application pods are timing out while trying to resolve internal service names, or worse, your CoreDNS pods are stuck in CrashLoopBackOff with a cryptic log message: [FATAL] plugin/loop: Loop detected.

DNS is the circulatory system of Kubernetes. When it fails, the cluster doesn't die immediately—it degrades in agonizing, difficult-to-trace ways. This guide breaks down the root causes of CoreDNS loops and latency, and provides production-grade configurations to fix them.

The Root Cause: Why CoreDNS Breaks

To fix DNS, you must understand how Kubernetes handles name resolution. By default, Kubernetes deploys CoreDNS as a Deployment. When a Pod tries to reach my-service, it queries the CoreDNS Service IP.

The Loop Detection Mechanism

The CrashLoopBackOff state usually triggers when CoreDNS detects a forwarding loop. CoreDNS ships with a loop plugin that sends a randomly generated probe query to its upstream resolvers. If that probe ever arrives back at CoreDNS, the plugin concludes a forwarding loop exists and shuts the process down to prevent it from consuming all CPU resources on the node.

This typically happens when:

  1. CoreDNS is configured to forward unhandled queries to /etc/resolv.conf.
  2. The Node's /etc/resolv.conf points to a local resolver (like systemd-resolved listening on 127.0.0.53) or the node acts as its own DNS resolver.
  3. The local resolver forwards the query back to the cluster DNS, creating an infinite circle.
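The three-step cycle above can be sketched as a tiny bash model (resolver names are illustrative, not real hosts). Each hop forwards to the next; revisiting a hop means a forwarding loop, which is essentially what the loop plugin's probe query detects:

```shell
# Requires bash (associative arrays). A toy model of the forwarding chain.
declare -A forwards=(
  [coredns]="systemd-resolved"     # forward . /etc/resolv.conf -> 127.0.0.53
  [systemd-resolved]="cluster-dns" # node resolver hands cluster names back
  [cluster-dns]="coredns"          # kube-dns Service IP -> CoreDNS pods
)
current="coredns"
seen=""
while true; do
  case " $seen " in
    *" $current "*)
      echo "Loop detected at $current"   # we came back to where we started
      break
      ;;
  esac
  seen="$seen $current"
  current="${forwards[$current]}"
done
```

The real plugin does not walk a graph like this, of course; it simply observes its own probe returning. But the topology it is guarding against is exactly this cycle.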

The Latency (The 5-Second Delay)

If your logs show DNS lookups taking exactly 5 seconds, you are likely a victim of the glibc resolver and the ndots configuration.

Kubernetes defaults ndots to 5. This means that when you look up google.com, the resolver appends the local search domains first:

  1. google.com.default.svc.cluster.local (Fail)
  2. google.com.svc.cluster.local (Fail)
  3. google.com.cluster.local (Fail)
  4. ...and so on.

This results in amplified packet traffic. If the underlying UDP packets are dropped (often due to kernel race conditions in conntrack), the resolver waits 5 seconds before retrying.
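The amplification is easy to quantify. This sketch replays the glibc search logic for a typical pod resolv.conf (the search path below assumes a pod in the default namespace; a node-level search domain would add more entries):

```shell
# How many lookups glibc attempts for "google.com" with ndots:5 and the
# standard in-cluster search path.
name="google.com"
search="default.svc.cluster.local svc.cluster.local cluster.local"
dots=$(tr -cd '.' <<< "$name" | wc -c)   # count dots in the name
queries=0
if [ "$dots" -lt 5 ]; then               # fewer dots than ndots: search first
  for domain in $search; do
    echo "try: $name.$domain"
    queries=$((queries + 1))
  done
fi
echo "try: $name."                       # absolute name is tried last
queries=$((queries + 1))
echo "total lookups: $queries (x2 for A + AAAA = $((queries * 2)) packets)"
```

Eight UDP packets to resolve one external name is a lot of chances for a conntrack race to drop one, and a single drop costs the full 5-second timeout.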

Solution 1: Fixing the CrashLoopBackOff (Loop Errors)

If your CoreDNS pods are crashing, you need to change how CoreDNS handles upstream resolution. The default configuration forwards unhandled queries to the host's /etc/resolv.conf; the fix is to bypass the host's local stub resolver.

Step 1: Identify the Upstream Resolver

Check the /etc/resolv.conf on one of your worker nodes:

cat /etc/resolv.conf
# Output often looks like:
# nameserver 127.0.0.53

If it points to a loopback address, CoreDNS will likely loop. You need to explicitly point CoreDNS to a real upstream DNS (like your cloud provider's DNS or Google/Cloudflare) or ensure the kubelet is configured to use a different resolv file.
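A simple pre-flight check for this condition is to flag any resolv.conf whose nameserver is a loopback address. The sketch below works on a temp file so it is self-contained; on a real node, point it at the file kubelet actually reads (on systemd-resolved hosts, the true upstreams usually live in /run/systemd/resolve/resolv.conf, which kubelet's --resolv-conf flag can target):

```shell
# Self-contained sketch: detect a loopback stub resolver in a resolv.conf.
resolv=$(mktemp)
printf 'nameserver 127.0.0.53\nsearch example.internal\n' > "$resolv"

if grep -Eq '^nameserver 127\.' "$resolv"; then
  verdict="loop-risk"
  echo "WARNING: loopback resolver - CoreDNS may loop if it forwards here"
else
  verdict="ok"
  echo "OK: non-loopback upstream"
fi
rm -f "$resolv"
```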

Step 2: Patching the Corefile

We will modify the CoreDNS ConfigMap to explicitly forward non-cluster traffic to a stable upstream resolver, rather than relying on the potentially loopy /etc/resolv.conf.

Warning: This creates a dependency on external DNS. For air-gapped environments, point this to your corporate DNS IP.

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        
        # CHANGED: explicitly forward to Cloudflare/Google instead of /etc/resolv.conf
        # to avoid local loopback issues on the node.
        forward . 1.1.1.1 8.8.8.8 {
           max_concurrent 1000
        }
        
        cache 30
        loop
        reload
        loadbalance
    }

Apply the change:

kubectl apply -f coredns-config.yaml
kubectl rollout restart deployment coredns -n kube-system

Solution 2: Fixing Latency and 502 Errors (ndots)

If your CoreDNS is running but resolution is slow, the issue is likely the ndots configuration in your application workloads, not CoreDNS itself.

The dnsConfig Patch

For high-performance applications that primarily connect to external endpoints (e.g., an API gateway connecting to S3 or an external database), you should lower ndots. This prevents the resolver from iterating through internal Kubernetes domains for external URLs.

Add the dnsConfig block to your Application Deployment spec:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: high-performance-api
spec:
  template:
    spec:
      containers:
      - name: api
        image: my-registry/api:latest
      # OPTIMIZATION START
      dnsPolicy: "ClusterFirst"
      dnsConfig:
        options:
          - name: ndots
            value: "2" # Reduces internal search queries significantly
          - name: single-request-reopen
            value: "" # Forces a new socket for A and AAAA lookups (fixes race conditions)
      # OPTIMIZATION END

Setting ndots: 2 means only domains with fewer than 2 dots will be qualified with search domains. google.com (1 dot) will still search locally, but api.google.com (2 dots) will go straight to the upstream resolver, skipping the internal search loop.
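Inside the pod, the dnsConfig above is merged into the file the kubelet writes at /etc/resolv.conf. It ends up looking roughly like this (the nameserver IP and search path are illustrative; they depend on your cluster's service CIDR and the pod's namespace):

```text
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:2 single-request-reopen
```

Exec into a pod and cat this file to confirm the options actually landed; a typo in dnsConfig fails silently.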

Solution 3: The Architectural Fix (NodeLocal DNSCache)

For large clusters (50+ nodes), the centralized CoreDNS Deployment becomes both a bottleneck and a hotspot for conntrack table exhaustion. The definitive fix used in enterprise environments is NodeLocal DNSCache.

This deploys a DNS caching agent on every node as a DaemonSet. Pods query the agent on their own node, skipping the network hop to the central CoreDNS service.

Why this works

  1. TCP Upgrade: It can upgrade UDP queries to TCP automatically, preventing packet drops common with UDP.
  2. No Conntrack NAT: It listens on a link-local IP, bypassing the complex DNAT rules that often cause Linux kernel races.

Enabling NodeLocal DNSCache

While you can deploy the manifest manually, modern EKS/GKE versions have this as an add-on. If you are on vanilla Kubernetes (Kubeadm), enable it via the manifest provided in the Kubernetes repo.
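The upstream manifest ships with __PILLAR__ placeholders that you substitute before applying. The sketch below demonstrates the substitution step on a self-contained temp file; the service IP is an assumption for illustration (look up yours with kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}'):

```shell
# Sketch of the pillar-substitution step for the nodelocaldns manifest.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
forward . __PILLAR__CLUSTER__DNS__ {
    force_tcp
}
EOF

kubedns="10.96.0.10"   # assumed kube-dns ClusterIP -- replace with yours
sed -i "s/__PILLAR__CLUSTER__DNS__/$kubedns/g" "$manifest"

result=$(cat "$manifest")
echo "$result"
rm -f "$manifest"
```

On GKE/EKS, prefer the managed add-on over hand-rolled substitution; it keeps the pillar values in sync with cluster upgrades.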

Here is the critical configuration required in the node-local-dns ConfigMap to ensure it handles the "loop" correctly:

.:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 # Link-local IP
    forward . __PILLAR__CLUSTER__DNS__ { # Forwards to central CoreDNS
        force_tcp # CRITICAL: Forces TCP to upstream to reduce UDP packet drops
    }
    prometheus :9253
}

Deep Dive: The Linux Conntrack Race Condition

Why do we see random 5-second timeouts? It is often due to a kernel race condition in netfilter.

When a Pod sends A and AAAA (IPv4 and IPv6) DNS queries simultaneously over UDP via the same socket, the Linux kernel's Conntrack module must insert entries for the source network address translation (SNAT).

If the second packet arrives before the Conntrack entry for the first packet is confirmed, the kernel may drop the packet as "invalid state."

  1. UDP Packet 1 sent.
  2. UDP Packet 2 sent (microseconds later).
  3. Packet 2 dropped by Conntrack.
  4. Client waits 5 seconds (default DNS timeout).
  5. Retry successful.

This is why the single-request-reopen option in Solution 2 is so effective; it forces the glibc resolver to use separate sockets for A and AAAA lookups, bypassing the race condition.

Common Pitfalls to Avoid

1. Removing the loop plugin

You might see advice to simply remove loop from the Corefile. Do not do this. The plugin protects your cluster. If you remove it, you mask the symptom, but your DNS queries will still cycle infinitely until they time out, consuming massive amounts of CPU on your nodes.

2. Ignoring Alpine Linux

Alpine Linux uses musl libc, not glibc, and musl behaves differently regarding DNS: it queries nameservers in parallel and does not support the single-request-reopen option. If you are seeing DNS issues specifically in Alpine images, try switching to a Debian-slim base image to see whether the issue persists.
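The swap is usually a one-line Dockerfile change (image tags here are examples; pick versions matching your toolchain):

```text
# Before: musl-based, ignores single-request-reopen
FROM alpine:3.19

# After: glibc-based, honors resolv.conf options
FROM debian:bookworm-slim
```

Expect a larger image in exchange for predictable resolver behavior; for latency-sensitive services, that trade is usually worth it.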

Conclusion

DNS issues in Kubernetes are rarely about the DNS software itself; they are almost always about network topology, kernel packet processing, and configuration inheritance.

By explicitly defining your upstream resolvers to prevent loops and optimizing your application's dnsConfig to reduce ndots amplification, you eliminate the vast majority of latency and stability issues. For production clusters at scale, migrating to NodeLocal DNSCache is not just an optimization—it is a requirement for reliability.