Kubernetes networking confuses everyone at first. Pods, Services, Ingresses, CNIs — it’s a lot. Here’s how it actually works, layer by layer.
The Fundamental Model
Kubernetes networking has three simple rules:
- Every Pod gets its own IP address
- Pods can communicate with any other Pod without NAT
- Agents on a node can communicate with all Pods on that node
That’s it. Everything else is implementation detail.
Layer 1: Pod Networking
Each Pod gets an IP from the cluster’s Pod CIDR (e.g., 10.244.0.0/16). This isn’t magic — your CNI plugin handles it.
How CNI Works
When a Pod is created:
- Kubelet calls the CNI plugin
- CNI creates a virtual ethernet pair (veth)
- One end goes in the Pod’s network namespace
- Other end connects to a bridge or overlay network
- CNI assigns an IP and configures routes
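To make that concrete, here is roughly what a minimal CNI config for the reference bridge plugin looks like (the network name, bridge name, and subnet are illustrative):

```json
{
  "cniVersion": "1.0.0",
  "name": "podnet",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.244.1.0/24"
  }
}
```

Kubelet reads configs like this from /etc/cni/net.d/ and invokes the named plugin binary for each Pod sandbox.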
Common CNI Plugins
Calico: Uses BGP for routing. No overlay, pure L3. Great performance.
Flannel: Simple overlay with VXLAN. Easy to set up, slightly more overhead.
Cilium: eBPF-based. Advanced features like network policies, observability. Modern choice.
Layer 2: Service Discovery
Pods are ephemeral. Their IPs change. Services provide stable endpoints.
ClusterIP (Default)
Creates a virtual IP that load-balances to matching Pods:
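A minimal Service manifest, assuming a Deployment whose Pods carry the label app: my-app (names and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app        # load-balance across Pods with this label
  ports:
    - port: 80         # the Service's virtual port
      targetPort: 8080 # the container port behind it
```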
This creates:
- A ClusterIP (e.g., 10.96.45.12)
- A DNS entry (my-app.default.svc.cluster.local)
- iptables/IPVS rules for load balancing
How kube-proxy Works
kube-proxy watches Services and Endpoints, then programs:
iptables mode: NAT rules that DNAT traffic to Pod IPs. Simple but doesn’t scale well (O(n) rule matching).
IPVS mode: Uses Linux IPVS for L4 load balancing. Better performance at scale, supports more algorithms (round-robin, least connections, etc.).
eBPF mode (Cilium): Replaces kube-proxy entirely. Fastest option, direct Pod-to-Pod without NAT.
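In iptables mode you can inspect the programmed rules on any node. The chain names below follow kube-proxy's naming scheme; the hashes are per-cluster, and the Service name is a placeholder:

```shell
# On a node, dump the NAT rules kube-proxy programmed:
sudo iptables -t nat -L KUBE-SERVICES -n | grep my-app
# ClusterIP traffic jumps to a per-Service chain (KUBE-SVC-<hash>),
# which picks a per-endpoint chain (KUBE-SEP-<hash>) via
# random-probability matching, and that chain DNATs to one PodIP:port.
```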
NodePort
Exposes Service on every node’s IP at a static port:
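The same Service as a NodePort; 30080 must fall within the node-port range (30000-32767 by default), and the selector is illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080   # omit to let Kubernetes pick one
```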
Traffic flow: NodeIP:30080 → kube-proxy → Pod
LoadBalancer
Cloud provider creates an external load balancer:
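Only the type changes relative to a ClusterIP Service (selector and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer   # cloud controller provisions the external LB
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```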
Cloud controller provisions an ELB/ALB/NLB that points to NodePorts.
Layer 3: Ingress
Services handle L4 (TCP/UDP). Ingress handles L7 (HTTP/HTTPS).
Basic Ingress
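A sketch with host routing and TLS termination, assuming a backing Service named my-app and a pre-created TLS Secret:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  ingressClassName: nginx      # must match an installed controller
  tls:
    - hosts: [my-app.example.com]
      secretName: my-app-tls   # assumed to already exist
  rules:
    - host: my-app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80
```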
How It Works
- Ingress Controller (nginx, traefik, etc.) watches Ingress resources
- Generates config (nginx.conf) based on rules
- Routes traffic based on Host header and path
- Terminates TLS if configured
Gateway API (The Future)
Ingress is being superseded by Gateway API. More expressive, role-oriented:
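The equivalent route in Gateway API terms, assuming a cluster operator has already created a Gateway named my-gateway (all names here are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-app
spec:
  parentRefs:
    - name: my-gateway    # owned by the infrastructure team
  hostnames: ["my-app.example.com"]
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: my-app
          port: 80
```

The role split is the point: infrastructure teams own Gateways, app teams own HTTPRoutes.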
Network Policies
By default, all Pods can talk to all other Pods. Network Policies restrict this:
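A policy along these lines (the app: api, app: frontend, and app: database labels are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-policy
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: database
      ports:
        - port: 5432
```

One gotcha: once Egress is restricted, DNS lookups are blocked too unless you also allow UDP 53 to CoreDNS.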
This says: api Pods can only receive traffic from frontend Pods on 8080, and can only send traffic to database Pods on 5432.
Important: Your CNI must support NetworkPolicy enforcement. Flannel alone doesn't; Calico and Cilium do.
DNS
CoreDNS provides cluster DNS. Every Pod gets a DNS config pointing to it:
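Inside a Pod, /etc/resolv.conf typically looks like this (10.96.0.10 is the conventional kube-dns Service IP; yours may differ):

```
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
```

The ndots:5 setting plus the search path is why a short name like my-app resolves at all.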
DNS records created:
- service-name.namespace.svc.cluster.local → ClusterIP
- pod-ip.namespace.pod.cluster.local → Pod IP (dots in the IP become dashes)
- Headless Services → A records for each Pod
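A headless Service is just clusterIP: None (name, label, and port are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  clusterIP: None    # headless: DNS returns Pod IPs directly
  selector:
    app: db
  ports:
    - port: 5432
```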
Useful for StatefulSets where you need stable DNS per Pod.
Debugging Network Issues
Pod Can’t Reach Service
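A rough triage sequence (Service and Pod names are placeholders):

```shell
# Does the Service name resolve from inside a Pod?
kubectl exec -it <pod> -- nslookup my-app.default.svc.cluster.local

# Does the Service have endpoints? Empty output usually means a
# selector mismatch or failing readiness probes.
kubectl get endpoints my-app

# Can you reach a backing Pod directly? If yes, blame Service routing.
kubectl exec -it <pod> -- curl -v http://<pod-ip>:8080
```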
Pod Can’t Reach External
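Separate DNS failures from connectivity failures first:

```shell
# Connectivity without DNS: hit a public IP directly
# (note: some networks block ICMP, so curl to an IP also works)
kubectl exec -it <pod> -- ping -c 3 1.1.1.1

# DNS specifically
kubectl exec -it <pod> -- nslookup example.com

# Any egress-blocking NetworkPolicies in the namespace?
kubectl get networkpolicy

# CoreDNS errors?
kubectl -n kube-system logs -l k8s-app=kube-dns
```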
Service Timeouts
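Intermittent timeouts usually trace back to endpoints churning or kube-proxy falling behind:

```shell
# Are endpoints present and stable?
kubectl describe endpoints my-app

# Are the backing Pods Ready? Failing probes remove them from rotation.
kubectl get pods -l app=my-app

# Any sync errors from kube-proxy?
kubectl -n kube-system logs -l k8s-app=kube-proxy
```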
Performance Tuning
Use IPVS Instead of iptables
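Switch via kube-proxy's configuration; nodes need the IPVS kernel modules (and ideally ipvsadm for debugging) installed:

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
ipvs:
  scheduler: rr   # or "lc" (least connections), "sh" (source hashing)
```

After editing the kube-proxy ConfigMap, restart the kube-proxy DaemonSet Pods to pick up the change.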
Consider Cilium with eBPF
Replaces kube-proxy, skips iptables entirely:
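An install sketch via Helm. The exact flag spelling varies by Cilium version (older releases use kubeProxyReplacement=strict), and the API server host/port are placeholders for your cluster:

```shell
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=<api-server-host> \
  --set k8sServicePort=6443
```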
Optimize CoreDNS
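Caching in the Corefile is the cheapest win. This is a typical default-style config with the cache TTL raised (values are illustrative):

```
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    cache 60
    forward . /etc/resolv.conf
    loop
    reload
}
```

For DNS-heavy workloads, NodeLocal DNSCache runs a caching agent on every node and cuts conntrack pressure on CoreDNS.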
Key Takeaways
- CNI handles Pod IPs — choose based on features needed (Calico for NetworkPolicy, Cilium for eBPF)
- Services provide stable endpoints — ClusterIP for internal, LoadBalancer for external
- Ingress is L7 — host/path routing, TLS termination
- Network Policies default allow — you must explicitly restrict
- DNS is automatic — service.namespace.svc.cluster.local
Understanding these layers makes debugging much easier. When something breaks, you can isolate which layer is the problem: Pod networking, Service routing, Ingress rules, or DNS resolution. 🌍