Kubernetes networking confuses everyone at first. Pods, Services, Ingresses, CNIs — it’s a lot. Here’s how it actually works, layer by layer.

The Fundamental Model

Kubernetes networking has three simple rules:

  1. Every Pod gets its own IP address
  2. Pods can communicate with any other Pod without NAT
  3. Agents on a node (e.g., the kubelet) can communicate with all Pods on that node

That’s it. Everything else is an implementation detail.

Layer 1: Pod Networking

Each Pod gets an IP from the cluster’s Pod CIDR (e.g., 10.244.0.0/16). This isn’t magic — your CNI plugin handles it.
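The Pod CIDR is usually fixed at cluster creation time. As a sketch, with kubeadm it might look like this (the subnet values are illustrative and must not overlap your node or Service networks):

```yaml
# kubeadm ClusterConfiguration fragment (example values)
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: 10.244.0.0/16      # Pod CIDR carved up across nodes
  serviceSubnet: 10.96.0.0/12   # Service ClusterIP range
```

Most CNI plugins read their Pod CIDR from the node object or their own config, so keep the two in sync.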

How CNI Works

When a Pod is created:

  1. Kubelet calls the CNI plugin
  2. CNI creates a virtual ethernet pair (veth)
  3. One end goes in the Pod’s network namespace
  4. Other end connects to a bridge or overlay network
  5. CNI assigns an IP and configures routes
# See the veth pairs on a node
ip link show type veth

# Check the Pod's network namespace
kubectl exec -it my-pod -- ip addr

Common CNI Plugins

Calico: Uses BGP for routing. No overlay, pure L3. Great performance.

Flannel: Simple overlay with VXLAN. Easy to set up, slightly more overhead.

Cilium: eBPF-based. Advanced features like network policies, observability. Modern choice.

# Example Cilium installation
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set kubeProxyReplacement=strict \
  --set k8sServiceHost=${API_SERVER_IP} \
  --set k8sServicePort=6443

Layer 2: Service Discovery

Pods are ephemeral. Their IPs change. Services provide stable endpoints.

ClusterIP (Default)

Creates a virtual IP that load-balances to matching Pods:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080

This creates:

  • A ClusterIP (e.g., 10.96.45.12)
  • DNS entry (my-app.default.svc.cluster.local)
  • iptables/IPVS rules for load balancing
# Check the Service IP
kubectl get svc my-app

# See the iptables rules (kube-proxy in iptables mode)
iptables -t nat -L KUBE-SERVICES | grep my-app

# Or IPVS rules (kube-proxy in ipvs mode)
ipvsadm -Ln | grep -A3 10.96.45.12

How kube-proxy Works

kube-proxy watches Services and Endpoints, then programs:

iptables mode: NAT rules that DNAT traffic to Pod IPs. Simple but doesn’t scale well (O(n) rule matching).

IPVS mode: Uses Linux IPVS for L4 load balancing. Better performance at scale, supports more algorithms (round-robin, least connections, etc.).

eBPF mode (Cilium): Replaces kube-proxy entirely. Service load balancing runs as eBPF programs in the kernel, bypassing iptables for the fastest option.

NodePort

Exposes Service on every node’s IP at a static port:

spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080  # Available on all nodes

Traffic flow: NodeIP:30080 → kube-proxy → Pod

LoadBalancer

Cloud provider creates an external load balancer:

spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080

Cloud controller provisions an ELB/ALB/NLB that points to NodePorts.
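Provider-specific behavior is usually tuned with annotations on the Service. For example, on AWS you can ask for a Network Load Balancer (annotation keys vary by cloud and controller version, so treat this as a sketch):

```yaml
# Illustrative AWS example; other clouds use different annotation keys
apiVersion: v1
kind: Service
metadata:
  name: my-app
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
```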

Layer 3: Ingress

Services handle L4 (TCP/UDP). Ingress handles L7 (HTTP/HTTPS).

Basic Ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app
            port:
              number: 80

How It Works

  1. Ingress Controller (nginx, traefik, etc.) watches Ingress resources
  2. Generates config (nginx.conf) based on rules
  3. Routes traffic based on Host header and path
  4. Terminates TLS if configured

To terminate TLS, reference a Secret containing the certificate and key:

spec:
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls-secret

Gateway API (The Future)

Ingress is being superseded by Gateway API. More expressive, role-oriented:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-gateway
spec:
  gatewayClassName: nginx
  listeners:
  - name: http
    port: 80
    protocol: HTTP
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-route
spec:
  parentRefs:
  - name: my-gateway
  hostnames:
  - "app.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /api
    backendRefs:
    - name: api-service
      port: 80

Network Policies

By default, all Pods can talk to all other Pods. Network Policies restrict this:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - port: 5432

This says: api Pods can only receive traffic from frontend Pods on 8080, and can only send traffic to database Pods on 5432.

Important: Your CNI must support NetworkPolicy enforcement. Flannel on its own doesn’t; Calico and Cilium do.
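A common starting point is a default-deny policy per namespace, with explicit allow rules layered on top. A minimal sketch:

```yaml
# Deny all ingress to every Pod in the namespace; add allow policies on top
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}   # empty selector = all Pods in the namespace
  policyTypes:
  - Ingress         # no ingress rules listed, so nothing is allowed in
```

Policies are additive: once any policy selects a Pod, only traffic matched by some policy is allowed.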

DNS

CoreDNS provides cluster DNS. Every Pod gets a DNS config pointing to it:

# Inside a Pod
cat /etc/resolv.conf
# nameserver 10.96.0.10
# search default.svc.cluster.local svc.cluster.local cluster.local

DNS records created:

  • service-name.namespace.svc.cluster.local → ClusterIP
  • pod-ip-with-dashes.namespace.pod.cluster.local (e.g., 10-244-1-5.default.pod.cluster.local) → Pod IP
  • Headless Services → A records for each Pod
# Headless Service (no ClusterIP)
spec:
  clusterIP: None
  selector:
    app: my-stateful-app

Useful for StatefulSets where you need stable DNS per Pod.
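Behind a headless Service, each StatefulSet Pod gets a stable name of the form pod-name.service-name.namespace.svc.cluster.local. A quick sketch of the names three replicas of a hypothetical StatefulSet named web would get (namespace default assumed):

```shell
# Stable per-Pod DNS names behind the headless Service "my-stateful-app"
# ("web" and "default" are illustrative)
for i in 0 1 2; do
  echo "web-${i}.my-stateful-app.default.svc.cluster.local"
done
```

These names resolve to the individual Pod IPs, which is what lets peers in a clustered app (databases, brokers) address each other directly.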

Debugging Network Issues

Pod Can’t Reach Service

# Check Service exists and has endpoints
kubectl get svc my-service
kubectl get endpoints my-service

# No endpoints? Check selector matches Pod labels
kubectl get pods -l app=my-app

# Test DNS resolution
kubectl exec -it debug-pod -- nslookup my-service

# Test connectivity
kubectl exec -it debug-pod -- curl my-service:80

Pod Can’t Reach External

# Check Pod has internet access
kubectl exec -it my-pod -- curl -I https://google.com

# If not, check:
# 1. Network Policy blocking egress?
# 2. Node has internet access?
# 3. CNI configured for external traffic?

Service Timeouts

# Check if Pods are healthy
kubectl get pods -l app=my-app
kubectl describe pod <pod-name>

# Check readiness probes
kubectl get endpoints my-service -o yaml

# Check kube-proxy is running
kubectl get pods -n kube-system -l k8s-app=kube-proxy

Performance Tuning

Use IPVS Instead of iptables

# In kube-proxy ConfigMap
mode: ipvs
ipvs:
  scheduler: lc  # least connections

Consider Cilium with eBPF

Replaces kube-proxy, skips iptables entirely:

helm upgrade cilium cilium/cilium \
  --set kubeProxyReplacement=strict

Optimize CoreDNS

# Add caching, enable autopath
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        cache 30
        autopath @kubernetes
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        forward . /etc/resolv.conf
    }

Key Takeaways

  1. CNI handles Pod IPs — choose based on features needed (Calico for NetworkPolicy, Cilium for eBPF)
  2. Services provide stable endpoints — ClusterIP for internal, LoadBalancer for external
  3. Ingress is L7 — host/path routing, TLS termination
  4. Network Policies default allow — you must explicitly restrict
  5. DNS is automatic: service-name.namespace.svc.cluster.local

Understanding these layers makes debugging much easier. When something breaks, you can isolate which layer is the problem: Pod networking, Service routing, Ingress rules, or DNS resolution. 🌍