Containers

Docker Multi-Stage Builds: Smaller Images, Faster Deploys

Your Docker images are probably too big. Build tools, dev dependencies, source files: all shipping to production when only the compiled binary matters. Multi-stage builds fix this by separating build environment from runtime environment.

The Problem

A typical single-stage Dockerfile:

```dockerfile
FROM python:3.11

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

CMD ["python", "app.py"]
```

This image includes: ...

Kubernetes Troubleshooting Patterns for Production

Kubernetes hides complexity until something breaks. Then you need to know where to look. Here’s a systematic approach to debugging production issues.

The Debugging Hierarchy

Start broad, narrow down:

1. Cluster level: Nodes healthy? Resources available?
2. Namespace level: Deployments running? Services configured?
3. Pod level: Containers starting? Logs clean?
4. Container level: Process running? Resources sufficient?

Quick Health Check

```bash
# Node status
kubectl get nodes -o wide

# All pods across namespaces
kubectl get pods -A

# Pods not running
kubectl get pods -A | grep -v Running | grep -v Completed

# Events (recent issues)
kubectl get events -A --sort-by='.lastTimestamp' | tail -20
```

Pod Troubleshooting

Pod States

| State | Meaning | Check |
|-------|---------|-------|
| Pending | Can’t be scheduled | Resources, node selectors, taints |
| ContainerCreating | Image pulling or volume mounting | Image name, pull secrets, PVCs |
| CrashLoopBackOff | Container crashing repeatedly | Logs, resource limits, probes |
| ImagePullBackOff | Can’t pull image | Image name, registry auth |
| Error | Container exited with error | Logs |

Pending Pods

```bash
# Why is it pending?
kubectl describe pod my-pod

# Look for:
# - Insufficient cpu/memory
# - No nodes match nodeSelector
# - Taints not tolerated
# - PVC not bound

# Check node resources
kubectl describe nodes | grep -A5 "Allocated resources"

# Check PVC status
kubectl get pvc
```

CrashLoopBackOff

```bash
# Get logs from current container
kubectl logs my-pod

# Get logs from previous (crashed) container
kubectl logs my-pod --previous

# Get logs from specific container
kubectl logs my-pod -c my-container

# Follow logs
kubectl logs -f my-pod

# Last N lines
kubectl logs --tail=100 my-pod
```

Common causes: ...
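The "pods not running" filter in the quick health check matches on any text in the line, so it keeps the header row and would hide a pod whose name happens to contain "Running". As a rough sketch, the same idea can be applied to the STATUS column only. The function name and the sample output below are made up for illustration, and the parser assumes the default column layout of `kubectl get pods -A`.

```python
def unhealthy_pods(kubectl_output: str) -> list[tuple[str, str, str]]:
    """Return (namespace, name, status) for pods that are neither Running nor Completed.

    Assumes the default `kubectl get pods -A` columns:
    NAMESPACE NAME READY STATUS RESTARTS AGE
    """
    healthy = {"Running", "Completed"}
    rows = []
    lines = kubectl_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # not a pod row
        namespace, name, _ready, status = fields[:4]
        if status not in healthy:
            rows.append((namespace, name, status))
    return rows


# Hypothetical sample output, invented for this example.
SAMPLE = """\
NAMESPACE   NAME          READY   STATUS             RESTARTS   AGE
default     web-1         1/1     Running            0          2d
default     worker-1      0/1     CrashLoopBackOff   12         3h
batch       migrate-abc   0/1     Completed          0          1d
default     api-1         0/1     Pending            0          5m
"""

if __name__ == "__main__":
    for row in unhealthy_pods(SAMPLE):
        print(row)
```

This only looks at the STATUS column, so a pod that is Running but not ready (for example `0/1`) would still need a separate check.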

Docker Compose Patterns for Local Development

Docker Compose turns “works on my machine” into “works everywhere.” Here’s how to structure it for real development workflows.

Basic Structure

```yaml
# docker-compose.yml
services:
  app:
    build: .
    ports:
      - "3000:3000"
    volumes:
      - .:/app
    environment:
      - NODE_ENV=development

  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: devpass
```

Start everything: ...

Container Health Check Patterns That Actually Work

Your container says it’s healthy. Your users say the app is broken. Sound familiar? Basic health checks only tell you if a process is running. Here’s how to build checks that catch real problems.

Beyond “Is It Alive?”

Most health checks look like this:

```dockerfile
HEALTHCHECK CMD curl -f http://localhost:8080/health || exit 1
```

This tells you the HTTP server responds. It doesn’t tell you:

- Can the app reach the database?
- Is the cache connected?
- Are critical background workers running?
- Is the disk filling up?

Layered Health Checks

Implement three levels: ...
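To make the "beyond is it alive" questions concrete, here is a minimal sketch of a health report that runs several named dependency checks and is healthy only if all of them pass. It does not reproduce the three levels the post goes on to describe. The names `run_checks`, `database_reachable`, and `cache_connected` are hypothetical, and the two dependency checks are stubs standing in for real connection tests.

```python
import shutil
import time
from typing import Callable, Dict


def run_checks(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Run each named check and aggregate the results into one report."""
    results = {}
    for name, check in checks.items():
        started = time.monotonic()
        try:
            ok = bool(check())
            error = None
        except Exception as exc:  # a crashing check counts as a failure
            ok = False
            error = str(exc)
        elapsed_ms = round((time.monotonic() - started) * 1000, 1)
        results[name] = {"ok": ok, "error": error, "elapsed_ms": elapsed_ms}
    healthy = all(result["ok"] for result in results.values())
    return {"status": "healthy" if healthy else "unhealthy", "checks": results}


def disk_has_space(path: str = "/", min_free_ratio: float = 0.10) -> bool:
    """True if at least min_free_ratio of the filesystem at path is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_ratio


# Hypothetical stand-ins for real dependency checks.
def database_reachable() -> bool:
    return True


def cache_connected() -> bool:
    raise ConnectionError("cache unreachable")


if __name__ == "__main__":
    report = run_checks({
        "database": database_reachable,
        "cache": cache_connected,
        "disk": disk_has_space,
    })
    print(report["status"])
    for name, result in report["checks"].items():
        print(name, result)
```

A `/health` endpoint could return this report with a non-2xx status when `status` is `unhealthy`, so that `curl -f` in the `HEALTHCHECK` fails on a broken dependency and not only on a dead process.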

Docker Multi-Stage Builds: Smaller Images, Cleaner Deploys

Your Docker images are probably too big. Mine were. Then I learned about multi-stage builds. The problem is simple: build tools bloat production images. You need Node.js and npm to build your React app, but you only need nginx to serve it. You need Go and its toolchain to compile, but the binary runs standalone. Every megabyte of build tooling in your production image is wasted space, slower deploys, and expanded attack surface. ...

Docker Compose for Production: Beyond the Tutorial

Docker Compose tutorials show you `docker compose up`. Production requires health checks, resource limits, proper logging, restart policies, and deployment strategies. Here’s how to bridge that gap.

Base Configuration

```yaml
# docker-compose.yml
version: "3.8"

services:
  app:
    image: myapp:${VERSION:-latest}
    build:
      context: .
      dockerfile: Dockerfile
    restart: unless-stopped
    environment:
      - NODE_ENV=production
    env_file:
      - .env
    ports:
      - "3000:3000"
```

This is a starting point. Let’s make it production-ready. ...
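The image tag in the base configuration uses `${VERSION:-latest}`, which falls back to `latest` when `VERSION` is unset or empty. The sketch below imitates that behaviour for just two forms, `${VAR}` and `${VAR:-default}`. It is an illustration, not Compose's implementation: Compose supports further forms that this ignores, and the function name `interpolate` is made up.

```python
import re

# Matches ${NAME} and ${NAME:-default}; other Compose forms are not handled.
_PATTERN = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def interpolate(text: str, env: dict) -> str:
    """Substitute ${VAR} and ${VAR:-default} in text using values from env."""

    def replace(match: re.Match) -> str:
        value = env.get(match.group("name"))
        if value:  # set and non-empty: use it
            return value
        default = match.group("default")
        # Unset or empty: use the default if one was given, else empty string.
        return default if default is not None else ""

    return _PATTERN.sub(replace, text)


if __name__ == "__main__":
    print(interpolate("myapp:${VERSION:-latest}", {}))
    print(interpolate("myapp:${VERSION:-latest}", {"VERSION": "1.4.2"}))
```

Relying on the `latest` fallback in production means the deployed image depends on whatever was last pushed, so setting `VERSION` explicitly is usually the safer habit.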

Kubernetes Debugging: From Pod Failures to Cluster Issues

Kubernetes abstracts away infrastructure until something breaks. Then you need to peel back the layers. These debugging patterns will help you find problems fast.

First Steps: Get the Lay of the Land

```bash
# Cluster health
kubectl cluster-info
kubectl get nodes
kubectl top nodes

# Namespace overview
kubectl get all -n myapp

# Events (recent issues surface here)
kubectl get events -n myapp --sort-by='.lastTimestamp'
```

Pod Debugging

Check Pod Status

```bash
# List pods with status
kubectl get pods -n myapp

# Detailed pod info
kubectl describe pod mypod -n myapp

# Common status meanings:
# Pending - Waiting for scheduling or image pull
# Running - At least one container running
# Succeeded - All containers completed successfully
# Failed - All containers terminated, at least one failed
# CrashLoopBackOff - Container crashing repeatedly
# ImagePullBackOff - Can't pull container image
```

View Logs

```bash
# Current logs
kubectl logs mypod -n myapp

# Previous container (after crash)
kubectl logs mypod -n myapp --previous

# Follow logs
kubectl logs -f mypod -n myapp

# Specific container (multi-container pod)
kubectl logs mypod -n myapp -c mycontainer

# Last N lines
kubectl logs mypod -n myapp --tail=100

# Logs from the last hour
kubectl logs mypod -n myapp --since=1h
```

Execute Commands in Pod

```bash
# Shell into running container
kubectl exec -it mypod -n myapp -- /bin/sh

# Run specific command
kubectl exec mypod -n myapp -- cat /etc/config/app.yaml

# Specific container
kubectl exec -it mypod -n myapp -c mycontainer -- /bin/sh
```

Debug Crashed Containers

```bash
# Check why it crashed
kubectl describe pod mypod -n myapp | grep -A 10 "Last State"

# View previous logs
kubectl logs mypod -n myapp --previous

# Run debug container (K8s 1.25+)
kubectl debug mypod -n myapp -it --image=busybox --target=mycontainer
```

Common Pod Issues

ImagePullBackOff

```bash
# Check events for details
kubectl describe pod mypod -n myapp | grep -A 5 Events

# Common causes:
# - Wrong image name/tag
# - Private registry without imagePullSecrets
# - Registry rate limiting (Docker Hub)

# Verify image exists
docker pull myimage:tag

# Check imagePullSecrets
kubectl get pod mypod -n myapp -o jsonpath='{.spec.imagePullSecrets}'
```

CrashLoopBackOff

```bash
# Get exit code
kubectl describe pod mypod -n myapp | grep "Exit Code"

# Exit codes:
# 0 - Success (shouldn't be crashing)
# 1 - Application error
# 137 - SIGKILL (most often OOMKilled, out of memory)
# 139 - Segmentation fault
# 143 - SIGTERM (graceful shutdown)

# Check resource limits
kubectl describe pod mypod -n myapp | grep -A 5 Limits
```

Pending Pods

```bash
# Check why not scheduled
kubectl describe pod mypod -n myapp | grep -A 10 Events

# Common causes:
# - Insufficient resources
# - Node selector/affinity not matched
# - Taints without tolerations
# - PVC not bound

# Check node resources
kubectl describe nodes | grep -A 5 "Allocated resources"
```

Resource Issues

Memory Problems

```bash
# Check pod resource usage
kubectl top pod mypod -n myapp

# Check for OOMKilled
kubectl describe pod mypod -n myapp | grep OOMKilled

# View memory limits
kubectl get pod mypod -n myapp -o jsonpath='{.spec.containers[*].resources}'
```

CPU Throttling

```bash
# Check CPU usage vs limits
kubectl top pod mypod -n myapp

# In container, check throttling
# (cgroup v1 path; on cgroup v2 nodes the file is /sys/fs/cgroup/cpu.stat)
kubectl exec mypod -n myapp -- cat /sys/fs/cgroup/cpu/cpu.stat
```

Networking Debugging

Service Connectivity

```bash
# Check service exists
kubectl get svc -n myapp

# Check endpoints (are pods backing the service?)
kubectl get endpoints myservice -n myapp

# Test from within cluster
kubectl run debug --rm -it --image=busybox -- /bin/sh
# Then: wget -qO- http://myservice.myapp.svc.cluster.local

# DNS resolution
kubectl run debug --rm -it --image=busybox -- nslookup myservice.myapp.svc.cluster.local
```

Pod-to-Pod Communication

```bash
# Get pod IPs
kubectl get pods -n myapp -o wide

# Test connectivity from one pod to another
kubectl exec mypod1 -n myapp -- wget -qO- http://10.0.0.5:8080

# Check network policies
kubectl get networkpolicies -n myapp
```

Ingress Issues

```bash
# Check ingress configuration
kubectl describe ingress myingress -n myapp

# Check ingress controller logs
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx

# Verify backend service
kubectl get svc myservice -n myapp
```

ConfigMaps and Secrets

```bash
# Verify ConfigMap exists and has expected data
kubectl get configmap myconfig -n myapp -o yaml

# Check if mounted correctly
kubectl exec mypod -n myapp -- ls -la /etc/config/
kubectl exec mypod -n myapp -- cat /etc/config/app.yaml

# Verify Secret
kubectl get secret mysecret -n myapp -o jsonpath='{.data.password}' | base64 -d

# Check environment variables
kubectl exec mypod -n myapp -- env | grep MY_VAR
```

Persistent Volumes

```bash
# Check PVC status
kubectl get pvc -n myapp

# Describe for binding issues
kubectl describe pvc mypvc -n myapp

# Check PV
kubectl get pv

# Verify mount in pod
kubectl exec mypod -n myapp -- df -h
kubectl exec mypod -n myapp -- ls -la /data
```

Node Issues

```bash
# Node status
kubectl get nodes
kubectl describe node mynode

# Check conditions
kubectl get nodes -o custom-columns=NAME:.metadata.name,CONDITIONS:.status.conditions[*].type

# Node resource pressure
kubectl describe node mynode | grep -A 5 Conditions

# Pods on specific node
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=mynode

# Drain node for maintenance
kubectl drain mynode --ignore-daemonsets --delete-emptydir-data
```

Control Plane Debugging

```bash
# API server health
kubectl get --raw='/healthz'

# Component status (deprecated but useful)
kubectl get componentstatuses

# Check system pods
kubectl get pods -n kube-system

# API server logs (if accessible)
kubectl logs -n kube-system kube-apiserver-master

# etcd health
kubectl exec -n kube-system etcd-master -- etcdctl endpoint health
```

Useful Debug Containers

```bash
# Network debugging
kubectl run netdebug --rm -it --image=nicolaka/netshoot -- /bin/bash

# DNS debugging
kubectl run dnsdebug --rm -it --image=tutum/dnsutils -- /bin/bash

# General debugging
kubectl run debug --rm -it --image=busybox -- /bin/sh
```

Systematic Debugging Checklist

1. Events first: `kubectl get events --sort-by='.lastTimestamp'`
2. Describe the resource: `kubectl describe <resource> <name>`
3. Check logs: `kubectl logs <pod>` (and `--previous`)
4. Verify dependencies: ConfigMaps, Secrets, Services, PVCs
5. Check resources: CPU, memory limits and usage
6. Test connectivity: DNS, service endpoints, network policies
7. Compare with working: Diff against known good configuration

Quick Reference

```bash
# Pod not starting
kubectl describe pod <name>
kubectl get events

# Pod crashing
kubectl logs <pod> --previous
kubectl describe pod <name> | grep "Exit Code"

# Can't connect to service
kubectl get endpoints <service>
kubectl run debug --rm -it --image=busybox -- wget -qO- http://<service>

# Resource issues
kubectl top pods
kubectl describe node | grep -A 5 "Allocated"

# Config issues
kubectl exec <pod> -- env
kubectl exec <pod> -- cat /path/to/config
```

Kubernetes debugging is methodical. Start with events, drill into describe output, check logs, and verify each dependency. Most issues are configuration mismatches: wrong image tags, missing secrets, insufficient resources. ...
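The exit codes listed under CrashLoopBackOff follow a common convention: a value above 128 usually means the process was terminated by signal number `code - 128`. A small sketch of that arithmetic, assuming a POSIX system where Python's `signal` module knows the standard signal numbers; `describe_exit_code` is a made-up helper name.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Turn a container exit code into a short human-readable description."""
    if code == 0:
        return "exited successfully"
    if code > 128:
        signum = code - 128  # shell convention: 128 + signal number
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"  # number unknown on this platform
        return f"terminated by {name}"
    return f"application error (exit status {code})"


if __name__ == "__main__":
    for code in (0, 1, 137, 139, 143):
        print(code, "->", describe_exit_code(code))
```

Code 137 therefore only says the process received SIGKILL. Whether the cause was the OOM killer still has to be confirmed from the `OOMKilled` reason in the `kubectl describe pod` output.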

Docker Compose Patterns for Production-Ready Services

Docker Compose bridges the gap between single-container development and full orchestration. These patterns will help you build maintainable, production-ready configurations.

Project Structure

```text
myproject/
├── docker-compose.yml            # Base configuration
├── docker-compose.override.yml   # Dev overrides (auto-loaded)
├── docker-compose.prod.yml       # Production overrides
├── docker-compose.test.yml       # Test configuration
├── .env                          # Environment variables
├── .env.example                  # Template (committed)
└── services/
    ├── app/
    │   └── Dockerfile
    ├── worker/
    │   └── Dockerfile
    └── nginx/
        ├── Dockerfile
        └── nginx.conf
```

Base Configuration

```yaml
# docker-compose.yml
version: "3.8"

services:
  app:
    build:
      context: ./services/app
      dockerfile: Dockerfile
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_URL=${REDIS_URL}
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    networks:
      - backend
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER}"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend

  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    networks:
      - backend

networks:
  backend:
    driver: bridge

volumes:
  postgres_data:
  redis_data:
```

Development Overrides

```yaml
# docker-compose.override.yml (auto-loaded with docker-compose up)
version: "3.8"

services:
  app:
    build:
      target: development
    volumes:
      - ./src:/app/src:cached
      - /app/node_modules
    ports:
      - "3000:3000"
      - "9229:9229"  # Debugger
    environment:
      - DEBUG=true
      - LOG_LEVEL=debug
    command: npm run dev

  db:
    ports:
      - "5432:5432"

  redis:
    ports:
      - "6379:6379"
```

Production Configuration

```yaml
# docker-compose.prod.yml
version: "3.8"

services:
  app:
    build:
      target: production
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '1'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
    environment:
      - NODE_ENV=production
      - LOG_LEVEL=info
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./services/nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./certs:/etc/nginx/certs:ro
    depends_on:
      - app
    networks:
      - backend
      - frontend

networks:
  frontend:
    driver: bridge
```

Run with: ...

Docker Compose Patterns: From Development to Production

Docker Compose started as a development tool but has grown into something usable for small production deployments. Here are patterns that make it work well in both contexts.

Environment-Specific Overrides

Base configuration with environment-specific overrides:

```yaml
# docker-compose.yml (base)
services:
  app:
    image: myapp:${VERSION:-latest}
    environment:
      - DATABASE_URL
    depends_on:
      - db

  db:
    image: postgres:15
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
```

```yaml
# docker-compose.override.yml (auto-loaded in dev)
services:
  app:
    build: .
    volumes:
      - .:/app
      - /app/node_modules
    environment:
      - DEBUG=true
    ports:
      - "3000:3000"

  db:
    ports:
      - "5432:5432"
```

```yaml
# docker-compose.prod.yml
services:
  app:
    restart: always
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

  db:
    restart: always
```

Usage: ...
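As a mental model for how a base file and an override file combine, the sketch below merges two already-parsed configurations: mappings are merged key by key, lists are concatenated, and single values are replaced by the override. This is a deliberate simplification and not Compose's real algorithm, whose rules are field-specific (for example, `environment` entries are merged by variable name). The function name `merge` and the trimmed-down dictionaries are illustrative.

```python
from copy import deepcopy


def merge(base: dict, override: dict) -> dict:
    """Simplified override merge: dicts recurse, lists append, scalars replace."""
    merged = deepcopy(base)
    for key, value in override.items():
        current = merged.get(key)
        if isinstance(value, dict) and isinstance(current, dict):
            merged[key] = merge(current, value)
        elif isinstance(value, list) and isinstance(current, list):
            # Append override items that are not already present.
            merged[key] = current + [item for item in value if item not in current]
        else:
            merged[key] = deepcopy(value)
    return merged


# Trimmed-down versions of the base and production files, as Python dicts.
BASE = {
    "services": {
        "app": {"image": "myapp:latest", "depends_on": ["db"]},
        "db": {"image": "postgres:15"},
    }
}
PROD = {
    "services": {
        "app": {"restart": "always", "deploy": {"replicas": 2}},
        "db": {"restart": "always"},
    }
}

if __name__ == "__main__":
    combined = merge(BASE, PROD)
    for name, service in combined["services"].items():
        print(name, service)
```

The merged `app` service keeps its `image` and `depends_on` from the base file and gains `restart` and `deploy` from the production file, which is the effect the override files are designed to produce.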

Container Orchestration Patterns: Beyond 'Just Deploy It'

Running one container is easy. Running hundreds in production, reliably, at scale? That’s where patterns emerge. These aren’t Kubernetes-specific (though that’s where you’ll see them most). They’re fundamental approaches to composing containers into systems that actually work.

The Sidecar Pattern

A sidecar is a helper container that runs alongside your main application container, sharing the same pod/network namespace.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
    # Main application
    - name: app
      image: myapp:1.0
      ports:
        - containerPort: 8080

    # Sidecar: log shipper
    - name: log-shipper
      image: fluentd:latest
      volumeMounts:
        - name: logs
          mountPath: /var/log/app

  volumes:
    - name: logs
      emptyDir: {}
```

Common sidecar use cases: ...