Your pod is stuck in CrashLoopBackOff. Kubernetes keeps restarting it, each time waiting longer before trying again. Here’s how to diagnose and fix it.

What CrashLoopBackOff Actually Means

CrashLoopBackOff isn’t the error itself — it’s Kubernetes telling you “this container keeps crashing, so I’m backing off on restarts.”

The actual problem is that your container exits with a non-zero exit code. Kubernetes notices, restarts it, it crashes again, and the backoff timer increases: 10s, 20s, 40s, up to 5 minutes.
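
The doubling schedule can be sketched in plain shell (illustrative only; the real timing lives in the kubelet):

```shell
# Illustrative sketch of the kubelet's restart backoff: the delay doubles
# after each crash and is capped at 300s (5 minutes).
backoff=10
for restart in 1 2 3 4 5 6; do
  echo "restart $restart: wait ${backoff}s"
  backoff=$((backoff * 2))
  if [ "$backoff" -gt 300 ]; then backoff=300; fi
done
```

The kubelet resets this timer once a container runs without crashing for 10 minutes, so an app that crashes only occasionally starts each loop back at 10s.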

Step 1: Check Pod Status

Start with the basics:

kubectl get pods

NAME                     READY   STATUS             RESTARTS   AGE
my-app-5d4f7b8c9-x2k4j   0/1     CrashLoopBackOff   5          3m

The RESTARTS column tells you how many times it’s crashed. High numbers mean this has been going on for a while.

Step 2: Get Pod Details

kubectl describe pod my-app-5d4f7b8c9-x2k4j

Scroll to the Events section at the bottom:

Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  3m                 default-scheduler  Successfully assigned ...
  Normal   Pulled     2m (x4 over 3m)    kubelet            Container image pulled
  Warning  BackOff    30s (x6 over 2m)   kubelet            Back-off restarting failed container

Also check the Last State section for the exit code:

Last State:     Terminated
  Reason:       Error
  Exit Code:    1

Exit codes matter:

  • Exit Code 1: Generic error (application crashed)
  • Exit Code 137: OOMKilled (out of memory)
  • Exit Code 139: Segmentation fault
  • Exit Code 143: SIGTERM (graceful shutdown, usually not an error)
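
There's a pattern behind the larger codes: anything above 128 means the process was killed by a signal, and subtracting 128 gives the signal number:

```shell
# Exit codes above 128 encode 128 + signal number.
echo $((137 - 128))   # 9  -> SIGKILL (what the OOM killer sends)
echo $((139 - 128))   # 11 -> SIGSEGV (segmentation fault)
echo $((143 - 128))   # 15 -> SIGTERM (normal shutdown request)
```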

Step 3: Check the Logs

This is where you’ll usually find the answer:

kubectl logs my-app-5d4f7b8c9-x2k4j

If the container already crashed, get logs from the previous instance:

kubectl logs my-app-5d4f7b8c9-x2k4j --previous

Common Causes and Fixes

1. Application Error on Startup

Symptom: Logs show your application throwing an exception

Error: Cannot find module '/app/server.js'

Fix: Check your Dockerfile’s CMD or ENTRYPOINT. The path might be wrong, or files weren’t copied correctly.

# Wrong
CMD ["node", "server.js"]

# Right - specify full path
CMD ["node", "/app/server.js"]

2. Missing Environment Variables

Symptom: Logs show config errors or undefined variables

Error: DATABASE_URL is not defined

Fix: Add the missing environment variable to your deployment:

spec:
  containers:
  - name: my-app
    env:
    - name: DATABASE_URL
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: url
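
If the value comes from a Secret, also verify that the Secret and the key you reference actually exist; a minimal check, reusing the db-credentials name from the manifest above:

```shell
# On a live cluster, confirm the Secret holds the expected key
# (names assumed from the example manifest):
#   kubectl get secret db-credentials -o jsonpath='{.data.url}' | base64 -d
# Secret values are stored base64-encoded, so decoding looks like this:
echo 'cG9zdGdyZXM6Ly9kYg==' | base64 -d   # prints: postgres://db
```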

3. Database/Service Not Ready

Symptom: Connection refused errors on startup

Error: connect ECONNREFUSED 10.0.0.5:5432

Fix: Add an init container to wait for dependencies:

spec:
  initContainers:
  - name: wait-for-db
    image: busybox
    command: ['sh', '-c', 
      'until nc -z postgres-service 5432; do echo waiting for db; sleep 2; done']
  containers:
  - name: my-app
    # ...

4. Out of Memory (OOMKilled)

Symptom: Exit code 137, or describe pod shows OOMKilled

Fix: Increase memory limits:

spec:
  containers:
  - name: my-app
    resources:
      requests:
        memory: "256Mi"
      limits:
        memory: "512Mi"  # Increase this

5. Liveness Probe Failing

Symptom: Pod keeps restarting even though logs look fine

Kubernetes might be killing your pod because the liveness probe fails. Check your probe configuration:

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30  # Give app time to start
  periodSeconds: 10
  failureThreshold: 3

If your app takes time to initialize, increase initialDelaySeconds.
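
If startup time varies a lot, a startupProbe (Kubernetes 1.18+) is often the cleaner fix: liveness checks are held off until it succeeds, so a slow start can't trigger a restart. A sketch, reusing the /health endpoint from above:

```yaml
startupProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 10
  failureThreshold: 30   # allows up to 30 * 10s = 300s for startup
```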

6. Wrong Command or Entrypoint

Symptom: Container exits immediately with no logs

Your container might be running the wrong command and exiting instantly.

Debug by running the container interactively:

kubectl run debug --rm -it --image=your-image -- /bin/sh

Then manually run your start command to see what happens.
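
If you need to poke around inside the actual pod rather than a fresh one, you can temporarily override the command in the deployment so the container stays alive (container name from the earlier examples; remember to revert afterwards):

```yaml
spec:
  containers:
  - name: my-app
    # Temporary debug override: keeps the container running so you can
    # kubectl exec into it and run the real start command by hand.
    command: ["sh", "-c", "sleep 3600"]
```

Then `kubectl exec -it <pod-name> -- sh` and try the start command manually.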

Quick Debugging Checklist

# 1. What's the status?
kubectl get pods

# 2. What do events say?
kubectl describe pod <pod-name> | tail -20

# 3. What do logs say?
kubectl logs <pod-name> --previous

# 4. What's the exit code?
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'

# 5. Is it OOMKilled?
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'

The Fix That Works 90% of the Time

Most CrashLoopBackOff issues are one of these:

  1. Check the logs (kubectl logs --previous) — the error message tells you what’s wrong
  2. Verify environment variables — missing config is extremely common
  3. Increase memory limits — if exit code is 137
  4. Fix liveness probes — add proper initialDelaySeconds

The container logs almost always have the answer. Start there.