That 1.2GB Python image you’re pushing to production? It contains gcc, make, and half of Debian’s package repository. Your application needs none of it at runtime.
Container image optimization isn’t just about saving disk space—it’s about security (smaller attack surface), speed (faster pulls and deploys), and cost (less bandwidth and storage). Let’s fix it.
The Problem: Development vs Runtime
A typical Dockerfile grows organically:
```dockerfile
# The bloated approach
FROM python:3.11
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]
```
This image includes:
- Python interpreter ✓ (needed)
- Your application ✓ (needed)
- pip, setuptools, wheel (not needed at runtime)
- gcc, make, libc-dev (not needed at runtime)
- Bash, coreutils, apt (not needed at runtime)
- 400MB of Debian packages (definitely not needed)
Multi-Stage Builds: The Foundation
Separate build-time dependencies from runtime:
```dockerfile
# Stage 1: Build
FROM python:3.11 AS builder
WORKDIR /app

# Install build dependencies
RUN pip install --no-cache-dir pip-tools

# Create virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Runtime
FROM python:3.11-slim
WORKDIR /app

# Copy only the virtual environment
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Copy application code
COPY app/ ./app/

# Run as non-root
RUN useradd -r -s /bin/false appuser
USER appuser

CMD ["python", "-m", "app.main"]
```
Result: ~150MB instead of ~1.2GB.
Going Further: Distroless Images
Google’s distroless images contain only your application and its runtime dependencies—no shell, no package manager, no unnecessary utilities:
```dockerfile
# Stage 1: Build
FROM python:3.11 AS builder
WORKDIR /app
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Distroless runtime
FROM gcr.io/distroless/python3-debian12
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
COPY app/ ./app/
# The image ships its own Python 3.11, so point it at the venv's packages
ENV PYTHONPATH=/opt/venv/lib/python3.11/site-packages
# The image's entrypoint is the Python interpreter, so CMD is just the script path
CMD ["app/main.py"]
```
Result: ~75MB. No shell access for attackers.
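Because distroless has no shell, the `useradd` step from the multi-stage example cannot run in the final stage. One way to keep the non-root behavior is to use the `:nonroot` tag that the distroless project publishes, which runs as a built-in unprivileged user. The sketch below assumes that tag fits your setup and otherwise mirrors the Dockerfile above:

```dockerfile
# Stage 1: Build (unchanged from the example above)
FROM python:3.11 AS builder
WORKDIR /app
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Distroless runtime, non-root variant
# The :nonroot tag runs as the image's built-in "nonroot" user (UID 65532),
# so no useradd or USER instruction is needed.
FROM gcr.io/distroless/python3-debian12:nonroot
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
COPY app/ ./app/
ENV PYTHONPATH=/opt/venv/lib/python3.11/site-packages
CMD ["app/main.py"]
```

The copied files remain owned by root and are read-only to the application user, which is usually what you want. If the app writes to disk, mount a volume or a tmpfs for that path.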
For Compiled Languages: Even Smaller
Go and Rust can produce static binaries that need almost nothing:
```dockerfile
# Go example
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 yields a static binary; -s -w strips symbol and debug info
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /app/server

# Scratch = empty image
FROM scratch
# CA certificates are needed for outbound TLS connections
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]
```
Result: 8-15MB depending on your application.
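The same pattern carries over to Rust. The following is a minimal sketch, not a tuned build: `myapp` is a hypothetical binary name, the crate is assumed to use the standard `src/` layout with a committed `Cargo.lock`, and its dependencies are assumed not to link against system libraries such as OpenSSL.

```dockerfile
# Rust example (sketch): "myapp" is a placeholder for your crate's binary name
FROM rust:1-alpine AS builder
WORKDIR /app
# musl-dev provides the C runtime objects needed to link on Alpine
RUN apk add --no-cache musl-dev
COPY Cargo.toml Cargo.lock ./
COPY src ./src
# On Alpine the host target is musl, which Rust links statically by default
RUN cargo build --release --locked

# Scratch = empty image
FROM scratch
COPY --from=builder /app/target/release/myapp /myapp
ENTRYPOINT ["/myapp"]
```

If the binary makes outbound TLS calls, copy the CA certificate bundle into the final stage as in the Go example.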
Layer Optimization
Docker layers are cached independently. Order matters:
```dockerfile
# Bad: Cache invalidated on every code change
COPY . .
RUN pip install -r requirements.txt

# Good: Dependencies cached separately
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
```
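The good ordering still re-downloads every package whenever `requirements.txt` changes. If BuildKit is available (an assumption; it is the default builder in recent Docker releases), a cache mount can keep pip's download cache between builds without storing it in any image layer. A minimal sketch, reusing the `app.py` entry point from the first example:

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
# The cache mount persists pip's cache across builds but is not part of the image.
# --no-cache-dir is deliberately omitted here, since it would disable that cache.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]
```

The cache lives on the build host, so it helps on a developer machine or a persistent CI runner more than on ephemeral runners that start from scratch each time.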
Combine RUN commands to reduce layers:
```dockerfile
# Bad: 4 layers
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*

# Good: 1 layer, smaller
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
```
The .dockerignore File
Stop copying unnecessary files:
```
# .dockerignore
.git
.gitignore
# A bare pattern only matches at the context root; **/ matches at any depth
**/__pycache__
**/*.pyc
.pytest_cache
.mypy_cache
.coverage
htmlcov/
.env
.env.*
*.md
!README.md
Dockerfile*
docker-compose*
.dockerignore
tests/
docs/
*.log
.vscode/
.idea/
```
Security Scanning
Smaller images tend to have fewer vulnerabilities, simply because they contain fewer packages. Scan regardless:
```bash
# Using Trivy
trivy image myapp:latest

# Using Docker Scout
docker scout cves myapp:latest

# Using Grype
grype myapp:latest
```
Integrate into CI:
```yaml
# .github/workflows/security.yml
- name: Scan image
  # Pin to a release tag or commit SHA rather than master for reproducible runs
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: 'myapp:${{ github.sha }}'
    exit-code: '1'
    severity: 'CRITICAL,HIGH'
```
Complete Example: Python FastAPI
```dockerfile
# syntax=docker/dockerfile:1.4

# Stage 1: Dependencies
FROM python:3.11-slim AS deps
WORKDIR /app
RUN pip install --no-cache-dir pip-tools
COPY requirements.in .
RUN pip-compile requirements.in -o requirements.txt && \
    pip install --no-cache-dir -r requirements.txt --target=/deps

# Stage 2: Production
FROM gcr.io/distroless/python3-debian12
WORKDIR /app
ENV PYTHONPATH=/deps
COPY --from=deps /deps /deps
COPY app/ ./app/
EXPOSE 8000
# The distroless entrypoint is the Python interpreter, so launch uvicorn as a module
# (assumes uvicorn is listed in requirements.in)
CMD ["-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
Build and verify:
```bash
# Build
docker build -t myapp:optimized .

# Check size
docker images myapp:optimized
# REPOSITORY   TAG         SIZE
# myapp        optimized   47MB

# Compare to naive build
# myapp        naive       1.2GB
```
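Quick Wins Checklist
- Use multi-stage builds to keep build tools out of the runtime image
- Switch the runtime stage to a slim or distroless base
- Copy requirements.txt and install dependencies before copying application code
- Combine related RUN commands and clean up in the same layer
- Pass --no-cache-dir to pip and --no-install-recommends to apt-get
- Add a .dockerignore file
- Run as a non-root user
- Scan images in CI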
Conclusion
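A 1.2GB image that takes around 45 seconds to pull becomes a roughly 45MB image that pulls in a couple of seconds (exact times depend on your network and registry). Your CI runs faster, your deploys are quicker, and the image is about 96% smaller, which leaves far less for an attacker to work with.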
The best part? These techniques take minutes to implement but pay dividends on every single deployment.
Start with multi-stage builds. Add distroless when you’re comfortable. Your future self (and your cluster) will thank you.