Every developer has done it. Committed an API key to git, pushed to GitHub, and watched in horror as the secret scanner flagged it within minutes. If you’re lucky, the service revokes the key automatically. If you’re not, someone’s crypto-mining on your AWS account.

Secrets management isn’t glamorous, but getting it wrong is expensive.

The Problem Space

Secrets include:

  • API keys and tokens
  • Database credentials
  • Encryption keys
  • TLS certificates
  • OAuth client secrets
  • SSH keys
  • Signing keys

These all share properties: they’re sensitive, they need rotation, and they need to reach your application somehow without being exposed.

Level 0: Don’t Do This

# app.py
DATABASE_URL = "postgres://admin:SuperSecret123@db.example.com/prod"
API_KEY = "sk_live_abc123xyz"

Hardcoded secrets in source code. Committed to git. Available to everyone with repo access forever (git history persists).

This is where everyone starts. The goal is to not stay here.

Level 1: Environment Variables

# .env (gitignored)
DATABASE_URL=postgres://admin:SuperSecret123@db.example.com/prod
API_KEY=sk_live_abc123xyz
import os
DATABASE_URL = os.environ["DATABASE_URL"]
API_KEY = os.environ["API_KEY"]

Better. Secrets aren’t in git. But:

  • .env files get copy-pasted between developers
  • No audit trail of who accessed what
  • Rotation means updating every deployment
  • Secrets still exist as plaintext somewhere

For personal projects and early startups, this is often good enough. Know its limitations.
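One cheap improvement at this level is failing fast at startup when a variable is missing, instead of crashing with a `KeyError` somewhere mid-request. A minimal sketch (the variable names are illustrative):

```python
import os

REQUIRED_VARS = ["DATABASE_URL", "API_KEY"]  # illustrative names

def load_settings() -> dict:
    """Read required environment variables, failing fast if any are missing."""
    missing = [name for name in REQUIRED_VARS if name not in os.environ]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    return {name: os.environ[name] for name in REQUIRED_VARS}
```

A single clear error at boot beats a stack trace in production an hour later.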

Level 2: Encrypted Secrets Files

# Create encrypted secrets
sops -e secrets.yaml > secrets.enc.yaml

# secrets.enc.yaml can be committed to git safely
# Decrypt at runtime
sops -d secrets.enc.yaml > secrets.yaml

Tools like SOPS, git-crypt, or age encrypt secrets before commit.

Pros:

  • Secrets live in git (versioned, auditable)
  • Only people with the decryption key can read them
  • Works offline

Cons:

  • Key management for the encryption key itself
  • Rotation still requires commits
  • Everyone with the key sees all secrets

Level 3: Secrets Manager Services

AWS Secrets Manager, HashiCorp Vault, Google Secret Manager, Azure Key Vault—these are purpose-built secret stores.

import boto3

def get_secret(secret_name: str) -> str:
    client = boto3.client('secretsmanager')
    response = client.get_secret_value(SecretId=secret_name)
    return response['SecretString']

DATABASE_URL = get_secret("prod/database/url")

Pros:

  • Centralized management
  • Fine-grained access control (IAM)
  • Audit logs
  • Automatic rotation (for supported services)
  • No secrets in git or environment

Cons:

  • Network dependency at startup
  • Cost (small, but non-zero)
  • Vendor lock-in
  • Complexity

Practical Patterns

The Startup Bootstrap Problem

Accessing your secrets manager requires credentials. Those credentials are… secrets. It's turtles all the way down.

Solutions:

  • IAM roles (AWS): Instance/container gets permissions from its identity, no credentials needed
  • Workload identity (GCP/K8s): Similar concept for Kubernetes
  • Machine identity (Vault): Bootstrap with a one-time token
# AWS Lambda or ECS with IAM role - no credentials in code
import boto3

# This "just works" because the execution environment has IAM permissions
client = boto3.client('secretsmanager')

Caching Secrets

Don’t fetch from the secrets manager on every request:

from datetime import datetime, timedelta

class SecretCache:
    def __init__(self, client, ttl_seconds=300):
        self.client = client
        self.ttl = timedelta(seconds=ttl_seconds)
        self.cache = {}
    
    def get(self, secret_name: str) -> str:
        now = datetime.now()
        
        if secret_name in self.cache:
            value, fetched_at = self.cache[secret_name]
            if now - fetched_at < self.ttl:
                return value
        
        value = self.client.get_secret_value(SecretId=secret_name)['SecretString']
        self.cache[secret_name] = (value, now)
        return value

Balance: Short TTL means more API calls but faster rotation pickup. Long TTL means fewer calls but stale secrets during rotation.
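One way to soften that trade-off is to invalidate the cache on an authentication failure and retry once, so a rotation is picked up immediately rather than after the TTL expires. A sketch of the pattern (the callables and the `AuthError` exception here are stand-ins, not a real AWS API):

```python
class AuthError(Exception):
    """Stand-in for whatever auth failure your client raises."""

def connect_with_refresh(get_secret, invalidate, secret_name, connect):
    """Try the cached secret; on auth failure, invalidate it and retry once.

    get_secret(name) returns a possibly-cached value; invalidate(name)
    evicts it so the next get_secret call fetches fresh.
    """
    try:
        return connect(get_secret(secret_name))
    except AuthError:
        invalidate(secret_name)  # cached value is stale after rotation
        return connect(get_secret(secret_name))
```

With the `SecretCache` above, `invalidate` would be something like `lambda name: cache.cache.pop(name, None)`.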

Rotation Without Downtime

The naive approach breaks:

  1. Rotate database password
  2. Old password stops working immediately
  3. Running applications fail until redeployed

The correct approach:

  1. Add new password (both old and new work)
  2. Deploy applications with new password
  3. Remove old password

For databases, this means:

-- Step 1: Add new credentials (a second user, so both sets keep working)
CREATE USER app_user_v2 WITH PASSWORD 'new_password';
GRANT app_role TO app_user_v2;  -- same privileges; role name is illustrative
-- Step 2: Wait for deploys...
-- Step 3: Revoke old access
DROP USER app_user;

AWS Secrets Manager has built-in rotation Lambdas for RDS that handle this.

Configuration vs Secrets

Not everything sensitive is a secret. Consider:

# config.yaml (not secret, but environment-specific)
database:
  host: db.prod.internal
  port: 5432
  name: myapp
  pool_size: 20

# secrets (actual secrets)
database:
  username: # from secrets manager
  password: # from secrets manager

Mixing config and secrets in the same place causes:

  • Overly restrictive access to non-sensitive config
  • Treating non-secrets as secrets (unnecessary complexity)
  • Bloated secret stores

Keep them separate.
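Keeping them separate usually means a small composition step at startup: non-sensitive config from a file, sensitive values from the secret store, joined just before use. A sketch (`fetch_secret` is a stand-in for your secrets manager client, and the config is inlined as JSON to stay self-contained):

```python
import json

# Non-sensitive, environment-specific config (would normally live in a file)
CONFIG = json.loads("""
{"database": {"host": "db.prod.internal", "port": 5432,
              "name": "myapp", "pool_size": 20}}
""")

def fetch_secret(name: str) -> str:
    """Stand-in for a real secrets manager call."""
    return {"prod/db/username": "app_user", "prod/db/password": "hunter2"}[name]

def database_params() -> dict:
    """Merge file-based config with secrets fetched at startup."""
    params = dict(CONFIG["database"])
    params["username"] = fetch_secret("prod/db/username")
    params["password"] = fetch_secret("prod/db/password")
    return params
```

The config file stays reviewable in git; only the two sensitive fields ever touch the secret store.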

The Twelve-Factor Approach

The Twelve-Factor App methodology says: store config in environment variables.

This is good advice with caveats:

  • Environment variables are visible to the process and its children
  • They appear in /proc/<pid>/environ on Linux
  • Crash dumps and debug logs might capture them

For truly sensitive secrets, consider:

  • Fetching directly from a secrets manager
  • Using short-lived credentials
  • Memory-only storage with secure cleanup
import secrets as crypto_secrets

class SecureString:
    """A string that tries to clean up after itself."""
    
    def __init__(self, value: str):
        self._value = value
    
    def get(self) -> str:
        return self._value
    
    def __del__(self):
        # Best effort only: Python strings are immutable, so this rebinds
        # the attribute rather than scrubbing the original bytes in memory
        if hasattr(self, '_value'):
            self._value = crypto_secrets.token_hex(len(self._value))
    
    def __str__(self):
        return "[REDACTED]"
    
    def __repr__(self):
        return "SecureString([REDACTED])"

CI/CD Secrets

Your pipeline needs secrets too. Options:

GitHub Actions:

jobs:
  deploy:
    steps:
      - name: Deploy
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: ./deploy.sh

Better—OIDC federation:

jobs:
  deploy:
    permissions:
      id-token: write
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/GitHubActions
          aws-region: us-east-1

No long-lived credentials. GitHub proves its identity, AWS grants temporary access.

What to Use When

Personal projects / early startup:

  • .env files (gitignored)
  • Maybe SOPS for shared secrets

Growing team:

  • Cloud provider’s secret manager (AWS/GCP/Azure)
  • IAM-based access control
  • Basic rotation

Enterprise / compliance requirements:

  • HashiCorp Vault or enterprise secret manager
  • Fine-grained policies
  • Audit logging
  • Automated rotation
  • HSM backing for crypto keys

Common Mistakes

Logging Secrets

# DON'T
logger.info(f"Connecting with credentials: {username}:{password}")

# DO
logger.info(f"Connecting as user: {username}")

Sanitize logs. Secrets in logs end up in log aggregators, which end up in search indexes, which end up in breaches.
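Beyond discipline at each call site, a logging filter can defensively redact known secret values before records reach any handler. A minimal sketch using only the standard library:

```python
import logging

class RedactSecretsFilter(logging.Filter):
    """Replace known secret values with [REDACTED] in every log record."""

    def __init__(self, secrets):
        super().__init__()
        self.secrets = [s for s in secrets if s]  # skip empty values

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        for secret in self.secrets:
            message = message.replace(secret, "[REDACTED]")
        record.msg = message
        record.args = None  # message is already fully formatted
        return True
```

Attach it once at setup, e.g. `logger.addFilter(RedactSecretsFilter([api_key, db_password]))`. It's a safety net, not a substitute for not logging secrets in the first place.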

Secrets in Docker Images

# DON'T
ENV API_KEY=sk_live_abc123

# DO
# Pass at runtime: docker run -e API_KEY=$API_KEY myimage

Docker image layers are permanent. Even if you delete the ENV in a later layer, it’s still in the history.

Secrets in Error Messages

# DON'T
raise Exception(f"Failed to connect to {database_url}")

# DO
raise Exception(f"Failed to connect to database at {host}:{port}")

Error messages end up in bug trackers, error monitoring services, and support tickets.

Overly Broad Access

If everyone on the team can read production database credentials, you don’t have access control—you have shared passwords.

Principle of least privilege: developers get dev secrets, CI gets deployment secrets, production gets production secrets.

The Minimum Viable Security

If you do nothing else:

  1. Never commit secrets to git—use .env files or a secrets manager
  2. Use different secrets per environment—dev, staging, prod should never share credentials
  3. Rotate when people leave—assume departing employees have copies
  4. Log access, not content—know who accessed what, never log the actual secret

Secrets management isn’t exciting work. But a breach from a leaked API key is definitely exciting—in the worst way.


The best secret is one that doesn’t exist. The second best is one that’s properly managed.