Configuration Management Principles: Making Deployments Predictable

Most production incidents I’ve debugged came down to configuration. A missing environment variable. A wrong database URL. A feature flag stuck in the wrong state. Code was fine; configuration was the problem.

Configuration management is the unsexy work that prevents those 3 AM pages.

The Core Principles

1. Separate Configuration from Code

Configuration should never be baked into your application binary or container image.

Wrong:

# Hardcoded in code
DATABASE_URL = "postgres://prod:password@db.example.com/myapp"

Also wrong:

# Baked into image
ENV DATABASE_URL="postgres://prod:password@db.example.com/myapp"

Right:

# Read at runtime
import os
DATABASE_URL = os.environ["DATABASE_URL"]  # raises KeyError if unset

# Injected at deployment
docker run -e DATABASE_URL="..." myapp:1.0

Why? The same image should run in dev, staging, and production. Only configuration differs.
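
A minimal sketch of the runtime-read pattern, assuming a hypothetical helper named `require_env` and a hypothetical `MissingConfigError` (neither comes from a library). It reads a variable from the environment and fails with a descriptive message when the variable is missing or empty:

```python
import os


class MissingConfigError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def require_env(name: str, environ=None) -> str:
    # `environ` defaults to the real process environment; passing a plain
    # dict makes the helper easy to exercise in tests.
    env = os.environ if environ is None else environ
    value = env.get(name, "")
    if not value:
        raise MissingConfigError(f"required environment variable {name} is not set")
    return value
```

The image stays identical across environments; only the value handed to `require_env("DATABASE_URL")` changes.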

2. Validate Configuration at Startup

Fail fast if configuration is missing or invalid:

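# Pydantic v1 API. In Pydantic v2, BaseSettings lives in the separate
# pydantic-settings package and @validator is superseded by @field_validator.
from pydantic import BaseSettings, validator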

class Settings(BaseSettings):
    database_url: str
    redis_url: str
    api_key: str
    debug: bool = False
    max_connections: int = 10
    
    @validator("max_connections")
    def validate_connections(cls, v):
        if v < 1 or v > 100:
            raise ValueError("max_connections must be between 1 and 100")
        return v
    
    class Config:
        env_file = ".env"

# This runs at import time - app won't start with bad config
settings = Settings()

A clear error at startup beats a cryptic error at 3 AM when that code path finally runs.
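
If adding a dependency is not an option, the same fail-fast idea can be sketched with the standard library. The names `AppSettings` and `load_settings` are illustrative assumptions, and this version reports every problem in one error instead of stopping at the first:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppSettings:
    database_url: str
    max_connections: int = 10


def load_settings(env: dict) -> AppSettings:
    errors = []

    database_url = env.get("DATABASE_URL", "")
    if not database_url:
        errors.append("DATABASE_URL is required")

    raw = env.get("MAX_CONNECTIONS", "10")
    try:
        max_connections = int(raw)
    except ValueError:
        errors.append(f"MAX_CONNECTIONS must be an integer, got {raw!r}")
    else:
        if not 1 <= max_connections <= 100:
            errors.append("MAX_CONNECTIONS must be between 1 and 100")

    if errors:
        # Report all problems together so one restart is enough to fix them.
        raise ValueError("invalid configuration: " + "; ".join(errors))
    return AppSettings(database_url, max_connections)
```

Calling `load_settings(dict(os.environ))` once at process start gives the same "refuse to boot" behavior.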

3. Make Configuration Explicit

Document every configuration option:

# config.example.yaml
# Database connection string (required)
# Format: postgres://user:password@host:port/database
database_url: ""

# Redis URL for caching (required)
redis_url: ""

# Enable debug mode (optional, default: false)
# WARNING: Never enable in production
debug: false

# Maximum concurrent database connections (optional, default: 10)
# Range: 1-100
max_connections: 10

If it’s not documented, someone will misconfigure it.
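
Documentation helps most when something checks the running configuration against it. A small sketch, assuming the documented option names are kept in code as `DOCUMENTED_KEYS` and `REQUIRED_KEYS` (hypothetical constants mirroring the example file above):

```python
DOCUMENTED_KEYS = {"database_url", "redis_url", "debug", "max_connections"}
REQUIRED_KEYS = {"database_url", "redis_url"}


def check_keys(config: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the keys look fine."""
    problems = []
    for key in sorted(REQUIRED_KEYS - set(config)):
        problems.append(f"missing required option: {key}")
    for key in sorted(set(config) - DOCUMENTED_KEYS):
        # Usually a typo, or an option nobody wrote down.
        problems.append(f"undocumented option: {key}")
    return problems
```

A misspelled key such as `max_conections` is reported as undocumented rather than silently ignored.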

4. Use Hierarchical Configuration

Configuration should layer: defaults → environment-specific → instance-specific → overrides.

import os
from pathlib import Path
import yaml

def load_yaml(path):
    # An empty file parses to None, so fall back to an empty dict
    with open(path) as f:
        return yaml.safe_load(f) or {}

def load_config():
    config = {}

    # 1. Load defaults
    config.update(load_yaml("config/defaults.yaml"))

    # 2. Load environment-specific
    env = os.getenv("APP_ENV", "development")
    env_file = Path(f"config/{env}.yaml")
    if env_file.exists():
        config.update(load_yaml(env_file))

    # 3. Load local overrides (gitignored)
    local_file = Path("config/local.yaml")
    if local_file.exists():
        config.update(load_yaml(local_file))

    # 4. Environment variables override everything
    # (only keys already present are overridden, and values arrive as strings)
    for key in config:
        env_key = f"APP_{key.upper()}"
        if env_key in os.environ:
            config[key] = os.environ[env_key]

    return config

This lets you have sensible defaults while allowing specific overrides.
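
One caveat with `config.update(...)`: it merges only the top level, so an environment file that sets a single nested value replaces the whole nested section. A recursive merge avoids that. This is a sketch, with `deep_merge` as a hypothetical helper name:

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict with `override` layered on top of `base`.

    Nested dicts are merged key by key; any other value in `override`
    replaces the value in `base`. Neither input is modified.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged
```

With this, a production file containing only `database.pool_size` keeps the default `database.url` instead of dropping it.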

5. Version Your Configuration

Configuration changes should be tracked like code changes: committed to version control, reviewed, and tied to an author and a reason.

For secrets, use a separate system (Vault, AWS Secrets Manager) and reference them:

# production.yaml
database_password: "${vault:secret/prod/db#password}"
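
How a reference like this gets resolved depends on your tooling. As a rough sketch, assuming the `${vault:<path>#<field>}` form shown above and a caller-supplied `fetch(path, field)` function (hypothetical, standing in for a real secrets client), resolution at load time could look like:

```python
import re
from typing import Callable

# Matches the placeholder form used above: ${vault:<path>#<field>}
_REF = re.compile(r"\$\{vault:(?P<path>[^#}]+)#(?P<field>[^}]+)\}")


def resolve_refs(config: dict, fetch: Callable[[str, str], str]) -> dict:
    """Return a copy of `config` with every vault placeholder replaced."""
    resolved = {}
    for key, value in config.items():
        if isinstance(value, dict):
            resolved[key] = resolve_refs(value, fetch)
        elif isinstance(value, str):
            resolved[key] = _REF.sub(lambda m: fetch(m["path"], m["field"]), value)
        else:
            resolved[key] = value
    return resolved
```

The versioned file then holds only the reference, and the secret itself exists only in the secrets system and in process memory.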

Common Patterns

Environment Variables

The twelve-factor standard. Simple, universal, works everywhere.

export DATABASE_URL="postgres://..."
export REDIS_URL="redis://..."
export LOG_LEVEL="info"

Pros:

Works in any language
Easy to inject in containers, CI, etc.
No files to manage

Cons:

No structure (everything is a string)
Hard to see “all configuration” at once
Can leak in logs, process listings

Best for: Simple applications, container deployments, cloud-native environments.
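
Because every value arrives as a string, type conversion has to be explicit. In Python, `bool("false")` is `True`, which is an easy way to ship debug mode by accident. A small sketch with hypothetical helpers `parse_bool` and `parse_int`:

```python
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def parse_bool(raw: str) -> bool:
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    # Reject anything ambiguous instead of guessing.
    raise ValueError(f"not a boolean: {raw!r}")


def parse_int(raw: str, minimum: int, maximum: int) -> int:
    value = int(raw)  # raises ValueError on non-numeric input
    if not minimum <= value <= maximum:
        raise ValueError(f"{value} is outside {minimum}..{maximum}")
    return value
```

The accepted spellings in `_TRUE` and `_FALSE` are a choice made for this sketch, not a standard.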

Configuration Files

YAML, JSON, TOML, INI — structured configuration in files.

# config.yaml
server:
  host: 0.0.0.0
  port: 8080
  workers: 4

database:
  url: postgres://localhost/myapp
  pool_size: 10

features:
  new_checkout: true
  beta_users: ["user1", "user2"]

Pros:

Structured, supports nesting
Easy to read and edit
Can be templated

Cons:

Files need to be deployed
Requires parsing logic
Format choice can be contentious (YAML vs TOML debates)

Best for: Complex configuration, local development, applications with many options.

Remote Configuration

Fetch configuration from a central service (Consul, etcd, AWS Parameter Store).

import boto3

ssm = boto3.client('ssm')

def get_config():
    # get_parameters_by_path returns results in pages, so walk all of them
    paginator = ssm.get_paginator("get_parameters_by_path")
    pages = paginator.paginate(
        Path="/myapp/production/",
        WithDecryption=True
    )

    config = {}
    for page in pages:
        for param in page["Parameters"]:
            key = param["Name"].split("/")[-1]
            config[key] = param["Value"]

    return config

Pros:

Centralized management
Can update without redeployment
Built-in encryption for secrets

Cons:

Adds external dependency
Network latency at startup
Needs caching strategy

Best for: Microservices, multi-environment deployments, dynamic configuration.
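
The caching point deserves a concrete shape. Below is one possible sketch, a time-based cache that keeps serving the last known configuration when the remote service is unreachable. `CachedConfig` and its parameters are hypothetical names, and `fetch` stands in for whatever call retrieves the configuration:

```python
import time
from typing import Callable


class CachedConfig:
    def __init__(self, fetch: Callable[[], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._value = None
        self._expires_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            try:
                self._value = self._fetch()
                self._expires_at = now + self._ttl
            except Exception:
                # Serve the stale copy if the config service is unreachable;
                # re-raise only when nothing has been cached yet. The expiry
                # is left unchanged, so the next call retries the fetch.
                if self._value is None:
                    raise
        return self._value
```

Whether stale configuration is acceptable during an outage is a per-setting decision; for something like a kill switch you may prefer to fail instead.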

Feature Flags

A special case of configuration: controlling feature rollout.

# Simple approach
if config.features.get("new_checkout"):
    return new_checkout_flow()
else:
    return old_checkout_flow()

# With targeting
if feature_client.is_enabled("new_checkout", user_id=user.id):
    return new_checkout_flow()

Feature flag systems (LaunchDarkly, Unleash, Flagsmith) add:

Percentage rollouts (10% of users see new feature)
User targeting (beta users, specific accounts)
A/B testing (track outcomes per variant)
Kill switches (disable instantly without deploy)
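
To make the percentage rollout idea concrete, here is one common way to sketch it: hash the flag name and user ID into a stable bucket from 0 to 99. This is an illustration under that assumption, not a description of how any of the products above implement it, and `in_rollout` is a hypothetical name:

```python
import hashlib


def in_rollout(flag: str, user_id: str, percentage: int) -> bool:
    """Deterministically place a user in or out of a percentage rollout."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable value in 0..99
    return bucket < percentage
```

Because the bucket depends only on the flag and the user, a user who sees the feature at 10% still sees it at 50%, and including the flag name in the hash keeps different flags from always selecting the same users.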

Rule: Feature flags are temporary. Remove them after rollout is complete. Stale flags become technical debt.

Anti-Patterns

Configuration Drift

Environment configurations diverge over time:

Staging has different settings than production
One server has an override someone added manually
Nobody knows the “canonical” configuration

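Fix: Infrastructure as Code. Terraform, Ansible, or Kubernetes manifests define the configuration, so the repository is the canonical source. Drift shows up as a difference the next time the tool plans or applies, and can be corrected from there.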
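
At its core, drift detection is a comparison between the declared configuration and what is actually running. A minimal sketch of that comparison, with `find_drift` and the `ABSENT` marker as hypothetical names:

```python
ABSENT = "<absent>"


def find_drift(expected: dict, actual: dict) -> dict:
    """Map each drifted key to an (expected, actual) pair."""
    drift = {}
    for key in sorted(expected.keys() | actual.keys()):
        want = expected.get(key, ABSENT)
        have = actual.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift
```

Here `expected` would come from the versioned definition and `actual` from the live environment; a manually added override shows up as a key whose expected value is `ABSENT`.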

Secret Sprawl

Secrets copied into:

Environment files
CI/CD configurations
Developer laptops
Slack messages

Fix: Centralized secret management. Vault, AWS Secrets Manager, or similar. Secrets are fetched at runtime, never stored in files or repositories.

The God Config

One massive configuration object passed everywhere:

def handle_request(config, request):
    db = connect(config.database_url)
    cache = connect(config.redis_url)
    # config has 200 fields, this function uses 3

Fix: Inject only what’s needed. Use dependency injection or specific configuration objects:

def handle_request(db: Database, cache: Cache, request):
    # Function declares its actual dependencies
    ...
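
The "specific configuration objects" variant can be sketched as follows. All class and field names here are hypothetical; `AppConfig` stands in for the large application-wide object:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    # Stand-in for the large application-wide config object.
    database_url: str
    redis_url: str
    smtp_host: str
    checkout_timeout_seconds: float
    checkout_max_items: int


@dataclass(frozen=True)
class CheckoutConfig:
    timeout_seconds: float
    max_items: int

    @classmethod
    def from_app_config(cls, config: AppConfig) -> "CheckoutConfig":
        return cls(config.checkout_timeout_seconds, config.checkout_max_items)


def validate_cart(items: list, config: CheckoutConfig) -> bool:
    # Depends only on checkout settings, so a test can build a
    # CheckoutConfig directly without constructing the whole AppConfig.
    return len(items) <= config.max_items
```

The narrowing happens once, where the application is wired together; everything downstream sees only the slice it needs.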

Configuration as Logic

# Don't do this
rules:
  - condition: "user.age > 18 AND user.country == 'US'"
    action: "allow"
  - condition: "user.subscription == 'premium'"
    action: "allow"

You’ve invented a programming language in YAML. This is hard to test, hard to debug, and hard to reason about.

Fix: Keep logic in code. Use configuration for simple values, not business rules.
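
For comparison, the two rules from the YAML above could be written as an ordinary function, with only the age threshold left configurable. The `User` type and `is_allowed` name are assumptions made for the sketch:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    age: int
    country: str
    subscription: str


def is_allowed(user: User, minimum_age: int = 18) -> bool:
    # Same two rules as the YAML version, now testable and debuggable.
    if user.subscription == "premium":
        return True
    return user.age > minimum_age and user.country == "US"
```

The rule now gets unit tests, a stack trace when it breaks, and code review when it changes.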

Testing Configuration

Validate in CI

# .github/workflows/ci.yml
- name: Validate configuration
  run: |
    python -c "from config import Settings; Settings()"
    yamllint config/*.yaml

Catch typos and missing values before deployment.

Environment Parity

Test with production-like configuration:

def test_with_production_config():
    # Load production config (with secrets stubbed)
    config = load_config("production", secrets=mock_secrets)
    
    app = create_app(config)
    
    # Run integration tests
    ...

If tests pass with different configuration than production, they’re lying to you.

Configuration Diff on Deploy

# Show what's changing
diff <(kubectl get configmap myapp -o yaml) new-configmap.yaml

# Require approval for production config changes
if [ "$ENV" = "production" ]; then
    echo "Configuration changes:"
    diff ...
    read -p "Apply? [y/N] " confirm
fi

Make configuration changes visible and intentional.
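
When the diff is shown to people or written to CI logs, secret values should not appear in it. A rough sketch using `difflib` from the standard library; `SENSITIVE_MARKERS`, `redact`, and `config_diff` are hypothetical names, and the marker list is only a starting point:

```python
import difflib

SENSITIVE_MARKERS = ("password", "secret", "token", "api_key")


def redact(config: dict) -> dict:
    # Mask values whose key name suggests a secret.
    return {
        key: "***" if any(m in key.lower() for m in SENSITIVE_MARKERS) else value
        for key, value in config.items()
    }


def config_diff(current: dict, proposed: dict) -> str:
    def render(config: dict) -> list:
        return [f"{key}: {value}" for key, value in sorted(redact(config).items())]

    return "\n".join(
        difflib.unified_diff(render(current), render(proposed),
                             fromfile="current", tofile="proposed", lineterm="")
    )
```

One trade-off of masking this way: a changed secret renders as `***` on both sides, so the diff will not show that it changed.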

Configuration is the connective tissue between your code and the environment it runs in. Treat it with the same care you’d give to code: version it, validate it, test it, and document it. Your future self — the one who’s not debugging at 3 AM — will thank you.
