Deploying code and releasing features are not the same thing. Treating them as identical creates unnecessary risk, slows down development, and makes rollbacks painful. Feature flags fix this.

The Problem with Deploy-Equals-Release

Traditional deployment pipelines work like this: code merges, tests pass, artifact builds, deployment happens, users see the change. It’s linear and fragile.

What happens when the feature works in staging but breaks in production? You roll back the entire deployment, potentially reverting unrelated fixes. What if you want to release to 5% of users first? You can’t — it’s all or nothing.

Feature flags break this coupling. You deploy code that’s invisible until you flip a switch.

How Feature Flags Work

At their simplest, feature flags are conditionals:

def get_checkout_page(user):
    if feature_flags.is_enabled("new_checkout", user):
        return render_new_checkout(user)
    return render_old_checkout(user)

The flag evaluation can be as simple as a config file or as sophisticated as a rules engine considering user attributes, percentages, and targeting rules.
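
As a minimal sketch of the config-file end of that spectrum, the snippet below reads flags from a JSON file. The `load_flags` and `is_enabled` helpers and the flat name-to-boolean file layout are assumptions made for illustration, not the API of any particular library.

```python
import json
from pathlib import Path


def load_flags(path):
    """Read a flat {"flag_name": true/false} mapping from a JSON file."""
    file = Path(path)
    if not file.exists():
        return {}  # No config file: treat every flag as off
    return json.loads(file.read_text())


def is_enabled(flags, flag_name):
    """Unknown flags are off, so a typo in a flag name never turns a feature on."""
    return bool(flags.get(flag_name, False))
```

Defaulting unknown flags to off is a design choice; it keeps a missing or misspelled entry from releasing anything by accident.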

Implementation Patterns

Boolean Flags (Kill Switches)

The simplest pattern. On or off, globally.

# config/flags.yaml
features:
  dark_mode: true
  experimental_search: false

# Usage
if config.features.dark_mode:
    apply_dark_theme()

Good for: Emergency kill switches, ops toggles, simple feature gates.

Percentage Rollouts

Release to a fraction of users, gradually increasing.

import hashlib

def is_enabled_for_user(flag_name, user_id, percentage):
    # Deterministic: same user always gets same result
    hash_input = f"{flag_name}:{user_id}"
    hash_value = int(hashlib.md5(hash_input.encode()).hexdigest(), 16)
    return (hash_value % 100) < percentage

# Roll out to 10% of users (this check belongs inside the function that picks the algorithm)
if is_enabled_for_user("new_algorithm", user.id, 10):
    return new_recommendation_algorithm(user)

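The hash makes assignment deterministic: user 12345 always lands in the same bucket for a given flag. Raising the percentage only adds users, so nobody who already has the feature loses it. Because the flag name is part of the hash input, each flag buckets users independently.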
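
To make that property concrete, here is a small sketch that buckets a synthetic population of 10,000 user IDs and compares two rollout percentages. The `bucket` and `enabled_users` helpers are hypothetical names introduced for this example; the hashing mirrors the function above.

```python
import hashlib


def bucket(flag_name, user_id):
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    digest = hashlib.md5(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def enabled_users(flag_name, user_ids, percentage):
    """Return the set of users whose bucket falls below the rollout percentage."""
    return {uid for uid in user_ids if bucket(flag_name, uid) < percentage}


users = range(10_000)
at_10 = enabled_users("new_algorithm", users, 10)
at_25 = enabled_users("new_algorithm", users, 25)

# Everyone enabled at 10% is still enabled at 25%
print(at_10 <= at_25)
# Share of users enabled at 10%; expected to land close to 0.10
print(len(at_10) / len(users))
```

The subset check holds by construction, since a bucket below 10 is also below 25. The observed share is only approximately 10% because it depends on how the hash distributes this particular set of IDs.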

User Targeting

Enable for specific segments: beta testers, enterprise customers, internal users.

class FeatureFlag:
    def __init__(self, name, rules):
        self.name = name
        self.rules = rules
    
    def is_enabled(self, context):
        for rule in self.rules:
            if rule.matches(context):
                return rule.enabled
        return False

# Configuration (Rule is assumed to provide matches(context) and an `enabled` attribute)
new_dashboard = FeatureFlag("new_dashboard", [
    Rule(condition={"email_domain": "company.com"}, enabled=True),
    Rule(condition={"plan": "enterprise"}, enabled=True),
    Rule(condition={"user_id_in": [123, 456, 789]}, enabled=True),
    Rule(condition={}, enabled=False)  # Default: off
])
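
The configuration above relies on a `Rule` class that the snippet never defines. One possible shape is sketched below. The condition keys (`email_domain`, `user_id_in`, and plain equality for anything else) and the context keys (`email`, `user_id`) are assumptions chosen to fit the example, not a standard.

```python
class Rule:
    """Hypothetical rule: every key in `condition` must match the context."""

    def __init__(self, condition, enabled):
        self.condition = condition
        self.enabled = enabled

    def matches(self, context):
        # An empty condition matches everything, which makes it a catch-all default
        return all(self._check(key, expected, context)
                   for key, expected in self.condition.items())

    @staticmethod
    def _check(key, expected, context):
        if key == "email_domain":
            email = str(context.get("email") or "")
            return email.lower().endswith("@" + expected.lower())
        if key == "user_id_in":
            return context.get("user_id") in expected
        # Any other key is compared for equality, e.g. {"plan": "enterprise"}
        return context.get(key) == expected


rules = [
    Rule(condition={"email_domain": "company.com"}, enabled=True),
    Rule(condition={"plan": "enterprise"}, enabled=True),
    Rule(condition={}, enabled=False),  # Default: off
]


def first_match(rules, context):
    """Return the `enabled` value of the first matching rule, else False."""
    for rule in rules:
        if rule.matches(context):
            return rule.enabled
    return False
```

Because the first matching rule wins, order matters: specific rules go first and the catch-all goes last.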

Environment-Based Flags

Different behavior per environment without code changes.

import os

# Loaded from environment-specific config
FLAGS = {
    "development": {"debug_panel": True, "mock_payments": True},
    "staging": {"debug_panel": True, "mock_payments": False},
    "production": {"debug_panel": False, "mock_payments": False}
}

current_flags = FLAGS[os.environ.get("ENV", "development")]

A Minimal Flag Service

Here’s a lightweight implementation you can self-host:

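import hashlib

from flask import Flask, jsonify, request

app = Flask(__name__)

# In production, use Redis or a database
FLAGS_STORE = {}

@app.route("/flags/<flag_name>", methods=["GET"])
def evaluate_flag(flag_name):
    # Query-string parameters (e.g. ?user_id=42) form the evaluation context
    context = request.args.to_dict()
    flag = FLAGS_STORE.get(flag_name)

    if flag is None:
        return jsonify({"enabled": False, "reason": "flag_not_found"})

    enabled = evaluate_rules(flag_name, flag, context)
    return jsonify({"enabled": enabled, "flag": flag_name})

@app.route("/flags/<flag_name>", methods=["PUT"])
def update_flag(flag_name):
    # Expects a JSON body such as {"enabled": true} or {"percentage": 10}
    FLAGS_STORE[flag_name] = request.json
    return jsonify({"status": "updated"})

def is_in_percentage(flag_name, user_id, percentage):
    # Same deterministic hashing as the percentage rollout pattern
    hash_input = f"{flag_name}:{user_id}"
    hash_value = int(hashlib.md5(hash_input.encode()).hexdigest(), 16)
    return (hash_value % 100) < percentage

def evaluate_rules(flag_name, flag, context):
    # Percentage rollout
    if "percentage" in flag:
        # Requests without a user_id all share the single "anonymous" bucket
        user_id = context.get("user_id", "anonymous")
        return is_in_percentage(flag_name, user_id, flag["percentage"])

    # Simple boolean
    return flag.get("enabled", False)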

Client SDK:

import time

import requests

class FeatureFlags:
    def __init__(self, base_url, cache_ttl=60):
        self.base_url = base_url
        self.cache_ttl = cache_ttl
        self._cache = {}
        self._cache_time = {}
    
    def is_enabled(self, flag_name, **context):
        cache_key = f"{flag_name}:{hash(frozenset(context.items()))}"
        
        # Check cache
        if cache_key in self._cache:
            if time.time() - self._cache_time[cache_key] < self.cache_ttl:
                return self._cache[cache_key]
        
        # Fetch from service
        try:
            response = requests.get(
                f"{self.base_url}/flags/{flag_name}",
                params=context,
                timeout=0.5  # Fast timeout — default to off if slow
            )
            enabled = response.json().get("enabled", False)
        except Exception:
            enabled = False  # Fail closed
        
        self._cache[cache_key] = enabled
        self._cache_time[cache_key] = time.time()
        return enabled

# Usage
flags = FeatureFlags("http://flags-service:8080")

if flags.is_enabled("new_checkout", user_id=user.id, plan=user.plan):
    show_new_checkout()

Operational Considerations

Fail Safe

When the flag service is down, what happens? Your code should have sensible defaults:

def is_enabled(self, flag_name, default=False, **context):
    try:
        return self._fetch_flag(flag_name, context)
    except Exception as e:
        log.warning(f"Flag service unavailable: {e}")
        return default

For new features, default to False (fail closed). For established features behind a kill switch, default to True (fail open).
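
One way to keep those defaults out of individual call sites is a per-flag fallback table. The sketch below assumes a hypothetical `FALLBACKS` registry and takes the fetch function as a parameter (`fetch`), standing in for whatever actually calls the flag service; the flag name `legacy_search` is invented for the example.

```python
import logging

log = logging.getLogger("flags")

# Hypothetical registry: what each flag returns when the service is unreachable
FALLBACKS = {
    "new_checkout": False,   # New feature: fail closed
    "legacy_search": True,   # Established feature behind a kill switch: fail open
}


def is_enabled(fetch, flag_name, **context):
    """Evaluate a flag via `fetch`, falling back to the registered default on error."""
    try:
        return fetch(flag_name, context)
    except Exception as exc:
        log.warning("Flag service unavailable for %s: %s", flag_name, exc)
        # Flags missing from the registry fail closed
        return FALLBACKS.get(flag_name, False)
```

Keeping the table in one place means the fail-open or fail-closed decision is made once per flag rather than at every caller.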

Flag Lifecycle

Flags accumulate. Old flags become technical debt. Enforce hygiene:

from datetime import datetime

# Track flag age
FLAG_METADATA = {
    "new_checkout": {
        "created": "2026-01-15",
        "owner": "payments-team",
        "expires": "2026-04-01",  # Remove after full rollout
        "jira": "PAY-1234"
    }
}

# CI check: fail the build if any registered flag is past its expiry date
def check_expired_flags():
    today = datetime.now().date()
    for flag, meta in FLAG_METADATA.items():
        if datetime.strptime(meta["expires"], "%Y-%m-%d").date() < today:
            raise Exception(f"Flag {flag} expired on {meta['expires']}. Remove it.")
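
The check above only inspects the metadata. To also confirm that an expired flag is still referenced in the codebase, a CI step could search the source tree for the flag name. The following is a rough sketch under stated assumptions: flags are referenced as quoted string literals in `.py` files, and `expired_flags` and `find_references` are hypothetical helper names.

```python
import re
from datetime import date
from pathlib import Path


def expired_flags(metadata, today):
    """Return names of flags whose ISO-formatted `expires` date is before `today`."""
    return [name for name, meta in metadata.items()
            if date.fromisoformat(meta["expires"]) < today]


def find_references(root, flag_names):
    """Map each flag name to the .py files under `root` that mention it in quotes."""
    hits = {name: [] for name in flag_names}
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for name in flag_names:
            # Match the name only as a quoted literal, e.g. "new_checkout"
            pattern = re.compile("[\"']" + re.escape(name) + "[\"']")
            if pattern.search(text):
                hits[name].append(str(path))
    # Drop flags that are not referenced anywhere
    return {name: files for name, files in hits.items() if files}
```

A CI script could call `find_references(".", expired_flags(FLAG_METADATA, date.today()))` and exit non-zero when the result is non-empty. A plain text search like this will miss flag names that are built dynamically.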

Testing with Flags

Test both paths. Always.

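import pytest

# `client` and `valid_cart` are assumed to be defined elsewhere; `mocker` comes from pytest-mock
@pytest.mark.parametrize("flag_enabled", [True, False])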
def test_checkout_flow(flag_enabled, mocker):
    mocker.patch("features.is_enabled", return_value=flag_enabled)
    
    response = client.post("/checkout", data=valid_cart)
    
    assert response.status_code == 200
    if flag_enabled:
        assert "new-checkout-confirmation" in response.text
    else:
        assert "classic-checkout-confirmation" in response.text

When Not to Use Feature Flags

Flags add complexity. Don’t use them for:

  • Database migrations — Use proper migration tools
  • API versioning — Use URL versioning or headers
  • Simple config — Environment variables are simpler
  • Permanent differences — If it’s never going away, it’s not a flag

Key Takeaways

Feature flags transform deployment from a high-stakes event into routine infrastructure. They enable:

  1. Continuous deployment — Ship code daily, release features weekly
  2. Gradual rollouts — Catch problems at 1% instead of 100%
  3. Instant rollbacks — Flip a switch, not a deployment
  4. Experimentation — A/B test without separate infrastructure

Start simple. A JSON file and an if-statement. Graduate to a service when you need targeting rules and audit logs. The complexity should match your needs.


The best deployment is one nobody notices. Feature flags make that possible — code ships continuously while releases happen deliberately.