Feature flags started as if (ENABLE_NEW_UI) { ... }. They’ve evolved into a deployment strategy that separates code deployment from feature release. Ship your code Tuesday. Release to 1% of users Wednesday. Roll back without deploying Thursday.
Here’s how to implement feature flags that scale from simple toggles to sophisticated progressive delivery.
## The Basic Pattern
At its core, a feature flag is a runtime conditional:
```python
def get_recommendations(user_id: str) -> list:
    if feature_flags.is_enabled("new_recommendation_algo", user_id):
        return new_algorithm(user_id)
    else:
        return legacy_algorithm(user_id)
```
The magic is in how is_enabled works — and how you manage the flag lifecycle.
## Flag Types
Not all flags are created equal. Different purposes require different behaviors.
### Release Flags (Short-lived)
Gate incomplete features during development. Remove after rollout.
```python
# Release flag - remove after new_checkout is fully deployed
if flags.is_enabled("new_checkout_flow"):
    return render_new_checkout(cart)
return render_legacy_checkout(cart)
```
Lifecycle: Created → Partial rollout → Full rollout → Remove from code
### Experiment Flags (Measured)
A/B tests with analytics integration.
```python
# Experiment flag - measures impact on conversion
variant = flags.get_variant("pricing_page_experiment", user_id)
# Returns: "control", "variant_a", or "variant_b"

if variant == "variant_a":
    prices = apply_discount(prices, 0.10)
elif variant == "variant_b":
    prices = apply_bundling(prices)

# Track for analysis
analytics.track("pricing_page_view", {
    "user_id": user_id,
    "variant": variant
})
```
Lifecycle: Hypothesis → Experiment → Analyze → Pick winner → Remove
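The snippet above calls `flags.get_variant` without showing how variants get assigned. A minimal sketch of one common approach (the function name and weights are illustrative, not a real SDK API): hash the flag/user pair into a 0-99 bucket, then walk cumulative variant weights so assignment is deterministic per user.

```python
import hashlib

# Illustrative weights; must sum to 100
VARIANT_WEIGHTS = {"control": 34, "variant_a": 33, "variant_b": 33}

def get_variant(flag_name: str, user_id: str, weights=None) -> str:
    weights = weights or VARIANT_WEIGHTS
    # Same deterministic-bucket trick as percentage rollouts:
    # a given user always lands in the same bucket for a given flag
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    cumulative = 0
    for variant, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return variant
    return "control"  # Unreachable if weights sum to 100
```

Because the flag name is part of the hash input, different experiments bucket the same user independently.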
### Ops Flags (Kill Switches)
Emergency controls for production incidents.
```python
# Kill switch - disable expensive feature during outage
if flags.is_enabled("enable_search_suggestions"):
    suggestions = get_search_suggestions(query)  # Expensive
else:
    suggestions = []  # Graceful degradation
```
Lifecycle: Created → Lives forever (but rarely toggled)
### Permission Flags (Entitlements)
Control feature access by user tier or permissions.
```python
# Permission flag - premium feature
if flags.is_enabled("advanced_analytics", user_id, {
    "plan": user.plan,
    "organization": user.org_id
}):
    return render_advanced_analytics(user_id)
return render_upgrade_prompt()
```
Lifecycle: Created → Lives as long as the pricing model exists
## Progressive Rollout Strategies
The real power of flags is controlled exposure.
### Percentage Rollout
```python
import hashlib

class PercentageRollout:
    def __init__(self, flag_name: str, percentage: int):
        self.flag_name = flag_name
        self.percentage = percentage

    def is_enabled(self, user_id: str) -> bool:
        # Deterministic hash ensures same user always gets same result
        hash_value = int(hashlib.md5(
            f"{self.flag_name}:{user_id}".encode()
        ).hexdigest(), 16)
        bucket = hash_value % 100
        return bucket < self.percentage
```
Rollout schedule:
- Day 1: 1% (catch catastrophic bugs)
- Day 2: 5% (validate metrics)
- Day 3: 25% (stress test)
- Day 4: 50% (confidence building)
- Day 5: 100% (full release)
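A schedule like this can be automated so nobody forgets to bump the percentage. A minimal sketch, assuming a ramp table keyed by days since rollout start (the table and function name are illustrative):

```python
from datetime import date

# Mirrors the day-by-day schedule above
RAMP = {0: 1, 1: 5, 2: 25, 3: 50, 4: 100}

def rollout_percentage(start: date, today: date) -> int:
    days = (today - start).days
    if days < 0:
        return 0  # Rollout hasn't started yet
    # Hold at 100% once the schedule is exhausted
    return RAMP.get(days, 100)
```

A cron job or flag-service webhook could call this daily and update the flag's percentage, pausing the ramp automatically if error-rate alerts fire.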
### Sticky Assignment
Users should see consistent behavior. Don’t flip features mid-session.
```python
from redis import Redis

class StickyFlags:
    def __init__(self, cache: Redis):
        self.cache = cache

    def get_assignment(self, flag: str, user_id: str) -> bool:
        key = f"flag:{flag}:{user_id}"

        # Check for existing assignment
        cached = self.cache.get(key)
        if cached is not None:
            return cached == "1"

        # Calculate new assignment
        enabled = self._calculate_enabled(flag, user_id)

        # Store with TTL matching experiment duration
        self.cache.setex(key, 86400 * 30, "1" if enabled else "0")
        return enabled
```
### Ring-Based Deployment
Deploy to progressively larger “rings” of users:
```python
RINGS = {
    "canary": ["internal_qa@company.com", "dogfood@company.com"],
    "early_adopters": lambda u: u.opted_into_beta,
    "general": lambda u: True
}

def get_active_ring(flag_name: str) -> str:
    # Controlled by flag service
    return flag_config[flag_name].get("active_ring", "canary")

def is_enabled(flag_name: str, user: User) -> bool:
    # A user is enabled if they belong to any ring up to
    # and including the currently active ring
    active_ring = get_active_ring(flag_name)
    for ring_name, ring_check in RINGS.items():
        if isinstance(ring_check, list):
            if user.email in ring_check:
                return True
        elif ring_check(user):
            return True
        if ring_name == active_ring:
            break
    return False
```
## Targeting Rules
Complex conditions for sophisticated rollouts:
```python
# LaunchDarkly-style targeting rules
flag_config = {
    "new_dashboard": {
        "rules": [
            {
                # Always enable for internal users
                "if": {"attribute": "email", "endsWith": "@company.com"},
                "serve": True
            },
            {
                # Enable for enterprise tier in the US or Canada
                "if": {
                    "and": [
                        {"attribute": "plan", "equals": "enterprise"},
                        {"attribute": "country", "in": ["US", "CA"]}
                    ]
                },
                "serve": True
            },
            {
                # 20% of remaining users
                "percentage": 20,
                "serve": True
            }
        ],
        "default": False
    }
}
```
## Implementation Patterns
### SDK Structure
```python
from dataclasses import dataclass
from typing import Any, Dict, Optional
import hashlib
import time

@dataclass
class FlagEvaluation:
    enabled: bool
    variant: Optional[str]
    reason: str  # "rule_match", "percentage", "default"
    timestamp: float

class FeatureFlags:
    def __init__(self, config_source, cache=None):
        self.config = config_source
        self.cache = cache
        self._local_cache = {}
        self._cache_ttl = 60  # Refresh config every minute

    def is_enabled(
        self,
        flag_name: str,
        user_id: Optional[str] = None,
        context: Optional[Dict[str, Any]] = None
    ) -> bool:
        evaluation = self.evaluate(flag_name, user_id, context)
        return evaluation.enabled

    def evaluate(
        self,
        flag_name: str,
        user_id: Optional[str] = None,
        context: Optional[Dict[str, Any]] = None
    ) -> FlagEvaluation:
        config = self._get_flag_config(flag_name)
        if config is None:
            return FlagEvaluation(False, None, "not_found", time.time())

        # Check kill switch
        if config.get("killed"):
            return FlagEvaluation(False, None, "killed", time.time())

        # Evaluate targeting rules
        context = context or {}
        context["user_id"] = user_id
        for rule in config.get("rules", []):
            if self._matches_rule(rule, context):
                return FlagEvaluation(
                    rule["serve"],
                    rule.get("variant"),
                    "rule_match",
                    time.time()
                )

        # Percentage rollout
        if "percentage" in config and user_id:
            if self._in_percentage(flag_name, user_id, config["percentage"]):
                return FlagEvaluation(True, None, "percentage", time.time())

        # Default
        return FlagEvaluation(
            config.get("default", False),
            None,
            "default",
            time.time()
        )

    def _in_percentage(self, flag: str, user_id: str, pct: int) -> bool:
        hash_input = f"{flag}:{user_id}"
        hash_value = int(hashlib.sha256(hash_input.encode()).hexdigest(), 16)
        return (hash_value % 100) < pct
```
## Avoiding Flag Debt
Flags accumulate. Old flags become code archaeology.
```python
from datetime import datetime

# Flag metadata for lifecycle management
FLAG_REGISTRY = {
    "new_checkout": {
        "owner": "payments-team",
        "created": "2024-01-15",
        "expected_removal": "2024-02-15",
        "jira": "PAY-1234",
        "type": "release"
    }
}

def audit_flags():
    """Run weekly to catch stale flags"""
    stale = []
    for flag, meta in FLAG_REGISTRY.items():
        if meta["type"] == "release":
            expected = datetime.fromisoformat(meta["expected_removal"])
            if datetime.now() > expected:
                stale.append({
                    "flag": flag,
                    "owner": meta["owner"],
                    "overdue_days": (datetime.now() - expected).days
                })
    if stale:
        send_slack_alert(f"Stale flags need cleanup: {stale}")
```
## Testing with Flags
```python
import pytest
from unittest.mock import patch

class TestCheckoutFlow:
    def test_new_checkout_enabled(self, client):
        with patch.object(feature_flags, 'is_enabled', return_value=True):
            response = client.post('/checkout', json=cart_data)
            assert response.json()["version"] == "v2"

    def test_new_checkout_disabled(self, client):
        with patch.object(feature_flags, 'is_enabled', return_value=False):
            response = client.post('/checkout', json=cart_data)
            assert response.json()["version"] == "v1"

    def test_both_paths_produce_same_result(self, client, cart_data):
        """Verify feature parity before removing old path"""
        with patch.object(feature_flags, 'is_enabled', return_value=True):
            new_result = client.post('/checkout', json=cart_data).json()
        with patch.object(feature_flags, 'is_enabled', return_value=False):
            old_result = client.post('/checkout', json=cart_data).json()
        assert new_result["total"] == old_result["total"]
        assert new_result["items"] == old_result["items"]
```
## Observability
Flags without metrics are flying blind.
```python
from prometheus_client import Counter, Histogram

flag_evaluations = Counter(
    'feature_flag_evaluations_total',
    'Feature flag evaluations',
    ['flag_name', 'enabled', 'reason']
)

flag_evaluation_duration = Histogram(
    'feature_flag_evaluation_seconds',
    'Time to evaluate feature flag',
    ['flag_name']
)

def evaluate_with_metrics(flag_name: str, user_id: str) -> bool:
    with flag_evaluation_duration.labels(flag_name).time():
        result = flags.evaluate(flag_name, user_id)
    flag_evaluations.labels(
        flag_name=flag_name,
        enabled=str(result.enabled),
        reason=result.reason
    ).inc()
    return result.enabled
```
Dashboard queries:
```promql
# Flag adoption over time
sum(rate(feature_flag_evaluations_total{flag_name="new_checkout", enabled="true"}[5m]))
/
sum(rate(feature_flag_evaluations_total{flag_name="new_checkout"}[5m]))

# Flags with high evaluation latency
histogram_quantile(0.99,
  rate(feature_flag_evaluation_seconds_bucket[5m])
) > 0.01
```
## The Managed Services
For production use, consider dedicated services:
| Service | Strengths | Pricing |
|---|---|---|
| LaunchDarkly | Full-featured, great SDKs | $$$ |
| Split.io | Strong experimentation | $$ |
| Flagsmith | Open-source option | $ |
| Unleash | Self-hosted, open-source | Free |
| AWS AppConfig | AWS-native, simple | $ |
Roll your own only if you have specific requirements the services don’t meet.
## The Lifecycle
1. Create — Define flag with owner, expected duration, success metrics
2. Implement — Add flag check in code, test both paths
3. Roll out — Progressive percentage increase with monitoring
4. Measure — Analyze metrics, compare variants
5. Decide — Full enable or revert
6. Clean up — Remove flag from code, delete configuration
The cleanup step is where most teams fail. A flag at 100% for 6 months isn’t a flag — it’s dead code with extra steps.
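One way to make cleanup stick is to make it mechanical: scan the codebase for references to flags the registry says are overdue. A rough sketch (the function name and `.py`-only scan are assumptions, not a real tool); wired into CI alongside `audit_flags`, it can fail the build when an overdue flag is still referenced.

```python
from pathlib import Path

def find_flag_references(root: str, flag_names: list) -> dict:
    """Map each flag name to the source files that still mention it."""
    hits = {name: [] for name in flag_names}
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for name in flag_names:
            if name in text:
                hits[name].append(str(path))
    return hits
```

An empty list for a flag is the signal that the code side of cleanup is done and the configuration can be deleted.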
Feature flags turn deployments from “hope it works” into “let’s find out safely.” The investment in flag infrastructure pays back every time you catch a bug at 1% instead of 100%.
Ship code fearlessly. Release features carefully.