Your API will be called wrong. Clients will send garbage. Load will spike unexpectedly. Authentication will be misconfigured. The question isn’t whether these things happen — it’s whether your API degrades gracefully or explodes.
Here’s how to build APIs that survive contact with the real world.
Every field, every header, every query parameter is hostile until proven otherwise.
```python
import re
from typing import Optional

# Pydantic v1 API; in v2, `validator` is deprecated in favor of `field_validator`
from pydantic import BaseModel, Field, validator


class CreateUserRequest(BaseModel):
    email: str = Field(..., max_length=254)
    name: str = Field(..., min_length=1, max_length=100)
    age: Optional[int] = Field(None, ge=0, le=150)

    @validator('email')
    def validate_email(cls, v):
        # Normalize first, then check the basic structure.
        # For stricter checks, use a dedicated email validator.
        v = v.strip().lower()
        if not re.match(r'^[^@\s]+@[^@\s]+\.[^@\s]+$', v):
            raise ValueError('Invalid email format')
        return v

    @validator('name')
    def sanitize_name(cls, v):
        # Normalize whitespace first (tabs and newlines become spaces),
        # then remove any remaining control characters
        v = ' '.join(v.split())
        v = re.sub(r'[\x00-\x1f\x7f-\x9f]', '', v)
        if not v:
            raise ValueError('Name must not be empty')
        return v
```
Key principles:
- Set maximum lengths on everything (prevent memory exhaustion)
- Validate format, not just presence
- Normalize inputs (lowercase emails, trim whitespace)
- Reject clearly impossible values (age > 150)
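The same principles apply outside the request body. As a minimal sketch, assuming hypothetical pagination parameters `limit` and `offset` that arrive as raw query strings (the names and bounds here are illustrative, not taken from any framework), a plain-Python parser could look something like this:

```python
MAX_LIMIT = 100          # assumed upper bound on page size
MAX_OFFSET = 1_000_000   # assumed upper bound on offset
MAX_RAW_LENGTH = 10      # reject absurdly long raw strings before parsing


def parse_bounded_int(raw: str, name: str, low: int, high: int) -> int:
    """Parse a raw query-string value into an int within [low, high]."""
    if len(raw) > MAX_RAW_LENGTH:
        raise ValueError(f"{name} is too long")
    text = raw.strip()
    # isdecimal() also accepts non-ASCII digits, so require ASCII as well
    if not (text.isascii() and text.isdecimal()):
        raise ValueError(f"{name} must be a non-negative integer")
    value = int(text)
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}")
    return value


def parse_pagination(raw_limit: str = "20", raw_offset: str = "0") -> tuple[int, int]:
    """Validate both pagination parameters and return (limit, offset)."""
    limit = parse_bounded_int(raw_limit, "limit", 1, MAX_LIMIT)
    offset = parse_bounded_int(raw_offset, "offset", 0, MAX_OFFSET)
    return limit, offset
```

The length check runs before any parsing, so an oversized value is rejected cheaply, and out-of-range values raise an error instead of being silently clamped.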
Rate Limiting: Layers of Defense
A single rate limit isn’t enough. You need multiple layers:
```python
from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)


def get_user_id(request: Request) -> str:
    # Assumption: auth middleware has stored the user ID on request.state
    return str(request.state.user_id)


# Layer 1: Global rate limit per IP
@app.middleware("http")
async def global_rate_limit(request: Request, call_next):
    # 1000 requests per minute per IP
    ...

# Layer 2: Endpoint-specific limits
@app.post("/api/expensive-operation")
@limiter.limit("10/minute")  # Expensive operations get tighter limits
async def expensive_operation(request: Request):  # slowapi requires the request argument
    ...

# Layer 3: User-based limits (after auth)
@app.post("/api/send-email")
@limiter.limit("5/hour", key_func=get_user_id)  # Per-user, not per-IP
async def send_email(request: Request):
    ...

# Layer 4: Resource-based limits
@app.post("/api/projects/{project_id}/builds")
@limiter.limit("20/hour", key_func=lambda r: r.path_params["project_id"])
async def trigger_build(request: Request, project_id: str):
    ...
```
Return useful headers:
```python
response.headers["X-RateLimit-Limit"] = "100"
response.headers["X-RateLimit-Remaining"] = "73"
response.headers["X-RateLimit-Reset"] = "1640000000"
response.headers["Retry-After"] = "60"  # On 429s
```
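To show where those header values might come from, here is a rough sketch of a fixed-window counter that returns a decision together with the headers. `FixedWindowLimiter` and `Decision` are hypothetical names for illustration, not part of slowapi. The store is an in-memory dict, so it only works within a single process.

```python
import time
from dataclasses import dataclass


@dataclass
class Decision:
    allowed: bool
    headers: dict


class FixedWindowLimiter:
    """In-memory fixed-window counter; a sketch, not shared across processes."""

    def __init__(self, limit: int, window_seconds: int, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._counts = {}  # key -> (window_start, count)

    def check(self, key: str) -> Decision:
        now = int(self.clock())
        window_start = now - (now % self.window)
        start, count = self._counts.get(key, (window_start, 0))
        if start != window_start:
            # A new window has begun: reset the counter
            start, count = window_start, 0
        reset_at = start + self.window
        allowed = count < self.limit
        if allowed:
            count += 1
        self._counts[key] = (start, count)
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(self.limit - count),
            "X-RateLimit-Reset": str(reset_at),
        }
        if not allowed:
            # Only rejected (429) responses carry Retry-After
            headers["Retry-After"] = str(max(1, reset_at - now))
        return Decision(allowed, headers)
```

A handler would call `check()` with whichever key fits the layer (IP, user ID, project ID), copy the headers onto the response, and return a 429 when `allowed` is false. This sketch never evicts old keys. A shared store with expiring keys would take care of both eviction and multi-process deployments.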
Idempotency: Safe Retries
Network failures happen. Clients will retry. Your API should handle duplicates gracefully.
```python
import json

from fastapi import Header

IDEMPOTENCY_TTL = 86400  # 24 hours


@app.post("/api/payments")
async def create_payment(
    request: PaymentRequest,
    idempotency_key: str = Header(..., alias="Idempotency-Key")
):
    # Scope the stored key to this endpoint so other endpoints can reuse keys
    cache_key = f"idempotency:payments:{idempotency_key}"

    # Check if we've seen this key before
    cached = await redis.get(cache_key)
    if cached:
        return json.loads(cached)  # Return same response

    # NOTE: this get-then-set sequence is not atomic. Two concurrent requests
    # with the same key can both reach process_payment; reserve the key first
    # (for example with SET ... NX) if that matters for your workload.

    # Process the payment
    result = await process_payment(request)

    # Cache the result (expire after 24 hours)
    await redis.setex(cache_key, IDEMPOTENCY_TTL, json.dumps(result))
    return result
```
Rules for idempotency keys:
- Client generates the key (UUID or hash of request)
- Same key + same endpoint = same response
- Keys expire (24h is common)
- Different endpoints can reuse keys, so scope stored keys by endpoint
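These rules can be sketched without any external store. The example below is an in-memory illustration only. `IdempotencyStore`, `IdempotencyConflict`, and the `handler` callback are hypothetical names, and expiry is left out for brevity. It reserves the key before doing any work, scopes it by endpoint, and rejects a key that is reused with a different body.

```python
import asyncio
import hashlib
import json


class IdempotencyConflict(Exception):
    """The same key was reused with a different request body."""


class IdempotencyStore:
    """In-memory sketch; a real service would use a shared store with TTLs."""

    def __init__(self):
        self._entries = {}  # (endpoint, key) -> (fingerprint, future)

    async def run(self, endpoint: str, key: str, body: dict, handler):
        fingerprint = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        entry = self._entries.get((endpoint, key))
        if entry is None:
            # First request: reserve the key before doing any work, so a
            # concurrent retry waits for this result instead of re-running.
            future = asyncio.get_running_loop().create_future()
            self._entries[(endpoint, key)] = (fingerprint, future)
            try:
                future.set_result(await handler(body))
            except Exception as exc:
                # Failed attempts are not cached, so the client may retry
                del self._entries[(endpoint, key)]
                future.set_exception(exc)
                future.exception()  # mark as retrieved to avoid a warning
                raise
            return future.result()
        stored_fingerprint, future = entry
        if stored_fingerprint != fingerprint:
            raise IdempotencyConflict(key)
        return await future
```

There is no `await` between the lookup and the reservation, so within a single event loop the check-and-reserve step cannot interleave with another request. Across processes you would need an atomic operation in the shared store to get the same guarantee.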
Circuit Breakers: Fail Fast
When downstream services fail, don’t let failures cascade.
```python
import pybreaker
from fastapi import HTTPException

# Circuit opens after 5 failures, stays open for 30 seconds
db_breaker = pybreaker.CircuitBreaker(
    fail_max=5,
    reset_timeout=30
)

# Plain `def`: fetch_user_from_db is assumed to be a blocking call, so let
# FastAPI run this handler in its threadpool instead of on the event loop.
@app.get("/api/users/{user_id}")
def get_user(user_id: str):
    try:
        return db_breaker.call(fetch_user_from_db, user_id)
    except pybreaker.CircuitBreakerError:
        # Circuit is open: don't even try
        raise HTTPException(
            status_code=503,
            detail="Service temporarily unavailable",
            headers={"Retry-After": "30"}
        )
```
What to protect:
- Database connections
- Third-party API calls
- Cache lookups (if cache failure shouldn’t block requests)
- Any I/O that can timeout
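To make the closed, open, and half-open states concrete, here is a stripped-down breaker in plain Python. It is an illustrative sketch, not how pybreaker is implemented. The class and exception names are invented for the example, and it ignores thread safety.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected without being attempted."""


class CircuitBreaker:
    def __init__(self, fail_max=5, reset_timeout=30.0, clock=time.monotonic):
        self.fail_max = fail_max
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"  # the next call is let through as a trial
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("circuit is open")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit
            if self.opened_at is not None or self.failures >= self.fail_max:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count
        self.failures = 0
        self.opened_at = None
        return result
```

The point of the open state is that rejected calls cost almost nothing: no connection attempt and no waiting for a timeout. That is what stops one slow dependency from tying up every worker.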
Request Timeouts: Kill Slow Requests
Slow requests tie up resources. Set timeouts aggressively.
```python
import asyncio

from fastapi import HTTPException


async def with_timeout(coro, seconds: float):
    try:
        return await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        raise HTTPException(
            status_code=504,
            detail=f"Request timed out after {seconds}s"
        )


@app.get("/api/search")
async def search(q: str):
    # Don't let searches run forever
    return await with_timeout(
        perform_search(q),
        seconds=5.0
    )
```
Timeout guidelines:
- Simple reads: 1-5 seconds
- Complex queries: 10-30 seconds
- Long-running work: move it to a background job instead of holding the request open
- Total request timeout should be less than client timeout
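One way to keep the total under the client's timeout is to give each request a deadline and cap every downstream call by whatever budget is left. The sketch below assumes a hypothetical `Deadline` helper and an arbitrary policy of giving the server 80% of the client's timeout. Both are illustrative choices.

```python
import asyncio
import time


class Deadline:
    """Tracks how much of a request's total time budget is left."""

    def __init__(self, total_seconds: float):
        self._expires_at = time.monotonic() + total_seconds

    def remaining(self) -> float:
        return max(0.0, self._expires_at - time.monotonic())

    async def run(self, coro, cap: float):
        """Await coro with a timeout of min(cap, remaining budget)."""
        timeout = min(cap, self.remaining())
        if timeout <= 0:
            coro.close()  # never started; close it to avoid a warning
            raise asyncio.TimeoutError("request deadline exceeded")
        return await asyncio.wait_for(coro, timeout=timeout)


async def handle_request(fetch_profile, fetch_orders, client_timeout: float = 10.0):
    # Assumed policy: the server gets 80% of the client's timeout
    deadline = Deadline(client_timeout * 0.8)
    profile = await deadline.run(fetch_profile(), cap=2.0)
    orders = await deadline.run(fetch_orders(), cap=5.0)
    return {"profile": profile, "orders": orders}
```

With per-call caps alone, two calls that each stay under their own limit can still add up to more than the client is willing to wait. The shared deadline closes that gap.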
Error Responses: Helpful Without Leaking
Errors should help legitimate clients debug issues without revealing internals to attackers.
```python
from typing import List, Optional

from fastapi import HTTPException
from pydantic import BaseModel


class ErrorDetail(BaseModel):
    code: str                    # Machine-readable error code
    message: str                 # Human-readable message
    field: Optional[str] = None  # Which field caused the error


class ErrorResponse(BaseModel):
    error: str                   # Top-level error type
    message: str                 # Summary
    details: List[ErrorDetail] = []
    request_id: str              # For support tickets


# Good error response
{
    "error": "validation_error",
    "message": "Invalid request parameters",
    "details": [
        {"code": "invalid_email", "message": "Email format is invalid", "field": "email"},
        {"code": "required", "message": "This field is required", "field": "name"}
    ],
    "request_id": "req_abc123"
}

# Bad error response (leaks internals)
{
    "error": "SQLException: ORA-00942: table or view does not exist",
    "stack_trace": "..."
}
```
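Getting from an arbitrary exception to the good shape usually means one central translation step. Here is a minimal sketch, assuming a hypothetical `ValidationFailure` exception for client-facing errors and a made-up `req_` ID format. Anything unrecognized is logged with full detail on the server and reduced to a generic message for the client.

```python
import logging
import uuid

logger = logging.getLogger("api.errors")


class ValidationFailure(Exception):
    """Hypothetical client-facing error carrying per-field details."""

    def __init__(self, details):
        super().__init__("validation failed")
        self.details = details


def to_error_response(exc: Exception):
    """Return (status_code, body) that is safe to send to the client."""
    request_id = f"req_{uuid.uuid4().hex[:12]}"
    if isinstance(exc, ValidationFailure):
        return 400, {
            "error": "validation_error",
            "message": "Invalid request parameters",
            "details": exc.details,
            "request_id": request_id,
        }
    # Unknown failure: keep the details in the server log, keyed by
    # request_id, and send the client only a generic message.
    logger.error("unhandled error [%s]", request_id, exc_info=exc)
    return 500, {
        "error": "internal_error",
        "message": "An unexpected error occurred",
        "details": [],
        "request_id": request_id,
    }
```

The `request_id` ties the two together: the client quotes it in a support ticket, and you look up the stack trace in your logs.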
Graceful Degradation: Partial Success
When some data is unavailable, return what you can.
```python
import logging

logger = logging.getLogger(__name__)


@app.get("/api/dashboard")
async def get_dashboard():
    results = {}
    errors = []

    # Try to fetch each component independently
    try:
        results["user"] = await fetch_user_data()
    except Exception:
        # Log the real cause server-side; tell the client only "unavailable"
        logger.exception("dashboard: user fetch failed")
        errors.append({"component": "user", "error": "unavailable"})
        results["user"] = None

    try:
        results["notifications"] = await fetch_notifications()
    except Exception:
        logger.exception("dashboard: notifications fetch failed")
        errors.append({"component": "notifications", "error": "unavailable"})
        results["notifications"] = []

    return {
        "data": results,
        "partial": len(errors) > 0,
        "errors": errors
    }
```
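When the components don't depend on each other, they can also be fetched concurrently so that one slow component doesn't delay the rest. A possible sketch using `asyncio.gather` follows. The `load_components` helper and its `fetchers`/`fallbacks` arguments are hypothetical, and the per-component timeout is an assumed default.

```python
import asyncio


async def load_components(fetchers: dict, fallbacks: dict, timeout: float = 2.0):
    """Run all fetchers concurrently; failed ones fall back to a default."""
    names = list(fetchers)
    outcomes = await asyncio.gather(
        *(asyncio.wait_for(fetchers[name](), timeout) for name in names),
        return_exceptions=True,  # collect failures instead of raising
    )
    data, errors = {}, []
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, Exception):
            errors.append({"component": name, "error": "unavailable"})
            data[name] = fallbacks.get(name)
        else:
            data[name] = outcome
    return {"data": data, "partial": bool(errors), "errors": errors}
```

The response shape matches the sequential version, so clients can rely on `partial` and `errors` either way.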
Request Signing: Verify Integrity
For sensitive operations, verify the request hasn’t been tampered with.
```python
import hashlib
import hmac

from fastapi import Header, HTTPException, Request


def verify_signature(payload: bytes, signature: str, secret: str) -> bool:
    expected = hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    # Compare as bytes: compare_digest raises TypeError on non-ASCII str input
    return hmac.compare_digest(signature.encode(), expected.encode())


@app.post("/api/webhooks/payment")
async def payment_webhook(
    request: Request,
    x_signature: str = Header(...)
):
    body = await request.body()
    if not verify_signature(body, x_signature, WEBHOOK_SECRET):
        raise HTTPException(status_code=401, detail="Invalid signature")
    # Process webhook...
```
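A valid signature shows the payload wasn't altered, but a captured request can still be replayed as-is. One common mitigation is to sign a timestamp together with the body and reject anything too old. The sketch below assumes the sender transmits that timestamp alongside the signature. The `timestamp.payload` message format and the 300-second tolerance are illustrative choices, not a standard.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 300  # assumed tolerance for clock drift and delivery delay


def sign(payload: bytes, timestamp: int, secret: bytes) -> str:
    # Sign "timestamp.payload" so the timestamp cannot be altered separately
    message = str(timestamp).encode() + b"." + payload
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(payload: bytes, timestamp: str, signature: str,
           secret: bytes, now=None) -> bool:
    try:
        ts = int(timestamp)
    except ValueError:
        return False
    now = time.time() if now is None else now
    if abs(now - ts) > MAX_SKEW_SECONDS:
        return False  # too old (possible replay) or too far in the future
    expected = sign(payload, ts, secret)
    return hmac.compare_digest(signature.encode(), expected.encode())
```

This only narrows the replay window. To reject replays inside the window too, you would also need to remember recently seen signatures or event IDs.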
The Checklist
Before shipping any API endpoint:
Your API will be abused. Build it assuming the worst, and it’ll handle normal traffic with ease.