You’ve seen this log line before:
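Something along these lines (an illustrative stand-in; the service name and details are invented):

```
2024-05-01 12:00:03 WARN [auth-service] Login failed for user 123 from 203.0.113.7 (attempt 3)
```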
Human readable. Grep-able. And completely useless for answering questions like “how many users had failed login attempts yesterday?” or “what’s the P95 response time for requests from the EU region?”
Plain text logs are write-only storage. Structured logs are queryable data.
What Structured Logging Looks Like
Same event, structured:
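The same failed-login event as a single JSON object per line might look like this (field names are illustrative, not a standard schema):

```json
{
  "timestamp": "2024-05-01T12:00:03Z",
  "level": "warn",
  "event": "login_failed",
  "service": "auth-service",
  "user_id": 123,
  "source_ip": "203.0.113.7",
  "attempts": 3,
  "region": "eu-west-1"
}
```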
Every field is addressable. You can filter, aggregate, correlate, and alert on any combination.
Implementation: Python
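A minimal sketch using only the standard library. In practice you would more likely reach for structlog or python-json-logger, but the shape is the same: a formatter that renders each record as one JSON object per line, with structured fields passed through `extra`.

```python
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Merge structured fields passed via the `extra` kwarg.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)

logger = logging.getLogger("auth-service")
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.warning(
    "login failed",
    extra={"fields": {"user_id": 123, "source_ip": "203.0.113.7", "attempts": 3}},
)
```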
Implementation: Node.js
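The same idea in Node.js, sketched without dependencies (in production you would typically use pino or winston): a factory that carries base fields and merges per-event fields into one JSON line.

```javascript
// Minimal structured logger: base fields set once, event fields merged per call.
function createLogger(baseFields) {
  const log = (level, message, fields = {}) =>
    console.log(JSON.stringify({
      timestamp: new Date().toISOString(),
      level,
      message,
      ...baseFields,
      ...fields,
    }));
  return {
    info: (msg, f) => log("info", msg, f),
    warn: (msg, f) => log("warn", msg, f),
    error: (msg, f) => log("error", msg, f),
  };
}

const logger = createLogger({ service: "auth-service" });
logger.warn("login failed", { user_id: 123, attempts: 3 });
```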
The Context Hierarchy
Good structured logs have context at multiple levels:
Application context (set once at startup):
- Service name, version, environment
- Host, container ID, region
Request context (set per request):
- Request ID, trace ID
- User ID, session ID
- Endpoint, method
Event context (set per log line):
- What happened
- Relevant metrics
- Error details
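The three levels can be sketched with a toy bound logger (this mirrors the `bind()` pattern that structlog popularized; the class and field names here are illustrative):

```python
import json
from datetime import datetime, timezone

class BoundLogger:
    """Toy logger that carries bound context into every event it emits."""

    def __init__(self, **context):
        self.context = context

    def bind(self, **extra):
        # Return a child logger with additional context attached.
        return BoundLogger(**{**self.context, **extra})

    def info(self, event, **fields):
        line = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "event": event,
            **self.context,
            **fields,
        }
        print(json.dumps(line))
        return line

# Application context: set once at startup.
app_log = BoundLogger(service="auth-service", version="1.4.2", env="prod")
# Request context: bound per request.
req_log = app_log.bind(request_id="req-789", user_id=123)
# Event context: supplied per log line.
req_log.info("login_failed", reason="bad_password", attempts=3)
```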
Log Levels That Mean Something
Use levels consistently across your organization:
- DEBUG: Detailed diagnostic info. Volume too high for production.
- INFO: Normal operations. User actions, business events, state changes.
- WARN: Something unexpected but handled. Degraded service, retries.
- ERROR: Something failed and wasn’t recovered. Requires attention.
- FATAL: Process cannot continue. Immediate intervention needed.
The test: Can your on-call engineer filter to ERROR and know exactly what needs attention?
What NOT to Log
Sensitive data: Passwords, tokens, SSNs, full credit card numbers. Even in structured logs, PII creates compliance nightmares.
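One common defense is to redact known-sensitive keys before fields ever reach the log pipeline. A minimal sketch (the key list is illustrative; real deployments usually redact in a shared logging processor, not at each call site):

```python
SENSITIVE_KEYS = {"password", "token", "ssn", "card_number"}

def redact(fields):
    """Replace sensitive values before they reach the log pipeline."""
    return {
        k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else v
        for k, v in fields.items()
    }

print(redact({"user_id": 123, "password": "hunter2"}))
```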
Massive payloads: Request/response bodies belong in traces, not logs. A 5MB JSON blob in your logs destroys your log aggregator’s performance and your budget.
High-cardinality IDs as field names: Don’t do {user_123: "clicked"}. Do {user_id: 123, action: "clicked"}.
Correlation: The Killer Feature
The real power of structured logs appears when you correlate across services:
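For example, three services might each emit a line carrying the same `request_id` (service names and fields here are illustrative):

```json
{"timestamp": "2024-05-01T12:00:01Z", "service": "api-gateway", "request_id": "req-789", "event": "request_received", "path": "/checkout"}
{"timestamp": "2024-05-01T12:00:01Z", "service": "payment-service", "request_id": "req-789", "event": "charge_attempted", "amount_cents": 4999}
{"timestamp": "2024-05-01T12:00:02Z", "service": "payment-service", "request_id": "req-789", "event": "charge_failed", "error": "card_declined"}
```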
One query — request_id = "req-789" — shows the entire request flow across all services. Without correlation IDs, you’re back to timestamp guessing and grep.
Log Aggregation
Structured logs need somewhere to go:
- ELK Stack (Elasticsearch, Logstash, Kibana): Self-hosted, powerful, complex
- Loki + Grafana: Prometheus-style labels, cost-effective
- Datadog/Splunk/New Relic: Managed, expensive, batteries included
- CloudWatch Logs Insights: If you’re already in AWS
The choice depends on volume, budget, and existing infrastructure. But any of these is infinitely better than SSH + grep.
Migration Path
You don’t have to rewrite everything at once:
1. Add a structured logger alongside existing logs. New code uses the structured logger; old code continues working.
2. Add request-context middleware. Even plain-text logs become more useful with consistent request IDs.
3. Identify high-value log points. Errors, authentication events, payments: structure these first.
4. Gradually convert hot paths. Each conversion improves queryability.
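The request-context step can be sketched with `contextvars`: a hypothetical middleware assigns a request ID before calling the handler, so even untouched plain-text logs can pick it up (function names here are invented for illustration):

```python
import uuid
from contextvars import ContextVar

# Current request's correlation ID; "-" outside any request.
request_id_var: ContextVar[str] = ContextVar("request_id", default="-")

def with_request_id(handler):
    """Hypothetical middleware: assign a request ID, run the handler, clean up."""
    token = request_id_var.set(f"req-{uuid.uuid4().hex[:8]}")
    try:
        return handler()
    finally:
        request_id_var.reset(token)

def legacy_handler():
    # Even a plain-text log line benefits once the ID is in scope.
    print(f"[{request_id_var.get()}] processing payment")
    return request_id_var.get()

rid = with_request_id(legacy_handler)
```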
Structured logging is one of those investments that pays dividends forever. Every new feature automatically becomes queryable. Every incident investigation becomes faster. Every metric you wish you had is probably already in your logs — you just need to be able to ask the question.
Stop writing logs for humans to read sequentially. Start writing logs for machines to query instantly.