Logging

Logging Levels: A Practical Guide to What Goes Where

Logging seems simple until you’re debugging production at 2 AM, scrolling through millions of lines trying to find the one that matters. Good logging practices make that experience less painful. Here’s how to think about log levels. The Levels Most logging frameworks use these standard levels: D E B U G < I N F O < W A R N < E R R O R < F A T A L In production, you typically run at INFO or WARN. Lower levels include all higher levels (INFO includes WARN, ERROR, and FATAL). ...

Structured Logging: Stop Parsing Log Lines

Unstructured logs are technical debt. Structured logs are queryable, parseable, and actually useful when things break. The Problem # 2 2 2 0 0 0 U 2 2 2 n 6 6 6 s - - - t 0 0 0 r 2 2 2 u - - - c 2 2 2 t 8 8 8 u r 1 1 1 e 0 0 0 d : : : : 1 1 1 5 5 5 g : : : o 2 2 2 o 3 4 5 d I E I l N R N u F R F c O O O k R U R p s F e a e a q r r i u s l e i a e s n l d t g i c t c t e o o h m i l p p s o r l g o e g c t e e e d s d s i i n o n r f d 2 r e 3 o r 4 m m 1 s 1 2 9 3 2 4 . 5 1 : 6 8 c . o 1 n . n 1 e c t i o n t i m e o u t Regex hell when you need to extract user, IP, order ID, or duration. ...

Structured Logging: Stop Grepping, Start Querying

Unstructured logs are a trap. They look simple until you need to find something. [ [ [ 2 2 2 0 0 0 2 2 2 6 6 6 - - - 0 0 0 2 2 2 - - - 2 2 2 7 7 7 0 0 0 5 5 5 : : : 3 3 3 0 0 0 : : : 1 1 1 5 6 7 ] ] ] I E W N R A F R R O O N R U H s F i e a g r i h l j e m o d e h m n t o @ o r e y x p a r u m o s p c a l e g e s e . s c d o o e m r t d e l e c o r t g e g 1 d e 2 : d 3 4 8 i 5 7 n : % f c r o o n m n e 1 c 9 t 2 i . o 1 n 6 8 t . i 1 m . e 5 o 0 u t Quick: find all login failures from a specific IP range in the last hour. Now try parsing the order ID from error messages. Hope you enjoy regex. ...

journalctl: Querying Systemd Logs

systemd’s journal collects logs from all services, the kernel, and system messages in one place. journalctl is your tool for searching, filtering, and following those logs. Basic Usage 1 2 3 4 5 6 7 8 9 10 11 # Show all logs (oldest first) journalctl # Show all logs (newest first) journalctl -r # Follow new entries (like tail -f) journalctl -f # Show only errors and above journalctl -p err Filter by Time 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 # Since boot journalctl -b # Previous boot journalctl -b -1 # Since specific time journalctl --since "2024-02-25 10:00:00" # Until specific time journalctl --until "2024-02-25 12:00:00" # Time range journalctl --since "1 hour ago" journalctl --since "2024-02-25" --until "2024-02-26" # Relative times journalctl --since "yesterday" journalctl --since "10 minutes ago" Filter by Unit (Service) 1 2 3 4 5 6 7 8 9 10 11 # Specific service journalctl -u nginx # Multiple services journalctl -u nginx -u php-fpm # Follow specific service journalctl -u nginx -f # Service since boot journalctl -u nginx -b Filter by Priority Priority levels (0=emergency to 7=debug): ...

Log Aggregation Pipelines: From Scattered Files to Searchable Insights

When you have one server, you SSH in and grep the logs. When you have fifty servers, that stops working. Log aggregation is how you make “what happened?” answerable at scale. The Pipeline Architecture Every log aggregation system follows the same basic pattern: ┌ │ └ ─ ─ ─ S ─ ─ o ─ ─ u ─ ─ r ─ │ │ └ ─ c ─ ─ ─ e ─ ─ ─ s ─ ─ ─ ─ ─ ┐ │ ┘ ─ ─ ─ ─ ─ ─ ─ ▶ ▶ ┌ │ └ ┌ │ └ ─ ─ ─ ─ ─ C ─ ─ ─ ─ o ─ ─ Q ─ ─ l ─ ─ u ─ ─ l ─ ─ e ─ ─ e ─ ─ r ─ ─ c ─ ─ y ─ ─ t ─ ─ ─ ─ ─ ─ ─ ┐ │ ┘ ┐ │ ┘ ─ ◀ ─ ─ ─ ─ ▶ ─ ┌ │ └ ─ ─ ─ ─ ─ P ─ ─ ─ r ─ ─ ─ o ─ ─ ─ c ─ ─ ─ e ─ ─ ─ s ─ ─ ─ s ─ ─ ─ ─ ─ ┐ │ ┘ ─ ─ ─ ─ ─ ─ ─ ▶ ─ ┌ │ └ ─ ─ ─ ─ ─ ─ ─ ─ S ─ ─ ─ t ─ ─ ─ o ─ │ ┘ ─ r ─ │ ─ e ─ ─ ─ ─ ─ ┐ │ ┘ Each stage has choices. Let’s walk through them. ...

Structured Logging: Stop Grepping, Start Querying

You’ve seen this log line before: 2 0 2 6 - 0 2 - 2 3 0 5 : 3 0 : 0 0 I N F O U s e r j o h n @ e x a m p l e . c o m l o g g e d i n f r o m 1 9 2 . 1 6 8 . 1 . 1 0 0 a f t e r 2 f a i l e d a t t e m p t s Human readable. Grep-able. And completely useless for answering questions like “how many users had failed login attempts yesterday?” or “what’s the P95 response time for requests from the EU region?” ...

Observability Pipelines: From Logs to Insights

Raw logs are noise. Processed telemetry is intelligence. The difference between them is your observability pipeline. Modern distributed systems generate enormous amounts of data—logs, metrics, traces, events. But data isn’t insight. The challenge isn’t collection; it’s transformation. How do you turn a firehose of JSON lines into something a human (or an AI) can actually act on? The Three Pillars, Unified You’ve heard the “three pillars of observability”: logs, metrics, and traces. What’s often missing from that conversation is how these pillars should connect. ...

Structured Logging: Making Logs Queryable and Actionable

Plain text logs are for humans. Structured logs are for machines. In production, machines need to read your logs before humans do. When your service handles thousands of requests per second, grep stops working. You need logs that can be indexed, queried, aggregated, and alerted on. That means structure. The Problem with Text Logs [ [ [ 2 2 2 0 0 0 2 2 2 6 6 6 - - - 0 0 0 2 2 2 - - - 1 1 1 6 6 6 0 0 0 8 8 8 : : : 3 3 3 0 0 0 : : : 1 1 1 5 6 7 ] ] ] I E W N R A F R R O O N : R : : U H s P i e a g r y h m j e m o n e h t m n o @ f r e a y x i a l u m e s p d a l g e f e . o c r d o e m o t r e l d c o e t g r e g d e 1 : d 2 3 8 i 4 7 n 5 % f - r o i m n s 1 u 9 f 2 f . i 1 c 6 i 8 e . n 1 t . 5 f 0 u n d s Looks readable. But try answering: ...

Structured Logging for Distributed Systems

When your application spans multiple services, containers, and regions, print("something went wrong") doesn’t cut it anymore. Structured logging transforms your logs from walls of text into queryable data. Why Structured Logging? Traditional logs are strings meant for humans: [ 2 0 2 6 - 0 2 - 1 3 1 4 : 0 0 : 0 0 ] E R R O R : F a i l e d t o p r o c e s s o r d e r 1 2 3 4 5 f o r u s e r j o h n @ e x a m p l e . c o m Structured logs are data meant for machines (and humans): ...

Observability: Beyond Monitoring with Metrics, Logs, and Traces

Monitoring tells you when something is wrong. Observability helps you understand why. In distributed systems, you can’t predict every failure mode—you need systems that let you ask arbitrary questions about their behavior. The Three Pillars Metrics: What’s Happening Now Numeric time-series data. Fast to query, cheap to store. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 from prometheus_client import Counter, Histogram, Gauge, start_http_server # Counter - only goes up requests_total = Counter( 'http_requests_total', 'Total HTTP requests', ['method', 'endpoint', 'status'] ) # Histogram - distribution of values request_duration = Histogram( 'http_request_duration_seconds', 'Request duration in seconds', ['method', 'endpoint'], buckets=[.01, .05, .1, .25, .5, 1, 2.5, 5, 10] ) # Gauge - can go up or down active_connections = Gauge( 'active_connections', 'Number of active connections' ) # Usage @app.route("/api/<endpoint>") def handle_request(endpoint): active_connections.inc() with request_duration.labels( method=request.method, endpoint=endpoint ).time(): result = process_request() requests_total.labels( method=request.method, endpoint=endpoint, status=200 ).inc() active_connections.dec() return result # Expose metrics endpoint start_http_server(9090) Logs: What Happened Discrete events with context. Rich detail, expensive at scale. ...