Plain text logs are for humans. Structured logs are for machines. In production, machines need to read your logs before humans do.

When your service handles thousands of requests per second, grep stops working. You need logs that can be indexed, queried, aggregated, and alerted on. That means structure.

The Problem with Text Logs

[2026-02-16 08:30:15] INFO: User john@example.com logged in from 192.168.1.50
[2026-02-16 08:30:16] ERROR: Payment failed for order 12345 - insufficient funds
[2026-02-16 08:30:17] WARN: High memory usage: 87%

Looks readable. But try answering:

  • How many login failures in the last hour?
  • Which users had payment failures this week?
  • What’s the average response time for the /api/orders endpoint?

You can’t—not without regex gymnastics and manual aggregation.
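To see why, here is what counting login failures looks like against plain text. A sketch with hypothetical log lines, assuming a message format like the one above:

```javascript
// Counting login failures in text logs means regex-matching the message
// wording itself -- the query breaks the moment someone rephrases the log.
const lines = [
  '[2026-02-16 08:30:15] INFO: User john@example.com logged in from 192.168.1.50',
  '[2026-02-16 08:31:02] WARN: Login failed for jane@example.com',
  '[2026-02-16 08:31:09] WARN: Login failed for bob@example.com',
];

const failurePattern = /Login failed for (\S+)/;
const failures = lines.filter((line) => failurePattern.test(line));
console.log(failures.length); // 2
```

The regex works until a developer changes "Login failed for" to "Failed login:", at which point the count silently drops to zero.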

Structured Logging

Same events, structured:

{"timestamp": "2026-02-16T08:30:15Z", "level": "info", "event": "user_login", "user_email": "john@example.com", "ip": "192.168.1.50", "success": true}
{"timestamp": "2026-02-16T08:30:16Z", "level": "error", "event": "payment_failed", "order_id": "12345", "reason": "insufficient_funds", "user_id": "user_789"}
{"timestamp": "2026-02-16T08:30:17Z", "level": "warn", "event": "high_memory", "memory_percent": 87, "threshold": 80}

Now you can query:

-- Login failures last hour
SELECT COUNT(*) FROM logs 
WHERE event = 'user_login' AND success = false 
AND timestamp > NOW() - INTERVAL '1 hour'

-- Payment failures by reason
SELECT reason, COUNT(*) FROM logs 
WHERE event = 'payment_failed' 
GROUP BY reason

Implementation

Node.js with Pino

const pino = require('pino');

const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  formatters: {
    level: (label) => ({ level: label }),
  },
  timestamp: pino.stdTimeFunctions.isoTime,
});

// Usage
logger.info({ 
  event: 'user_login',
  userId: user.id,
  email: user.email,
  ip: req.ip,
  userAgent: req.headers['user-agent'],
}, 'User logged in');

logger.error({
  event: 'payment_failed',
  orderId: order.id,
  userId: user.id,
  amount: order.total,
  reason: error.code,
  errorMessage: error.message,
}, 'Payment processing failed');

Python with structlog

import structlog

# Configure once at startup, before the first get_logger() call
structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.format_exc_info,  # renders exc_info=True into a traceback field
        structlog.processors.JSONRenderer(),
    ]
)

logger = structlog.get_logger()

# Usage
logger.info("user_login", 
    user_id=user.id, 
    email=user.email, 
    ip=request.remote_addr
)

logger.error("payment_failed",
    order_id=order.id,
    user_id=user.id,
    reason=str(error),
    exc_info=True
)

Go with zerolog

import "github.com/rs/zerolog/log"

// Usage
log.Info().
    Str("event", "user_login").
    Str("user_id", user.ID).
    Str("email", user.Email).
    Str("ip", r.RemoteAddr).
    Msg("User logged in")

log.Error().
    Str("event", "payment_failed").
    Str("order_id", order.ID).
    Str("reason", err.Error()).
    Msg("Payment processing failed")

Essential Fields

Every log entry should include:

{
  "timestamp": "2026-02-16T08:30:15.123Z",
  "level": "info",
  "service": "payment-api",
  "version": "1.2.3",
  "environment": "production",
  "trace_id": "abc123",
  "span_id": "def456",
  "event": "payment_processed",
  // ... event-specific fields
}

Always include:

  • timestamp — ISO 8601 format with timezone
  • level — Severity (debug, info, warn, error, fatal)
  • service — Which service generated this log
  • environment — prod, staging, dev
  • trace_id — For distributed tracing correlation

Include when available:

  • user_id — Who triggered this action
  • request_id — Correlate logs within a request
  • duration_ms — For performance tracking
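One low-friction way to guarantee these fields on every entry is to stamp them in at a single place. A minimal sketch in plain Node with no logging library (real libraries do the same merge via pino's child loggers or structlog's bind; the service name and fields here are illustrative):

```javascript
// Fields every entry should carry, merged in one place.
const base = {
  service: 'payment-api',     // illustrative service name
  version: '1.2.3',
  environment: 'production',
};

function log(level, event, fields = {}) {
  const entry = {
    timestamp: new Date().toISOString(),
    level,
    ...base,
    event,
    ...fields,           // event-specific fields win on conflict
  };
  console.log(JSON.stringify(entry));
  return entry; // returned only to make the sketch easy to test
}

const entry = log('info', 'payment_processed', { order_id: '12345' });
```

Callers now pass only the event-specific fields; the boring-but-essential columns come along for free.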

Log Levels Done Right

// DEBUG: Detailed diagnostic information
logger.debug({ query, params }, 'Executing database query');

// INFO: Normal operations, business events
logger.info({ orderId, total }, 'Order placed successfully');

// WARN: Unexpected but handled situations
logger.warn({ queueDepth }, 'Message queue depth exceeds threshold');

// ERROR: Failed operations that need attention
logger.error({ orderId, error: err.message }, 'Order processing failed');

// FATAL: Application cannot continue
logger.fatal({ error: err.message }, 'Database connection lost, shutting down');

Rule of thumb:

  • DEBUG: Developers troubleshooting
  • INFO: Operations monitoring normal flow
  • WARN: Something to investigate soon
  • ERROR: Something to investigate now
  • FATAL: Wake someone up

Request Context

Wrap requests to automatically include context:

const { AsyncLocalStorage } = require('async_hooks');
const { randomUUID } = require('crypto');
const pino = require('pino');

const asyncLocalStorage = new AsyncLocalStorage();
const baseLogger = pino();

function requestLogger(req, res, next) {
  const context = {
    requestId: req.headers['x-request-id'] || randomUUID(),
    traceId: req.headers['x-trace-id'],
    userId: req.user?.id,
    path: req.path,
    method: req.method,
  };
  
  asyncLocalStorage.run(context, () => next());
}

// Logger automatically includes context
const logger = {
  info(data, message) {
    const context = asyncLocalStorage.getStore() || {};
    baseLogger.info({ ...context, ...data }, message);
  },
  // ... other levels
};

Now every log in that request automatically includes requestId, userId, etc.

Sensitive Data

Never log:

  • Passwords or tokens
  • Full credit card numbers
  • Social security numbers
  • API keys or secrets
// DON'T
logger.info({ password: user.password }, 'User created');

// DO
logger.info({ userId: user.id, email: user.email }, 'User created');

// Redact sensitive fields automatically
const logger = pino({
  redact: ['password', 'creditCard', 'ssn', 'authorization'],
});
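If your logger has no built-in redact option, a hand-rolled pass over the log object before serializing works too. A sketch (field names illustrative):

```javascript
// Recursively replace sensitive values before a log object is serialized.
const SENSITIVE = new Set(['password', 'creditCard', 'ssn', 'authorization']);

function redact(obj) {
  if (obj === null || typeof obj !== 'object') return obj;
  const out = Array.isArray(obj) ? [] : {};
  for (const [key, value] of Object.entries(obj)) {
    out[key] = SENSITIVE.has(key) ? '[REDACTED]' : redact(value);
  }
  return out;
}

const safe = redact({ userId: 'u1', password: 'hunter2', meta: { ssn: '123-45-6789' } });
console.log(JSON.stringify(safe));
// {"userId":"u1","password":"[REDACTED]","meta":{"ssn":"[REDACTED]"}}
```

Redacting by field name catches the common cases; it won't catch secrets embedded inside free-text messages, which is one more reason to keep data in fields rather than in message strings.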

Log Aggregation

Structured logs shine with aggregation tools:

ELK Stack (Elasticsearch, Logstash, Kibana):

App → stdout → Filebeat → Logstash → Elasticsearch → Kibana

Loki + Grafana:

App → stdout → Promtail → Loki → Grafana

Cloud-native:

  • AWS CloudWatch Logs Insights
  • GCP Cloud Logging
  • Datadog, Splunk, etc.

Querying Examples

With logs in Elasticsearch:

// Errors in the last hour
GET /logs/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "level": "error" } },
        { "range": { "timestamp": { "gte": "now-1h" } } }
      ]
    }
  }
}

// Slow requests (>1s) by endpoint
GET /logs/_search
{
  "query": { "range": { "duration_ms": { "gte": 1000 } } },
  "aggs": {
    "by_path": {
      "terms": { "field": "path.keyword" }
    }
  }
}

Alerting on Logs

Set up alerts for patterns:

# Alert: Error rate spike
- alert: HighErrorRate
  expr: |
    sum(rate(log_entries_total{level="error"}[5m])) 
    / sum(rate(log_entries_total[5m])) > 0.05
  for: 5m
  annotations:
    summary: "Error rate above 5%"

Common Mistakes

Logging too much:

// DON'T: Log every iteration
for (const item of items) {
  logger.debug({ item }, 'Processing item');  // 10,000 logs
}

// DO: Log aggregates
logger.info({ count: items.length }, 'Processing batch');

Logging too little:

// DON'T: Swallow errors silently
try { riskyOperation(); } catch (e) { /* nothing */ }

// DO: Log with context
try { 
  riskyOperation(); 
} catch (e) { 
  logger.error({ error: e.message, stack: e.stack }, 'Operation failed');
  throw e;
}

Inconsistent field names:

// DON'T: Different names for same concept
logger.info({ user_id: '123' });
logger.info({ userId: '123' });
logger.info({ uid: '123' });

// DO: Standardize
logger.info({ userId: '123' });

The Mental Model

Think of structured logs as a time-series database of events:

  • Each log is a row
  • Each field is a column
  • You can query, aggregate, and alert

Plain text logs are like writing notes in a notebook. Structured logs are like entering data in a spreadsheet. The notebook is fine for personal use; the spreadsheet is what you need when you have to answer questions at scale.
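The spreadsheet analogy is nearly literal: with newline-delimited JSON, a filter plus a reduce is your GROUP BY. A sketch with hypothetical log lines:

```javascript
// Each NDJSON line is a row, each field a column;
// filter = WHERE, reduce = GROUP BY + COUNT.
const ndjson = [
  '{"event":"payment_failed","reason":"insufficient_funds"}',
  '{"event":"payment_failed","reason":"card_expired"}',
  '{"event":"payment_failed","reason":"insufficient_funds"}',
  '{"event":"user_login","success":true}',
].join('\n');

const byReason = ndjson
  .split('\n')
  .map((line) => JSON.parse(line))
  .filter((row) => row.event === 'payment_failed')
  .reduce((acc, row) => {
    acc[row.reason] = (acc[row.reason] || 0) + 1;
    return acc;
  }, {});

console.log(byReason); // { insufficient_funds: 2, card_expired: 1 }
```

Aggregation tools do exactly this, just indexed and at scale.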

Your future self debugging a production incident at 3 AM will thank you for the structure.