Logs are the most underrated debugging tool. Here's how to write logs that actually help when things break.
March 5, 2026 · 7 min · 1336 words · Rob Washington
Everyone logs. Few log well. The difference between “we have logs” and “we can debug with logs” comes down to discipline in what you capture, how you structure it, and where you send it.
```python
# NEVER
logger.debug(f"Connecting with password: {password}")
logger.info(f"API key: {api_key}")

# Instead, log that you're using credentials, not what they are
logger.info("database_connecting", extra={"host": host, "user": user})
```
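As a safety net for the rule above, a logging filter can scrub secrets that slip into `extra` by accident. This is a minimal sketch using the standard library's `logging.Filter`; the `SENSITIVE_KEYS` set, the `RedactingFilter` name, and the `redaction_demo` logger are assumptions made for illustration, not part of any existing setup.

```python
import logging

# Hypothetical list of field names treated as secrets; adjust for your codebase.
SENSITIVE_KEYS = {"password", "api_key", "token", "secret", "authorization"}


class RedactingFilter(logging.Filter):
    """Mask sensitive attributes that were attached to a record via `extra`."""

    def filter(self, record: logging.LogRecord) -> bool:
        for key in list(vars(record)):
            if key.lower() in SENSITIVE_KEYS:
                setattr(record, key, "[REDACTED]")
        return True  # never drop the record, only scrub it


logger = logging.getLogger("redaction_demo")
logger.addFilter(RedactingFilter())
```

Note the limit of this approach: it only catches structured fields. A secret interpolated into the message string, as in the f-string examples above, passes straight through, so the filter complements the discipline rather than replacing it.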
A single user action can touch dozens of services. Without correlation, you’re searching haystacks:
```python
from uuid import uuid4

# Generate at the edge, propagate everywhere
REQUEST_ID_HEADER = "X-Request-ID"

# `context`, `SERVICE_NAME`, and `underlying_logger` come from your application
def get_correlation_id():
    return context.get("request_id") or str(uuid4())

# Every log includes it
class CorrelatedLogger:
    def info(self, msg, **kwargs):
        kwargs["request_id"] = get_correlation_id()
        kwargs["service"] = SERVICE_NAME
        underlying_logger.info(msg, extra=kwargs)
```
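The snippet above leaves open where `context` lives. One possible way to hold a per-request ID is the standard library's `contextvars`, combined with a logging filter that stamps each record. The names `request_id_var`, `start_request`, and `RequestIdFilter` are hypothetical, chosen for this sketch only.

```python
import contextvars
import logging
from uuid import uuid4

# Holds the current request's ID; each thread or asyncio task sees its own value.
request_id_var = contextvars.ContextVar("request_id", default=None)


def start_request(incoming_id=None):
    """Reuse the caller's ID if one arrived in the header, else mint a new one."""
    request_id = incoming_id or str(uuid4())
    request_id_var.set(request_id)
    return request_id


class RequestIdFilter(logging.Filter):
    """Stamp every record with the current request ID."""

    def filter(self, record):
        record.request_id = request_id_var.get() or "unknown"
        return True
```

Calling `start_request(headers.get("X-Request-ID"))` at the start of each request and attaching `RequestIdFilter` to the logger would then tag every record without each call site passing the ID by hand.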
Now tracing a request across services is a single query: filter every service's logs on the same request_id.
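What that query looks like depends on your log store. As a store-agnostic illustration, here is a sketch that filters JSON log lines by `request_id` in plain Python. The `find_request` helper and the sample lines are invented for this example, and it assumes each service writes one JSON object per line.

```python
import json


def find_request(lines, request_id):
    """Return parsed log entries whose request_id matches, in input order."""
    matches = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if isinstance(entry, dict) and entry.get("request_id") == request_id:
            matches.append(entry)
    return matches


# Made-up sample data from three services
sample_lines = [
    '{"service": "gateway", "request_id": "abc-123", "msg": "request_received"}',
    '{"service": "orders", "request_id": "zzz-999", "msg": "order_created"}',
    '{"service": "billing", "request_id": "abc-123", "msg": "charge_started"}',
    "not json at all",
]

trace = find_request(sample_lines, "abc-123")
print([entry["service"] for entry in trace])  # ['gateway', 'billing']
```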
When something fails, log enough context to diagnose it:

```python
try:
    result = process_order(order)
except ValidationError as e:
    logger.error("order_validation_failed", extra={
        "order_id": order.id,
        "user_id": order.user_id,
        "validation_errors": e.errors,
        "order_total": order.total,
        "item_count": len(order.items),
    })
    raise
except ExternalServiceError as e:
    logger.error("order_processing_failed", extra={
        "order_id": order.id,
        "service": e.service_name,
        "error_code": e.code,
        "retry_count": attempt_number,
        "will_retry": attempt_number < MAX_RETRIES,
    }, exc_info=True)  # Include stack trace
    raise
```
The goal: someone reading this log at 3am should understand what happened without reading code.
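One caveat: with Python's standard `logging` module, fields passed through `extra` become attributes on the record, but the default formatter does not print them. A formatter along these lines would emit them as JSON. This is a sketch under that assumption; `JsonFormatter` and `_STANDARD_ATTRS` are names made up here, and a maintained JSON logging library may suit production better.

```python
import json
import logging

# Attribute names every LogRecord has; anything else arrived via `extra`.
_STANDARD_ATTRS = set(
    vars(logging.LogRecord("", 0, "", 0, "", None, None))
) | {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    """Render the event name, level, and all `extra` fields as one JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "event": record.getMessage(),
        }
        for key, value in vars(record).items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        # default=str keeps non-serializable values from raising
        return json.dumps(payload, default=str)
```

Attach it with `handler.setFormatter(JsonFormatter())` so that fields such as `order_id` and `retry_count` actually reach the person reading the log.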
Some events happen millions of times. Log strategically:
```python
import random

def should_sample(event_type: str, rate: float = 0.01) -> bool:
    """Sample 1% of events by default."""
    return random.random() < rate

# Always log errors
if response.status >= 500:
    logger.error("request_failed", extra=context)
# Sample successful requests
elif should_sample("request_success"):
    logger.info("request_completed", extra={**context, "sampled": True})
```
Or sample intelligently:
```python
def smart_sample(context: dict) -> bool:
    # Always log slow requests
    if context["duration_ms"] > 1000:
        return True
    # Always log new users' first requests
    if context.get("is_first_request"):
        return True
    # Sample the rest
    return random.random() < 0.01
```
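Random sampling decides per log line, so a sampled request may show up in one service's logs and be missing from the next. A variation, not taken from the examples above, is to derive the decision from the request ID so that every service keeps or drops the same requests. The `sample_by_request_id` function is a hypothetical sketch of that idea.

```python
import hashlib


def sample_by_request_id(request_id: str, rate: float = 0.01) -> bool:
    """Keep a stable fraction of requests, chosen by hashing the ID."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64)
    bucket = int.from_bytes(digest[:8], "big")
    # Keep the request if its bucket falls in the lowest `rate` share of the range
    return bucket < int(rate * 2**64)
```

Because the result depends only on the ID and the rate, any service that sees the same `request_id` and uses the same rate reaches the same decision without coordination.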