grep and find: Search Patterns for the Command Line

Two commands solve 90% of search problems on Unix systems: grep for text patterns and find for file locations. Master these and you’ll navigate any codebase.

## grep Basics

```bash
# Search for a pattern in a file
grep "error" logfile.txt

# Case-insensitive
grep -i "error" logfile.txt

# Show line numbers
grep -n "error" logfile.txt

# Count matches
grep -c "error" logfile.txt

# Invert match (lines NOT matching)
grep -v "debug" logfile.txt
```

## grep in Multiple Files

```bash
# Search all files in a directory
grep "TODO" *.py

# Recursive search
grep -r "TODO" src/

# Show only filenames
grep -l "TODO" *.py

# Show filenames with no match
grep -L "TODO" *.py
```

## grep with Context

```bash
# 3 lines before match
grep -B 3 "error" logfile.txt

# 3 lines after match
grep -A 3 "error" logfile.txt

# 3 lines before and after
grep -C 3 "error" logfile.txt
```

## grep Regular Expressions

```bash
# Basic regex (default)
grep "error.*failed" logfile.txt

# Extended regex
grep -E "error|warning|critical" logfile.txt

# Or use egrep
egrep "error|warning" logfile.txt

# Perl regex (most powerful)
grep -P "\d{4}-\d{2}-\d{2}" logfile.txt
```

## Common Patterns

```bash
# IP addresses
grep -E "\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\b" access.log

# Email addresses
grep -E "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}" users.txt

# URLs
grep -E "https?://[^\s]+" document.txt

# Timestamps
grep -P "\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}" app.log

# Word boundaries
grep -w "error" logfile.txt  # Won't match "errors" or "error_code"
```

## grep with Exclusions

```bash
# Exclude directories
grep -r "TODO" --exclude-dir=node_modules --exclude-dir=.git .

# Exclude file patterns
grep -r "TODO" --exclude="*.min.js" --exclude="*.map" .

# Include only certain files
grep -r "TODO" --include="*.py" --include="*.js" .
```

## find Basics

```bash
# Find by name
find . -name "*.py"

# Case-insensitive name
find . -iname "readme*"

# Find directories
find . -type d -name "test*"

# Find files
find . -type f -name "*.log"

# Find symbolic links
find . -type l
```

## find by Time

```bash
# Modified in last 7 days
find . -mtime -7

# Modified more than 30 days ago
find . -mtime +30

# Modified in last 60 minutes
find . -mmin -60

# Accessed in last day
find . -atime -1

# Changed (metadata) in last day
find . -ctime -1
```

## find by Size

```bash
# Files larger than 100MB
find . -size +100M

# Files smaller than 1KB
find . -size -1k

# Empty files (exactly 0 bytes)
find . -size 0

# Size units: c (bytes), k (KB), M (MB), G (GB)
find . -size +1G
```

## find by Permissions

```bash
# Executable files
find . -perm /u+x -type f

# World-writable files
find . -perm -002

# SUID files
find . -perm -4000

# Files owned by a user
find . -user root

# Files owned by a group
find . -group www-data
```

## find with Actions

```bash
# Print (the default action)
find . -name "*.log" -print

# Delete (careful!)
find . -name "*.tmp" -delete

# Execute a command for each file
find . -name "*.py" -exec wc -l {} \;

# Execute a command with all files at once
find . -name "*.py" -exec wc -l {} +

# Interactive delete
find . -name "*.bak" -ok rm {} \;
```

## find Logical Operators

```bash
# AND (implicit)
find . -name "*.py" -size +100k

# OR
find . -name "*.py" -o -name "*.js"

# NOT
find . ! -name "*.pyc"

# Grouping
find . \( -name "*.py" -o -name "*.js" \) -size +10k
```

## Combining grep and find

```bash
# Find files and grep in them
find . -name "*.py" -exec grep -l "import os" {} \;

# More efficient with xargs
find . -name "*.py" | xargs grep -l "import os"

# Handle spaces in filenames
find . -name "*.py" -print0 | xargs -0 grep -l "import os"

# Find recently modified files with a pattern
find . -name "*.log" -mtime -1 -exec grep "ERROR" {} +
```

## Practical Examples

### Find Large Files

```bash
# Top 10 largest files
find . -type f -exec du -h {} + | sort -rh | head -10

# Files over 100MB, sorted
find . -type f -size +100M -exec ls -lh {} \; | sort -k5 -h
```

### Find and Clean

```bash
# Remove old log files
find /var/log -name "*.log" -mtime +30 -delete

# Remove empty directories
find . -type d -empty -delete

# Remove Python cache
find . -type d -name "__pycache__" -exec rm -rf {} +
find . -name "*.pyc" -delete
```

### Search Code

```bash
# Find function definitions
grep -rn "def " --include="*.py" src/

# Find TODO comments
grep -rn "TODO\|FIXME\|XXX" --include="*.py" .

# Find imports
grep -r "^import\|^from" --include="*.py" src/ | sort -u
```

### Search Logs

```bash
# Errors in the last hour
find /var/log -name "*.log" -mmin -60 -exec grep -l "ERROR" {} \;

# Count errors per file
find /var/log -name "*.log" -exec sh -c 'echo "$1: $(grep -c ERROR "$1")"' _ {} \;

# Unique IPs from an access log
grep -oE "\b[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\b" access.log | sort -u
```

### Find Duplicates

```bash
# Find files with the same name
find . -type f -name "*.py" | xargs -I{} basename {} | sort | uniq -d

# Find by content hash (requires md5sum)
find . -type f -exec md5sum {} \; | sort | uniq -w32 -d
```

## Performance Tips

```bash
# Stop at first match (faster)
grep -m 1 "pattern" largefile.txt

# Use fixed strings when possible (faster than regex)
grep -F "exact string" file.txt

# Limit find depth
find . -maxdepth 2 -name "*.py"

# Prune directories
find . -path ./node_modules -prune -o -name "*.js" -print
```

## ripgrep (Modern Alternative)

If available, rg is faster than grep: ...

February 24, 2026 · 6 min · 1235 words · Rob Washington

curl for API Testing: The Essential Commands

Before Postman, before Insomnia, there was curl. It’s still the fastest way to test an API, and it’s available on every server you’ll ever SSH into.

## Basic Requests

```bash
# GET request
curl https://api.example.com/users

# Show response headers with the body
curl -i https://api.example.com/users

# Headers only
curl -I https://api.example.com/users

# Silent (no progress bar)
curl -s https://api.example.com/users

# Follow redirects
curl -L https://api.example.com/old-endpoint
```

## HTTP Methods

```bash
# POST
curl -X POST https://api.example.com/users

# PUT
curl -X PUT https://api.example.com/users/123

# PATCH
curl -X PATCH https://api.example.com/users/123

# DELETE
curl -X DELETE https://api.example.com/users/123
```

## Sending Data

### JSON Body

```bash
curl -X POST https://api.example.com/users \
  -H "Content-Type: application/json" \
  -d '{"name": "Alice", "email": "alice@example.com"}'

# From a file
curl -X POST https://api.example.com/users \
  -H "Content-Type: application/json" \
  -d @user.json

# Pretty-print JSON with jq
curl -s https://api.example.com/users | jq .
```

### Form Data

```bash
# URL-encoded form
curl -X POST https://api.example.com/login \
  -d "username=alice&password=secret"

# Multipart form (file upload)
curl -X POST https://api.example.com/upload \
  -F "file=@photo.jpg" \
  -F "description=My photo"
```

### Query Parameters

```bash
# In the URL
curl "https://api.example.com/search?q=hello&limit=10"

# With --data-urlencode (safer)
curl -G https://api.example.com/search \
  --data-urlencode "q=hello world" \
  --data-urlencode "limit=10"
```

## Headers

```bash
# Single header
curl -H "Authorization: Bearer token123" https://api.example.com/me

# Multiple headers
curl -H "Authorization: Bearer token123" \
  -H "Accept: application/json" \
  -H "X-Request-ID: abc123" \
  https://api.example.com/me

# Common patterns
curl -H "Content-Type: application/json" ...
curl -H "Accept: application/json" ...
curl -H "Authorization: Basic $(echo -n 'user:pass' | base64)" ...
```

## Authentication

### Bearer Token

```bash
curl -H "Authorization: Bearer eyJhbGc..." https://api.example.com/me
```

### Basic Auth

```bash
curl -u username:password https://api.example.com/secure

# Or manually
curl -H "Authorization: Basic $(echo -n 'user:pass' | base64)" https://api.example.com/secure
```

### API Key

```bash
# In a header
curl -H "X-API-Key: your-api-key" https://api.example.com/data

# In the query string
curl "https://api.example.com/data?api_key=your-api-key"
```

## Response Handling

### Save to File

```bash
# Save the response body
curl -o response.json https://api.example.com/data

# Save with the remote filename
curl -O https://example.com/file.zip

# Save headers and body separately
curl -D headers.txt -o body.json https://api.example.com/data
```

### Extract Specific Info

```bash
# HTTP status code only
curl -s -o /dev/null -w "%{http_code}" https://api.example.com/health

# Response time
curl -s -o /dev/null -w "%{time_total}" https://api.example.com/health

# Multiple metrics
curl -s -o /dev/null -w "Status: %{http_code}\nTime: %{time_total}s\nSize: %{size_download} bytes\n" \
  https://api.example.com/data
```

### Write-Out Variables

```bash
curl -w "
HTTP Code:      %{http_code}
Total Time:     %{time_total}s
DNS Lookup:     %{time_namelookup}s
Connect:        %{time_connect}s
TTFB:           %{time_starttransfer}s
Download Size:  %{size_download} bytes
Download Speed: %{speed_download} bytes/sec
" -o /dev/null -s https://api.example.com/data
```

## Debugging

### Verbose Output

```bash
# Show request and response headers
curl -v https://api.example.com/data

# Even more verbose (includes TLS handshake)
curl -vv https://api.example.com/data

# Trace everything (hex dump)
curl --trace - https://api.example.com/data
```

### Debug TLS/SSL

```bash
# Show certificate info
curl -vI https://api.example.com 2>&1 | grep -A 10 "Server certificate"

# Skip certificate verification (don't use in production!)
curl -k https://self-signed.example.com/data

# Use a specific CA cert
curl --cacert /path/to/ca.crt https://api.example.com/data

# Client certificate auth
curl --cert client.crt --key client.key https://api.example.com/data
```

## Timeouts and Retries

```bash
# Connection timeout (seconds)
curl --connect-timeout 5 https://api.example.com/data

# Max time for the entire operation
curl --max-time 30 https://api.example.com/data

# Retry on failure
curl --retry 3 --retry-delay 2 https://api.example.com/data

# Also retry when the connection is refused
curl --retry 3 --retry-connrefused https://api.example.com/data
```

## Cookies

```bash
# Send a cookie
curl -b "session=abc123" https://api.example.com/dashboard

# Save cookies to a file
curl -c cookies.txt https://api.example.com/login -d "user=alice&pass=secret"

# Use saved cookies
curl -b cookies.txt https://api.example.com/dashboard

# Both send and save
curl -b cookies.txt -c cookies.txt https://api.example.com/page
```

## Proxies

```bash
# HTTP proxy
curl -x http://proxy.example.com:8080 https://api.example.com/data

# SOCKS5 proxy
curl --socks5 localhost:1080 https://api.example.com/data

# Proxy with auth
curl -x http://user:pass@proxy.example.com:8080 https://api.example.com/data
```

## Practical Examples

### Health Check Script

```bash
#!/bin/bash
URL="https://api.example.com/health"
STATUS=$(curl -s -o /dev/null -w "%{http_code}" --max-time 5 "$URL")

if [ "$STATUS" = "200" ]; then
  echo "OK"
  exit 0
else
  echo "FAIL: HTTP $STATUS"
  exit 1
fi
```

### API Test with Error Handling

```bash
#!/bin/bash
response=$(curl -s -w "\n%{http_code}" \
  -X POST https://api.example.com/users \
  -H "Content-Type: application/json" \
  -d '{"name": "Test User"}')

body=$(echo "$response" | head -n -1)
status=$(echo "$response" | tail -n 1)

if [ "$status" = "201" ]; then
  echo "Created user: $(echo "$body" | jq -r '.id')"
else
  echo "Error $status: $(echo "$body" | jq -r '.error')"
  exit 1
fi
```

### Pagination Loop

```bash
#!/bin/bash
page=1
while true; do
  response=$(curl -s "https://api.example.com/items?page=$page&per_page=100")
  count=$(echo "$response" | jq '.items | length')
  if [ "$count" = "0" ]; then
    break
  fi
  echo "$response" | jq -r '.items[].name'
  ((page++))
done
```

### OAuth Token Flow

```bash
# Get a token
TOKEN=$(curl -s -X POST https://auth.example.com/oauth/token \
  -d "grant_type=client_credentials" \
  -d "client_id=$CLIENT_ID" \
  -d "client_secret=$CLIENT_SECRET" \
  | jq -r '.access_token')

# Use the token
curl -H "Authorization: Bearer $TOKEN" https://api.example.com/data
```

## Config Files

Save common options in ~/.curlrc: ...

February 24, 2026 · 6 min · 1094 words · Rob Washington

Python Virtual Environments: A Practical Guide

Every Python project should have its own virtual environment. It’s not optional — it’s how you avoid dependency hell, reproducibility issues, and the dreaded “but it works on my machine.”

## Why Virtual Environments?

Without virtual environments:

- Project A needs requests==2.25
- Project B needs requests==2.31
- Both use the system Python
- One project breaks

With virtual environments:

- Each project has isolated dependencies
- Different Python versions per project
- Reproducible across machines
- No sudo required for installing packages

## The Built-in Way: venv

Python 3.3+ includes venv: ...
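The core venv workflow can be sketched as follows (the `.venv` directory name is a common convention, not a requirement; the `requests` install shown in the comment is just an illustrative package):

```shell
# Create an isolated environment in .venv
python3 -m venv .venv

# Activate it (bash/zsh; on Windows it is .venv\Scripts\activate)
. .venv/bin/activate

# pip and python now resolve inside .venv, so installs no longer
# touch the system Python. For example: pip install requests
python -c 'import sys; print(sys.prefix)'

# Record exact versions for reproducibility
pip freeze > requirements.txt

# Leave the environment
deactivate
```

Running it prints the environment's own prefix path, confirming that `python` no longer points at the system interpreter while the environment is active.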

February 24, 2026 · 8 min · 1501 words · Rob Washington

jq: Command-Line JSON Processing

If you work with APIs, logs, or config files, you work with JSON. And jq is how you make that not painful. It’s like sed and awk had a baby that speaks JSON fluently.

## The Basics

```bash
# Pretty-print JSON
curl -s https://api.example.com/data | jq .

# Get a field
echo '{"name": "Alice", "age": 30}' | jq '.name'
# "Alice"

# Get a raw string (no quotes)
echo '{"name": "Alice"}' | jq -r '.name'
# Alice
```

## Navigating Objects

```bash
# Nested fields
echo '{"user": {"name": "Alice", "email": "alice@example.com"}}' | jq '.user.name'
# "Alice"

# Multiple fields
echo '{"name": "Alice", "age": 30, "city": "NYC"}' | jq '{name, city}'
# {"name": "Alice", "city": "NYC"}

# Rename fields
echo '{"firstName": "Alice"}' | jq '{name: .firstName}'
# {"name": "Alice"}
```

## Working with Arrays

```bash
# First element
echo '[1, 2, 3]' | jq '.[0]'
# 1

# Last element
echo '[1, 2, 3]' | jq '.[-1]'
# 3

# Slice
echo '[1, 2, 3, 4, 5]' | jq '.[1:3]'
# [2, 3]

# All elements
echo '[{"name": "Alice"}, {"name": "Bob"}]' | jq '.[].name'
# "Alice"
# "Bob"

# Wrap results in an array
echo '[{"name": "Alice"}, {"name": "Bob"}]' | jq '[.[].name]'
# ["Alice", "Bob"]
```

## Filtering

```bash
# Select where a condition is true
echo '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]' | \
  jq '.[] | select(.age > 26)'
# {"name": "Alice", "age": 30}

# Multiple conditions
echo '[{"name": "Alice", "active": true}, {"name": "Bob", "active": false}]' | \
  jq '.[] | select(.active == true and .name != "Admin")'

# Contains
echo '[{"tags": ["dev", "prod"]}, {"tags": ["staging"]}]' | \
  jq '.[] | select(.tags | contains(["prod"]))'

# String matching
echo '[{"name": "Alice"}, {"name": "Bob"}, {"name": "Alicia"}]' | \
  jq '.[] | select(.name | startswith("Ali"))'
```

## Transforming Data

```bash
# Map over an array
echo '[1, 2, 3]' | jq 'map(. * 2)'
# [2, 4, 6]

# Map with objects
echo '[{"name": "Alice", "score": 85}, {"name": "Bob", "score": 92}]' | \
  jq 'map({name, passed: .score >= 90})'
# [{"name": "Alice", "passed": false}, {"name": "Bob", "passed": true}]

# Add fields
echo '{"name": "Alice"}' | jq '. + {role: "admin"}'
# {"name": "Alice", "role": "admin"}

# Update fields
echo '{"name": "alice"}' | jq '.name |= ascii_upcase'
# {"name": "ALICE"}

# Delete fields
echo '{"name": "Alice", "password": "secret"}' | jq 'del(.password)'
# {"name": "Alice"}
```

## Aggregation

```bash
# Length
echo '[1, 2, 3, 4, 5]' | jq 'length'
# 5

# Sum
echo '[1, 2, 3, 4, 5]' | jq 'add'
# 15

# Min/Max
echo '[3, 1, 4, 1, 5]' | jq 'min, max'
# 1
# 5

# Unique
echo '[1, 2, 2, 3, 3, 3]' | jq 'unique'
# [1, 2, 3]

# Group by
echo '[{"type": "a", "val": 1}, {"type": "b", "val": 2}, {"type": "a", "val": 3}]' | \
  jq 'group_by(.type) | map({type: .[0].type, total: map(.val) | add})'
# [{"type": "a", "total": 4}, {"type": "b", "total": 2}]

# Sort
echo '[{"name": "Bob"}, {"name": "Alice"}]' | jq 'sort_by(.name)'
# [{"name": "Alice"}, {"name": "Bob"}]
```

## String Operations

```bash
# Split
echo '{"path": "/usr/local/bin"}' | jq '.path | split("/")'
# ["", "usr", "local", "bin"]

# Join
echo '["a", "b", "c"]' | jq 'join("-")'
# "a-b-c"

# Replace
echo '{"msg": "Hello World"}' | jq '.msg | gsub("World"; "jq")'
# "Hello jq"

# String interpolation
echo '{"name": "Alice", "age": 30}' | jq '"Name: \(.name), Age: \(.age)"'
# "Name: Alice, Age: 30"
```

## Working with Keys

```bash
# Get keys
echo '{"a": 1, "b": 2, "c": 3}' | jq 'keys'
# ["a", "b", "c"]

# Get an object's values as an array
# (there is no direct "values of an object" filter; use [.[]])
echo '{"a": 1, "b": 2, "c": 3}' | jq '[.[]]'
# [1, 2, 3]

# Convert an object to an array of key-value pairs
echo '{"a": 1, "b": 2}' | jq 'to_entries'
# [{"key": "a", "value": 1}, {"key": "b", "value": 2}]

# Convert back
echo '[{"key": "a", "value": 1}]' | jq 'from_entries'
# {"a": 1}

# Transform keys
echo '{"firstName": "Alice", "lastName": "Smith"}' | \
  jq 'with_entries(.key |= ascii_downcase)'
# {"firstname": "Alice", "lastname": "Smith"}
```

## Conditionals

```bash
# If-then-else
echo '{"status": 200}' | jq 'if .status == 200 then "OK" else "Error" end'
# "OK"

# Alternative operator (default value)
echo '{"name": "Alice"}' | jq '.age // 0'
# 0
echo '{"name": "Alice", "age": 30}' | jq '.age // 0'
# 30

# Try (ignore errors)
echo '{"data": "not json"}' | jq '.data | try fromjson'
# (no output; the parse error is suppressed)
```

## Real-World Examples

### Parse an API Response

```bash
# Extract users from a paginated API
curl -s 'https://api.example.com/users' | \
  jq '.data[] | {id, name: .attributes.name, email: .attributes.email}'

# Get specific fields as TSV
curl -s 'https://api.example.com/users' | \
  jq -r '.data[] | [.id, .attributes.name] | @tsv'
```

### Process Log Files

```bash
# Parse JSON logs, filter errors
cat app.log | jq -c 'select(.level == "error")'

# Count by status code
cat access.log | jq -s 'group_by(.status) | map({status: .[0].status, count: length})'

# Get unique IPs
cat access.log | jq -s '[.[].ip] | unique'
```

### Transform Config Files

```bash
# Merge configs (deep merge; later files win)
jq -s '.[0] * .[1]' base.json override.json

# Update a nested value
jq '.database.host = "newhost.example.com"' config.json

# Append to an array
jq '.allowed_ips += ["10.0.0.5"]' config.json
```

### AWS CLI Output

```bash
# List EC2 instance IDs and states
aws ec2 describe-instances | \
  jq -r '.Reservations[].Instances[] | [.InstanceId, .State.Name] | @tsv'

# Get running instances
aws ec2 describe-instances | \
  jq '.Reservations[].Instances[] | select(.State.Name == "running") | .InstanceId'

# Format as a table
aws ec2 describe-instances | \
  jq -r '["ID", "Type", "State"], (.Reservations[].Instances[] | [.InstanceId, .InstanceType, .State.Name]) | @tsv' | \
  column -t
```

### Kubernetes

```bash
# Get pod names and statuses
kubectl get pods -o json | \
  jq -r '.items[] | [.metadata.name, .status.phase] | @tsv'

# Find pods not running
kubectl get pods -o json | \
  jq '.items[] | select(.status.phase != "Running") | .metadata.name'

# Get container images
kubectl get pods -o json | \
  jq -r '[.items[].spec.containers[].image] | unique | .[]'
```

## Output Formats

```bash
# Compact (one line)
echo '{"a": 1}' | jq -c .
# {"a":1}

# Tab-separated
echo '[{"a": 1, "b": 2}]' | jq -r '.[] | [.a, .b] | @tsv'
# 1	2

# CSV
echo '[{"a": 1, "b": 2}]' | jq -r '.[] | [.a, .b] | @csv'
# 1,2

# URI encode
echo '{"q": "hello world"}' | jq -r '.q | @uri'
# hello%20world

# Base64
echo '{"data": "hello"}' | jq -r '.data | @base64'
# aGVsbG8=
```

## Quick Reference

| Pattern | Description |
|---------|-------------|
| `.field` | Get field |
| `.[]` | Iterate array |
| `.[0]` | First element |
| `.[-1]` | Last element |
| `select(cond)` | Filter |
| `map(expr)` | Transform array |
| `. + {}` | Add fields |
| `del(.field)` | Remove field |
| `// default` | Default value |
| `@tsv`, `@csv` | Output format |
| `-r` | Raw output |
| `-c` | Compact output |
| `-s` | Slurp (read all input as one array) |

jq has a learning curve, but it pays off quickly. Once you internalize the patterns, you’ll wonder how you ever worked with JSON without it. Start with .field, .[].field, and select() — those three cover 80% of use cases. ...

February 24, 2026 · 7 min · 1338 words · Rob Washington

Makefile Patterns: Task Running That Works Everywhere

Make has been around since 1976. It’s installed on virtually every Unix system. And while it was designed for compiling C programs, it’s become a universal task runner for any project. No npm, no pip, no cargo — just make.

## Why Make?

- Zero dependencies — already on your system
- Declarative — describe what you want, not how to get it
- Incremental — only runs what’s needed
- Self-documenting — `make help` shows available targets
- Universal — works the same on Linux, macOS, CI systems

## Basic Structure

```makefile
# Variables
APP_NAME := myapp
VERSION := 1.0.0

# Default target (runs when you just type 'make')
.DEFAULT_GOAL := help

# Phony targets don't create files
.PHONY: build test clean help

build:
	go build -o $(APP_NAME) ./cmd/$(APP_NAME)

test:
	go test ./...

clean:
	rm -f $(APP_NAME)

help:
	@echo "Available targets:"
	@echo "  build  - Build the application"
	@echo "  test   - Run tests"
	@echo "  clean  - Remove build artifacts"
```

## Self-Documenting Makefiles

The best pattern: automatic help generation from comments. ...
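One widely used form of that pattern (sketched here under the usual conventions; the `##` marker, target names, and column width are all choices, not requirements, and recipe lines must be indented with real tabs) tags each target with a `##` comment and has `help` scrape them out with grep and awk:

```makefile
.DEFAULT_GOAL := help
.PHONY: help build test

help: ## Show this help
	@grep -E '^[a-zA-Z_-]+:.*?## ' $(MAKEFILE_LIST) | \
		awk 'BEGIN {FS = ":.*?## "} {printf "  %-12s %s\n", $$1, $$2}'

build: ## Build the application
	go build -o myapp ./cmd/myapp

test: ## Run the test suite
	go test ./...
```

Typing `make` (or `make help`) then lists each tagged target alongside its description, so the help text can never drift out of sync with the targets themselves.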

February 24, 2026 · 8 min · 1544 words · Rob Washington