grep and find: Search Patterns for the Command Line

Two commands solve 90% of search problems on Unix systems: grep for text patterns and find for file locations. Master these and you'll navigate any codebase.

grep Basics

```bash
# Search for pattern in file
grep "error" logfile.txt

# Case insensitive
grep -i "error" logfile.txt

# Show line numbers
grep -n "error" logfile.txt

# Count matches
grep -c "error" logfile.txt

# Invert match (lines NOT matching)
grep -v "debug" logfile.txt
```

grep in Multiple Files

```bash
# Search all files in directory
grep "TODO" *.py

# Recursive search
grep -r "TODO" src/

# Show only filenames
grep -l "TODO" *.py

# Show filenames with no match
grep -L "TODO" *.py
```

grep with Context

```bash
# 3 lines before match
grep -B 3 "error" logfile.txt

# 3 lines after match
grep -A 3 "error" logfile.txt

# 3 lines before and after
grep -C 3 "error" logfile.txt
```

grep Regular Expressions

```bash
# Basic regex (default)
grep "error.*failed" logfile.txt

# Extended regex
grep -E "error|warning|critical" logfile.txt

# Or use egrep
egrep "error|warning" logfile.txt

# Perl regex (most powerful)
grep -P "\d{4}-\d{2}-\d{2}" logfile.txt
```

Common Patterns

```bash
# IP addresses
grep -E "\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\b" access.log

# Email addresses
grep -E "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}" users.txt

# URLs
grep -E "https?://[^\s]+" document.txt

# Timestamps
grep -P "\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}" app.log

# Word boundaries
grep -w "error" logfile.txt  # Won't match "errors" or "error_code"
```

grep with Exclusions

```bash
# Exclude directories
grep -r "TODO" --exclude-dir=node_modules --exclude-dir=.git .

# Exclude file patterns
grep -r "TODO" --exclude="*.min.js" --exclude="*.map" .

# Include only certain files
grep -r "TODO" --include="*.py" --include="*.js" .
```

find Basics

```bash
# Find by name
find . -name "*.py"

# Case insensitive name
find . -iname "readme*"

# Find directories
find . -type d -name "test*"

# Find files
find . -type f -name "*.log"

# Find links
find . -type l
```

find by Time

```bash
# Modified in last 7 days
find . -mtime -7

# Modified more than 30 days ago
find . -mtime +30

# Modified in last 60 minutes
find . -mmin -60

# Accessed in last day
find . -atime -1

# Changed (metadata) in last day
find . -ctime -1
```

find by Size

```bash
# Files larger than 100MB
find . -size +100M

# Files smaller than 1KB
find . -size -1k

# Files exactly 0 bytes (empty)
find . -size 0

# Size units: c (bytes), k (KB), M (MB), G (GB)
find . -size +1G
```

find by Permissions

```bash
# Executable files
find . -perm /u+x -type f

# World-writable files
find . -perm -002

# SUID files
find . -perm -4000

# Files owned by user
find . -user root

# Files owned by group
find . -group www-data
```

find with Actions

```bash
# Print (default)
find . -name "*.log" -print

# Delete (careful!)
find . -name "*.tmp" -delete

# Execute command for each file
find . -name "*.py" -exec wc -l {} \;

# Execute command with all files at once
find . -name "*.py" -exec wc -l {} +

# Interactive delete
find . -name "*.bak" -ok rm {} \;
```

find Logical Operators

```bash
# AND (implicit)
find . -name "*.py" -size +100k

# OR
find . -name "*.py" -o -name "*.js"

# NOT
find . ! -name "*.pyc"

# Grouping
find . \( -name "*.py" -o -name "*.js" \) -size +10k
```

Combining grep and find

```bash
# Find files and grep in them
find . -name "*.py" -exec grep -l "import os" {} \;

# More efficient with xargs
find . -name "*.py" | xargs grep -l "import os"

# Handle spaces in filenames
find . -name "*.py" -print0 | xargs -0 grep -l "import os"

# Find recently modified files with pattern
find . -name "*.log" -mtime -1 -exec grep "ERROR" {} +
```

Practical Examples

Find Large Files

```bash
# Top 10 largest files
find . -type f -exec du -h {} + | sort -rh | head -10

# Files over 100MB, sorted
find . -type f -size +100M -exec ls -lh {} \; | sort -k5 -h
```

Find and Clean

```bash
# Remove old log files
find /var/log -name "*.log" -mtime +30 -delete

# Remove empty directories
find . -type d -empty -delete

# Remove Python cache
find . -type d -name "__pycache__" -exec rm -rf {} +
find . -name "*.pyc" -delete
```

Search Code

```bash
# Find function definitions
grep -rn "def " --include="*.py" src/

# Find TODO comments
grep -rn "TODO\|FIXME\|XXX" --include="*.py" .

# Find imports
grep -r "^import\|^from" --include="*.py" src/ | sort -u
```

Search Logs

```bash
# Errors in last hour
find /var/log -name "*.log" -mmin -60 -exec grep -l "ERROR" {} \;

# Count errors per file
find /var/log -name "*.log" -exec sh -c 'echo "$1: $(grep -c ERROR "$1")"' _ {} \;

# Unique IPs from access log
grep -oE "\b[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\b" access.log | sort -u
```

Find Duplicates

```bash
# Find files with same name
find . -type f -name "*.py" | xargs -I{} basename {} | sort | uniq -d

# Find by content hash (requires md5sum)
find . -type f -exec md5sum {} \; | sort | uniq -w32 -d
```

Performance Tips

```bash
# Stop at first match (faster)
grep -m 1 "pattern" largefile.txt

# Use fixed strings when possible (faster than regex)
grep -F "exact string" file.txt

# Limit find depth
find . -maxdepth 2 -name "*.py"

# Prune directories
find . -path ./node_modules -prune -o -name "*.js" -print
```

ripgrep (Modern Alternative)

If available, rg is faster than grep:

...

February 24, 2026 · 6 min · 1235 words · Rob Washington

jq: Command-Line JSON Processing

If you work with APIs, logs, or config files, you work with JSON. And jq is how you make that not painful. It's like sed and awk had a baby that speaks JSON fluently.

The Basics

```bash
# Pretty-print JSON
curl -s https://api.example.com/data | jq .

# Get a field
echo '{"name": "Alice", "age": 30}' | jq '.name'
# "Alice"

# Get raw string (no quotes)
echo '{"name": "Alice"}' | jq -r '.name'
# Alice
```

Navigating Objects

```bash
# Nested fields
echo '{"user": {"name": "Alice", "email": "alice@example.com"}}' | jq '.user.name'
# "Alice"

# Multiple fields
echo '{"name": "Alice", "age": 30, "city": "NYC"}' | jq '{name, city}'
# {"name": "Alice", "city": "NYC"}

# Rename fields
echo '{"firstName": "Alice"}' | jq '{name: .firstName}'
# {"name": "Alice"}
```

Working with Arrays

```bash
# Get first element
echo '[1, 2, 3]' | jq '.[0]'
# 1

# Get last element
echo '[1, 2, 3]' | jq '.[-1]'
# 3

# Slice
echo '[1, 2, 3, 4, 5]' | jq '.[1:3]'
# [2, 3]

# Get all elements
echo '[{"name": "Alice"}, {"name": "Bob"}]' | jq '.[].name'
# "Alice"
# "Bob"

# Wrap results in array
echo '[{"name": "Alice"}, {"name": "Bob"}]' | jq '[.[].name]'
# ["Alice", "Bob"]
```

Filtering

```bash
# Select where condition is true
echo '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]' | \
  jq '.[] | select(.age > 26)'
# {"name": "Alice", "age": 30}

# Multiple conditions
echo '[{"name": "Alice", "active": true}, {"name": "Bob", "active": false}]' | \
  jq '.[] | select(.active == true and .name != "Admin")'

# Contains
echo '[{"tags": ["dev", "prod"]}, {"tags": ["staging"]}]' | \
  jq '.[] | select(.tags | contains(["prod"]))'

# String matching
echo '[{"name": "Alice"}, {"name": "Bob"}, {"name": "Alicia"}]' | \
  jq '.[] | select(.name | startswith("Ali"))'
```

Transforming Data

```bash
# Map over array
echo '[1, 2, 3]' | jq 'map(. * 2)'
# [2, 4, 6]

# Map with objects
echo '[{"name": "Alice", "score": 85}, {"name": "Bob", "score": 92}]' | \
  jq 'map({name, passed: .score >= 90})'
# [{"name": "Alice", "passed": false}, {"name": "Bob", "passed": true}]

# Add fields
echo '{"name": "Alice"}' | jq '. + {role: "admin"}'
# {"name": "Alice", "role": "admin"}

# Update fields
echo '{"name": "alice"}' | jq '.name |= ascii_upcase'
# {"name": "ALICE"}

# Delete fields
echo '{"name": "Alice", "password": "secret"}' | jq 'del(.password)'
# {"name": "Alice"}
```

Aggregation

```bash
# Length
echo '[1, 2, 3, 4, 5]' | jq 'length'
# 5

# Sum
echo '[1, 2, 3, 4, 5]' | jq 'add'
# 15

# Min/Max
echo '[3, 1, 4, 1, 5]' | jq 'min, max'
# 1
# 5

# Unique
echo '[1, 2, 2, 3, 3, 3]' | jq 'unique'
# [1, 2, 3]

# Group by
echo '[{"type": "a", "val": 1}, {"type": "b", "val": 2}, {"type": "a", "val": 3}]' | \
  jq 'group_by(.type) | map({type: .[0].type, total: map(.val) | add})'
# [{"type": "a", "total": 4}, {"type": "b", "total": 2}]

# Sort
echo '[{"name": "Bob"}, {"name": "Alice"}]' | jq 'sort_by(.name)'
# [{"name": "Alice"}, {"name": "Bob"}]
```

String Operations

```bash
# Split
echo '{"path": "/usr/local/bin"}' | jq '.path | split("/")'
# ["", "usr", "local", "bin"]

# Join
echo '["a", "b", "c"]' | jq 'join("-")'
# "a-b-c"

# Replace
echo '{"msg": "Hello World"}' | jq '.msg | gsub("World"; "jq")'
# "Hello jq"

# String interpolation
echo '{"name": "Alice", "age": 30}' | jq '"Name: \(.name), Age: \(.age)"'
# "Name: Alice, Age: 30"
```

Working with Keys

```bash
# Get keys
echo '{"a": 1, "b": 2, "c": 3}' | jq 'keys'
# ["a", "b", "c"]

# Get an object's values (jq's `values` builtin means "non-null", so use [.[]])
echo '{"a": 1, "b": 2, "c": 3}' | jq '[.[]]'
# [1, 2, 3]

# Convert object to array of key-value pairs
echo '{"a": 1, "b": 2}' | jq 'to_entries'
# [{"key": "a", "value": 1}, {"key": "b", "value": 2}]

# Convert back
echo '[{"key": "a", "value": 1}]' | jq 'from_entries'
# {"a": 1}

# Transform keys
echo '{"firstName": "Alice", "lastName": "Smith"}' | \
  jq 'with_entries(.key |= ascii_downcase)'
# {"firstname": "Alice", "lastname": "Smith"}
```

Conditionals

```bash
# If-then-else
echo '{"status": 200}' | jq 'if .status == 200 then "OK" else "Error" end'
# "OK"

# Alternative operator (default value)
echo '{"name": "Alice"}' | jq '.age // 0'
# 0
echo '{"name": "Alice", "age": 30}' | jq '.age // 0'
# 30

# Try (suppress errors; invalid input yields no output)
echo '{"data": "not json"}' | jq '.data | try fromjson'
```

Real-World Examples

Parse API Response

```bash
# Extract users from paginated API
curl -s 'https://api.example.com/users' | \
  jq '.data[] | {id, name: .attributes.name, email: .attributes.email}'

# Get specific fields as TSV
curl -s 'https://api.example.com/users' | \
  jq -r '.data[] | [.id, .attributes.name] | @tsv'
```

Process Log Files

```bash
# Parse JSON logs, filter errors
cat app.log | jq -c 'select(.level == "error")'

# Count by status code
cat access.log | jq -s 'group_by(.status) | map({status: .[0].status, count: length})'

# Get unique IPs
cat access.log | jq -s '[.[].ip] | unique'
```

Transform Config Files

```bash
# Merge configs
jq -s '.[0] * .[1]' base.json override.json

# Update nested value
jq '.database.host = "newhost.example.com"' config.json

# Add to array
jq '.allowed_ips += ["10.0.0.5"]' config.json
```

AWS CLI Output

```bash
# List EC2 instance IDs and states
aws ec2 describe-instances | \
  jq -r '.Reservations[].Instances[] | [.InstanceId, .State.Name] | @tsv'

# Get running instances
aws ec2 describe-instances | \
  jq '.Reservations[].Instances[] | select(.State.Name == "running") | .InstanceId'

# Format as table
aws ec2 describe-instances | \
  jq -r '["ID", "Type", "State"], (.Reservations[].Instances[] | [.InstanceId, .InstanceType, .State.Name]) | @tsv' | \
  column -t
```

Kubernetes

```bash
# Get pod names and statuses
kubectl get pods -o json | \
  jq -r '.items[] | [.metadata.name, .status.phase] | @tsv'

# Find pods not running
kubectl get pods -o json | \
  jq '.items[] | select(.status.phase != "Running") | .metadata.name'

# Get container images
kubectl get pods -o json | \
  jq -r '[.items[].spec.containers[].image] | unique | .[]'
```

Output Formats

```bash
# Compact (one line)
echo '{"a": 1}' | jq -c .
# {"a":1}

# Tab-separated
echo '[{"a": 1, "b": 2}]' | jq -r '.[] | [.a, .b] | @tsv'
# 1	2

# CSV
echo '[{"a": 1, "b": 2}]' | jq -r '.[] | [.a, .b] | @csv'
# 1,2

# URI encode
echo '{"q": "hello world"}' | jq -r '.q | @uri'
# hello%20world

# Base64
echo '{"data": "hello"}' | jq -r '.data | @base64'
# aGVsbG8=
```

Quick Reference

| Pattern | Description |
|---|---|
| `.field` | Get field |
| `.[]` | Iterate array |
| `.[0]` | First element |
| `.[-1]` | Last element |
| `select(cond)` | Filter |
| `map(expr)` | Transform array |
| `. + {}` | Add fields |
| `del(.field)` | Remove field |
| `// default` | Default value |
| `@tsv`, `@csv` | Output format |
| `-r` | Raw output |
| `-c` | Compact output |
| `-s` | Slurp (read all input as array) |

jq has a learning curve, but it pays off quickly. Once you internalize the patterns, you'll wonder how you ever worked with JSON without it. Start with .field, .[].field, and select() — those three cover 80% of use cases.
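Those three starter patterns compose into one pipeline; here is a self-contained check using invented sample data (the `team`/`members`/`on_call` fields are illustrative, not from any real API):

```shell
# Invented sample data demonstrating .field, .[].field, and select().
json='{"team": "infra", "members": [
  {"name": "Alice", "on_call": true},
  {"name": "Bob",   "on_call": false}
]}'

echo "$json" | jq -r '.team'                                  # infra
echo "$json" | jq -r '.members[].name'                        # Alice, then Bob
echo "$json" | jq -r '.members[] | select(.on_call) | .name'  # Alice
```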

February 24, 2026 · 7 min · 1338 words · Rob Washington