You know grep "error" logfile.txt. But grep can do so much more: recursive searches, context lines, inverse matching, and regex patterns that turn hours of manual searching into seconds.
The Basics#
```shell
# Search for pattern in file
grep "error" app.log

# Case-insensitive
grep -i "error" app.log

# Show line numbers
grep -n "error" app.log

# Count matching lines
grep -c "error" app.log

# Only show filenames with matches
grep -l "error" *.log
```
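One subtlety with -c is that it counts matching lines, not individual occurrences. A minimal sketch, using a throwaway file from mktemp whose contents are invented for illustration:

```shell
# Build a throwaway log; the contents are made up for this demo.
log=$(mktemp)
printf '%s\n' \
  'error: disk full, error: retrying' \
  'INFO: all good' \
  'Error: timeout' > "$log"

# -c counts matching LINES: only the first line contains lowercase "error".
lines_matched=$(grep -c "error" "$log")

# -o prints each match on its own line, so wc -l counts OCCURRENCES.
occurrences=$(grep -o "error" "$log" | wc -l)

# Adding -i also picks up "Error" on the third line.
lines_matched_ci=$(grep -ci "error" "$log")

echo "lines=$lines_matched occurrences=$occurrences case_insensitive=$lines_matched_ci"
rm -f "$log"
```

If you need a true occurrence count, the -o plus wc -l combination is usually the simplest route.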
Recursive Search#
```shell
# Search all files in directory tree
grep -r "TODO" ./src

# With line numbers
grep -rn "TODO" ./src

# Include only certain files
grep -r --include="*.py" "import os" .

# Exclude directories
grep -r --exclude-dir=node_modules "console.log" .

# Multiple excludes
grep -r --exclude-dir={node_modules,.git,dist} "function" .
```
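To see how --include and --exclude-dir narrow a recursive search, here is a small sketch that builds a temporary directory tree. The file and directory names are hypothetical, and the flags are assumed to behave as in GNU grep:

```shell
# Build a small throwaway tree; all names here are hypothetical.
root=$(mktemp -d)
mkdir -p "$root/src" "$root/node_modules/lib"
echo '# TODO: refactor' > "$root/src/app.py"
echo '// TODO: remove'  > "$root/src/app.js"
echo '// TODO: vendor'  > "$root/node_modules/lib/index.js"

# -l lists matching files; sort makes the order deterministic.
all=$(cd "$root" && grep -rl "TODO" . | sort)

# --include keeps only files whose name matches the glob.
py_only=$(cd "$root" && grep -rl --include="*.py" "TODO" .)

# --exclude-dir prunes whole directories by name.
no_vendor=$(cd "$root" && grep -rl --exclude-dir=node_modules "TODO" . | sort)

printf 'all:\n%s\n\npy only:\n%s\n\nwithout node_modules:\n%s\n' \
  "$all" "$py_only" "$no_vendor"
rm -rf "$root"
```

Quoting the glob in --include="*.py" matters: it stops the shell from expanding it before grep sees it.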
Context Lines#
When you find a match, you often need surrounding context:
```shell
# 3 lines after match
grep -A 3 "ERROR" app.log

# 3 lines before match
grep -B 3 "ERROR" app.log

# 3 lines before and after (context)
grep -C 3 "ERROR" app.log

# Show filename with context
grep -Hn -C 2 "Exception" *.log
```
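Context output has two conventions that are easy to miss: separate groups of lines are divided by a "--" line, and with -n the context lines use "-" after the line number where matching lines use ":". A short sketch with an invented log (GNU grep assumed):

```shell
# Throwaway log; the contents are invented for this demo.
log=$(mktemp)
printf '%s\n' 'start' 'ERROR one' 'detail a' 'ok 1' 'ok 2' 'ok 3' \
  'ERROR two' 'detail b' > "$log"

# One line of trailing context per match. The two groups are not
# adjacent in the file, so grep prints "--" between them.
groups=$(grep -A 1 "ERROR" "$log")

# With -n, matching lines get "N:" and context lines get "N-".
numbered=$(grep -n -A 1 "ERROR two" "$log")

printf '%s\n\n%s\n' "$groups" "$numbered"
rm -f "$log"
```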
Inverse Matching#
```shell
# Lines NOT containing pattern
grep -v "DEBUG" app.log

# Multiple exclusions (pipe them)
grep -v "DEBUG" app.log | grep -v "INFO"

# Exclude blank lines
grep -v "^$" file.txt

# Exclude comments
grep -v "^#" config.ini
```
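Piping several grep -v commands works, but the same filter can be written as one invocation with repeated -e options, since -v then keeps only lines that match none of the patterns. A sketch with made-up log lines:

```shell
# Throwaway log with a blank line and a comment; contents are invented.
log=$(mktemp)
printf '%s\n' 'DEBUG cache miss' 'INFO started' 'ERROR boom' '' '# comment' > "$log"

# Two processes chained with a pipe.
piped=$(grep -v "DEBUG" "$log" | grep -v "INFO")

# One process: -v drops a line if ANY of the -e patterns matches it.
single=$(grep -v -e "DEBUG" -e "INFO" "$log")

# The same idea extended to blank lines and comments.
cleaned=$(grep -v -e "DEBUG" -e "INFO" -e '^$' -e '^#' "$log")

echo "$cleaned"
rm -f "$log"
```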
Regular Expressions#
Basic Regex (default)#
```shell
# Word boundary
grep "\berror\b" app.log

# Start of line
grep "^ERROR" app.log

# End of line
grep "failed$" app.log

# Any character
grep "err.r" app.log  # error, err0r, etc.

# Character class
grep "[0-9]" data.txt
```
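The word-boundary form is worth a quick check, because a bare pattern also matches inside longer words. A sketch with invented sample lines; note that \b is a GNU grep extension rather than part of POSIX basic regex:

```shell
# Sample lines invented for this demo.
log=$(mktemp)
printf '%s\n' 'error here' 'errors everywhere' 'no terror' 'an error.' > "$log"

# Bare pattern: matches anywhere, including inside "errors" and "terror".
plain=$(grep -c "error" "$log")

# \b requires a word boundary on both sides of "error".
bounded=$(grep -c "\berror\b" "$log")

echo "plain=$plain bounded=$bounded"
rm -f "$log"
```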
Extended Regex (-E)#
```shell
# One or more
grep -E "er+" file.txt

# Zero or more
grep -E "er*" file.txt

# Optional
grep -E "colou?r" file.txt  # color or colour

# Alternation
grep -E "error|warning|critical" app.log

# Groups
grep -E "(error|warn)ing" app.log

# Quantifiers
grep -E "[0-9]{3}-[0-9]{4}" phones.txt  # 555-1234
```
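The practical difference between basic and extended regex is which characters are special without a backslash. A sketch of that difference using the colou?r pattern; the three input lines are invented, and the \? form is a GNU extension:

```shell
# Three sample lines, one of which contains a literal question mark.
f=$(mktemp)
printf '%s\n' 'color' 'colour' 'colou?r' > "$f"

# Basic regex: an unescaped ? is a literal question mark.
bre=$(grep "colou?r" "$f")

# Extended regex: ? means "zero or one of the previous item".
ere=$(grep -E "colou?r" "$f")

# GNU grep also accepts \? in basic regex as the optional operator.
bre_escaped=$(grep "colou\?r" "$f")

printf 'BRE:\n%s\n\nERE:\n%s\n' "$bre" "$ere"
rm -f "$f"
```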
Perl Regex (-P)#
For advanced patterns, use Perl-compatible regex:
```shell
# Lookahead
grep -P "error(?=.*critical)" app.log

# Non-greedy
grep -oP '".*?"' data.json

# Extract the value after a prefix using \K
grep -oP 'user=\K[^\s]+' access.log
```
The \K resets the match start, which makes it perfect for extracting values.
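A side-by-side sketch of the same pattern with and without \K, using invented log lines. It assumes a grep built with PCRE support, which -P requires:

```shell
# Invented access-log style lines.
log=$(mktemp)
printf '%s\n' '10:01 user=alice action=login' '10:02 user=bob action=logout' > "$log"

# Without \K, -o prints the whole match, prefix included.
with_prefix=$(grep -oP 'user=[^\s]+' "$log")

# With \K, everything matched before it is dropped from the output.
values=$(grep -oP 'user=\K[^\s]+' "$log")

printf '%s\n\n%s\n' "$with_prefix" "$values"
rm -f "$log"
```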
```shell
# Only print matching part
grep -o "error" app.log

# Extract IP addresses
grep -oE "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" access.log

# Extract email addresses
grep -oE "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}" contacts.txt

# Extract URLs
grep -oP 'https?://[^\s]+' page.html
```
Fixed Strings (Faster)#
When you don’t need regex:
```shell
# Literal string (no regex interpretation)
grep -F "user.name" config.json

# Multiple literal patterns
grep -F -e "error" -e "warning" -e "critical" app.log

# From file
grep -Ff patterns.txt data.txt
```
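-F is often faster because grep treats the pattern as a literal string and skips regex compilation. It also saves you from escaping metacharacters such as . and *.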
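Beyond speed, -F changes what matches. A sketch with an invented config file, showing the dot treated as a wildcard in regex mode and as a plain dot with -F:

```shell
# Invented config lines for this demo.
cfg=$(mktemp)
printf '%s\n' '"user.name": "alice"' '"user_name": "bob"' '"username": "carol"' > "$cfg"

# As a regex, the dot matches any single character, so user_name matches too.
regex_hits=$(grep -c "user.name" "$cfg")

# With -F the dot is just a dot.
fixed_hits=$(grep -cF "user.name" "$cfg")

# -f reads one pattern per line; combined with -F each one is literal.
patterns=$(mktemp)
printf '%s\n' 'user.name' 'username' > "$patterns"
file_hits=$(grep -cFf "$patterns" "$cfg")

echo "regex=$regex_hits fixed=$fixed_hits from_file=$file_hits"
rm -f "$cfg" "$patterns"
```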
Practical Patterns#
Log Analysis#
```shell
# Errors in the current hour (assumes "YYYY-MM-DD HH:MM:SS" timestamps)
grep "$(date +%Y-%m-%d\ %H)" app.log | grep -i error

# Unique error types
grep -oE "ERROR: [^:]+:" app.log | sort -u

# Request paths with 500 errors (grouped so GET and POST both capture the path)
grep " 500 " access.log | grep -oE '(GET|POST) [^ ]+'
```
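When an alternation is followed by more pattern, grouping matters: in extended regex, | has the lowest precedence, so it splits the entire expression unless you add parentheses. A sketch with two invented access-log lines:

```shell
# Invented access-log lines.
log=$(mktemp)
printf '%s\n' \
  '1.2.3.4 "GET /api/users HTTP/1.1" 500 12' \
  '1.2.3.5 "POST /api/orders HTTP/1.1" 500 0' > "$log"

# Ungrouped: parsed as  GET  OR  "POST [^ ]+", so GET loses its path.
ungrouped=$(grep -oE 'GET|POST [^ ]+' "$log")

# Grouped: either method, followed by a space and the path.
grouped=$(grep -oE '(GET|POST) [^ ]+' "$log")

printf 'ungrouped:\n%s\n\ngrouped:\n%s\n' "$ungrouped" "$grouped"
rm -f "$log"
```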
Code Search#
```shell
# Find function definitions
grep -rn "def \|function " --include="*.py" --include="*.js" .

# Find TODO/FIXME
grep -rn "TODO\|FIXME\|XXX" --include="*.py" .

# Find imports of specific module
grep -rn "^import os\|^from os" --include="*.py" .

# Find hardcoded passwords (security audit)
grep -rn "password\s*=\s*['\"]" --include="*.py" --include="*.js" .
```
System Administration#
```shell
# Failed SSH logins
grep "Failed password" /var/log/auth.log

# Running Python processes (excluding the grep process itself)
ps aux | grep -v grep | grep python

# Services listening on ports
netstat -tlnp | grep LISTEN

# Filesystems with a figure of 10G or more in df output
df -h | grep -E "[0-9]{2,}G"
```
JSON/Config Parsing#
```shell
# Extract values (simple cases)
grep -oP '"name":\s*"\K[^"]+' data.json

# Find config keys
grep -E "^\s*[a-zA-Z_]+" config.yaml

# Non-comment, non-blank lines
grep -v "^\s*#" config.ini | grep -v "^$"
```
```shell
# Count occurrences by type
grep -oE "ERROR|WARN|INFO" app.log | sort | uniq -c

# Find and act on matches
grep -l "deprecated" *.py | xargs -I {} echo "Check: {}"

# Filter then process (assumes one JSON object per line)
grep "user_id" data.json | jq '.user_id'

# Monitor in real-time
tail -f app.log | grep --line-buffered "ERROR"
```
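The --line-buffered flag is crucial for real-time streaming, especially when grep's output feeds another pipe: without it, grep buffers output in blocks and matches arrive in delayed bursts.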
```shell
# Use fixed strings when possible
grep -F "literal text" huge.log  # Often faster than regex

# Stop early once you have enough matches
grep -m 10 "pattern" huge.log  # Stop after 10 matching lines

# Restrict the search to specific file types
grep -r --include="*.log" "error" /var/log/  # Only *.log files are read

# Parallel grep for many files
find . -name "*.log" -print0 | xargs -0 -P 4 grep "pattern"
```
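As a quick check of -m, here is a sketch that uses seq as a stand-in for a huge file. The range and the pattern are arbitrary choices for the demo:

```shell
# seq stands in for a large input; 1..1000 is an arbitrary range.
# grep stops reading after the third matching line.
first_three=$(seq 1 1000 | grep -m 3 "5")

echo "$first_three"
```

Because grep exits as soon as it has enough matches, the rest of the input is never scanned, which is where the time saving on large files comes from.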
Common Mistakes#
Forgetting quotes:
```shell
# Bad: shell interprets *
grep error* log.txt

# Good
grep "error*" log.txt
grep 'error*' log.txt
```
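To see what the shell actually hands to grep, prefix the command with echo. A sketch in a temporary directory with hypothetical files named error1 and error2:

```shell
# Temporary directory with hypothetical files that the glob will match.
dir=$(mktemp -d)
touch "$dir/error1" "$dir/error2"
echo 'an error occurred' > "$dir/log.txt"

# echo reveals the arguments after glob expansion.
expanded=$(cd "$dir" && echo grep error* log.txt)

# Unquoted: the pattern becomes "error1", searched in error2 and log.txt.
unquoted=$(cd "$dir" && grep error* log.txt)

# Quoted: grep receives the regex error* and matches as intended.
quoted=$(cd "$dir" && grep "error*" log.txt)

printf 'expanded: %s\nunquoted: [%s]\nquoted:   [%s]\n' "$expanded" "$unquoted" "$quoted"
rm -rf "$dir"
```

The unquoted form only misbehaves when files matching the glob exist in the current directory, which is why this bug tends to show up intermittently.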
Regex vs literal:
```shell
# This matches "1.0" but also "1x0", "100", etc.
grep "1.0" file.txt

# For literal dot, escape it
grep "1\.0" file.txt

# Or use fixed string
grep -F "1.0" file.txt
```
Grepping binary files:
```shell
# Binary files can produce garbage or "Binary file matches" notices
grep "pattern" *

# Text files only
grep -I "pattern" *

# Or specify explicitly
grep --binary-files=without-match "pattern" *
```
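A small sketch of the effect of -I, using one text file and one file containing NUL bytes so that grep classifies it as binary. The file names are hypothetical and GNU grep behavior is assumed:

```shell
# One text file and one file with NUL bytes; names are hypothetical.
dir=$(mktemp -d)
printf 'pattern in text\n' > "$dir/notes.txt"
printf 'pattern\0\001\002 in binary\n' > "$dir/blob.bin"

# Without -I, the binary file counts as a match and is listed.
everything=$(cd "$dir" && grep -l "pattern" * | sort)

# With -I, binary files are treated as if they contained no match.
text_only=$(cd "$dir" && grep -lI "pattern" *)

printf 'all:\n%s\n\ntext only:\n%s\n' "$everything" "$text_only"
rm -rf "$dir"
```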
Quick Reference#
| Flag | Purpose |
|---|---|
| -i | Case-insensitive |
| -n | Show line numbers |
| -c | Count matching lines |
| -l | List matching files |
| -L | List non-matching files |
| -r | Recursive |
| -v | Invert match |
| -o | Only matching part |
| -E | Extended regex |
| -P | Perl regex |
| -F | Fixed string (literal) |
| -A N | N lines after |
| -B N | N lines before |
| -C N | N lines context |
| -m N | Max N matches |
grep is the foundation. Master it, and tools like awk, sed, and find become even more powerful in combination.
Computing Arts is CLI fluency for working developers. More at computingarts.com.