You know grep "error" logfile.txt. But grep can do much more: recursive searches, context lines, inverse matching, and regex patterns that turn hours of manual searching into seconds.

The Basics

# Search for pattern in file
grep "error" app.log

# Case-insensitive
grep -i "error" app.log

# Show line numbers
grep -n "error" app.log

# Count matches
grep -c "error" app.log

# Only show filenames with matches
grep -l "error" *.log
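As a quick sanity check, the flags above can be exercised against a throwaway file (paths and contents here are illustrative):

```shell
# Build a small sample log
printf 'ERROR disk full\nINFO startup ok\nerror: retry\nWARN low memory\n' > /tmp/demo.log

grep -c "error" /tmp/demo.log    # matches only lowercase "error": 1
grep -ci "error" /tmp/demo.log   # case-insensitive, ERROR + error: 2
grep -n "WARN" /tmp/demo.log     # prints "4:WARN low memory"
```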

Recursive Search

# Search all files in directory tree
grep -r "TODO" ./src

# With line numbers
grep -rn "TODO" ./src

# Include only certain files
grep -r --include="*.py" "import os" .

# Exclude directories
grep -r --exclude-dir=node_modules "console.log" .

# Multiple excludes
grep -r --exclude-dir={node_modules,.git,dist} "function" .
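A minimal sketch of --include in action, using a throwaway directory (names are illustrative):

```shell
# Set up a tiny tree with mixed file types
mkdir -p /tmp/demo_src
echo "import os" > /tmp/demo_src/main.py
echo "import os  # not python" > /tmp/demo_src/notes.txt

# Only the .py file is searched; -l prints just the matching filename
grep -rl --include="*.py" "import os" /tmp/demo_src
# prints /tmp/demo_src/main.py
```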

Context Lines

When you find a match, you often need surrounding context:

# 3 lines after match
grep -A 3 "ERROR" app.log

# 3 lines before match
grep -B 3 "ERROR" app.log

# 3 lines before and after (context)
grep -C 3 "ERROR" app.log

# Show filename with context
grep -Hn -C 2 "Exception" *.log
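When two matches are far apart, grep separates each group of context lines with a -- line. A small run (the log contents are illustrative):

```shell
printf 'line1\nline2\nERROR here\nline4\nline5\nline6\nERROR again\nline8\n' > /tmp/ctx.log

# One line of context around each match; "--" separates match groups
grep -C 1 "ERROR" /tmp/ctx.log
```

Since line4 and line6 are not adjacent, the output shows both three-line groups with a `--` between them.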

Inverse Matching

# Lines NOT containing pattern
grep -v "DEBUG" app.log

# Multiple exclusions (pipe them)
grep -v "DEBUG" app.log | grep -v "INFO"

# Exclude blank lines
grep -v "^$" file.txt

# Exclude comments
grep -v "^#" config.ini
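The last two filters (blank lines and comments) are common enough that they are often folded into a single extended-regex pass; [[:space:]] also tolerates indented comments:

```shell
# Drop comment lines (even indented ones) and blank lines in one pass
grep -Ev "^[[:space:]]*(#|$)" config.ini
```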

Regular Expressions

Basic Regex (default)

# Word boundary
grep "\berror\b" app.log

# Start of line
grep "^ERROR" app.log

# End of line
grep "failed$" app.log

# Any character
grep "err.r" app.log  # error, err0r, etc.

# Character class
grep "[0-9]" data.txt

Extended Regex (-E)

# One or more
grep -E "er+" file.txt

# Zero or more
grep -E "er*" file.txt

# Optional
grep -E "colou?r" file.txt  # color or colour

# Alternation
grep -E "error|warning|critical" app.log

# Groups
grep -E "(error|warn)ing" app.log

# Quantifiers
grep -E "[0-9]{3}-[0-9]{4}" phones.txt  # 555-1234
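Combined with -o, the quantifier pattern pulls just the phone-shaped tokens out of mixed text (the input string is illustrative):

```shell
# {3} and {4} require exact digit counts, so "12-34" is skipped
echo "call 555-1234 or 555-9876, not 12-34" | grep -oE "[0-9]{3}-[0-9]{4}"
# prints:
# 555-1234
# 555-9876
```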

Perl Regex (-P)

For advanced patterns, use Perl-compatible regex:

# Lookahead
grep -P "error(?=.*critical)" app.log

# Non-greedy
grep -oP '".*?"' data.json

# \K keeps only what follows (value extraction)
grep -oP 'user=\K[^\s]+' access.log

The \K resets the match start, which makes it perfect for extracting values.
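A concrete run of the \K pattern (this needs a grep built with PCRE support, such as GNU grep; the log line is illustrative):

```shell
# Everything up to and including \K is matched but discarded,
# so -o prints only the value after "user="
echo 'time=12:00 user=alice status=ok' | grep -oP 'user=\K[^\s]+'
# prints: alice
```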

Extracting Matches

# Only print matching part
grep -o "error" app.log

# Extract IP addresses
grep -oE "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" access.log

# Extract email addresses
grep -oE "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}" contacts.txt

# Extract URLs
grep -oP 'https?://[^\s]+' page.html

Fixed Strings (Faster)

When you don’t need regex:

# Literal string (no regex interpretation)
grep -F "user.name" config.json

# Multiple literal patterns
grep -F -e "error" -e "warning" -e "critical" app.log

# From file
grep -Ff patterns.txt data.txt

-F is faster because it skips regex parsing.
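With -Ff, each line of the patterns file is treated as a literal substring, and a line matching any of them is printed. A minimal sketch with throwaway paths:

```shell
# Two literal patterns, one per line
printf 'error\nwarning\n' > /tmp/patterns.txt

# Lines containing either substring are kept
printf 'error one\nok fine\nwarning two\n' | grep -Ff /tmp/patterns.txt
# prints:
# error one
# warning two
```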

Practical Patterns

Log Analysis

# Errors in last hour (if timestamps are sortable)
grep "$(date +%Y-%m-%d\ %H)" app.log | grep -i error

# Unique error types
grep -oE "ERROR: [^:]+:" app.log | sort -u

# Request paths with 500 errors
grep " 500 " access.log | grep -oE '(GET|POST) [^ ]+'

Code Search

# Find function definitions
grep -rn "def \|function " --include="*.py" --include="*.js" .

# Find TODO/FIXME
grep -rn "TODO\|FIXME\|XXX" --include="*.py" .

# Find imports of specific module
grep -rn "^import os\|^from os" --include="*.py" .

# Find hardcoded passwords (security audit)
grep -rn "password\s*=\s*['\"]" --include="*.py" --include="*.js" .

System Administration

# Failed SSH logins
grep "Failed password" /var/log/auth.log

# Find running python processes (grep -v grep drops the grep command itself)
ps aux | grep -v grep | grep python

# Services listening on ports
netstat -tlnp | grep LISTEN

# Filesystems with double-digit gigabyte figures in df output
df -h | grep -E "[0-9]{2,}G"

JSON/Config Parsing

# Extract values (simple cases)
grep -oP '"name":\s*"\K[^"]+' data.json

# Find config keys
grep -E "^\s*[a-zA-Z_]+" config.yaml

# Non-comment lines
grep -v "^\s*#" config.ini | grep -v "^$"

Combining with Other Tools

# Count occurrences by type
grep -oE "ERROR|WARN|INFO" app.log | sort | uniq -c

# Find and act on matches
grep -l "deprecated" *.py | xargs -I {} echo "Check: {}"

# Filter then process
grep "user_id" data.json | jq '.user_id'

# Monitor in real-time
tail -f app.log | grep --line-buffered "ERROR"

The --line-buffered flag is crucial for real-time streaming.
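The count-by-type pipeline above is worth seeing end to end; here is a run against a few illustrative lines:

```shell
# Extract the level from each line, group identical levels, count each group
printf 'ERROR a\nINFO b\nERROR c\n' | grep -oE "ERROR|WARN|INFO" | sort | uniq -c
# uniq -c prefixes each unique line with its count: 2 ERROR, 1 INFO
```

The sort is required because uniq only collapses adjacent duplicates.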

Performance Tips

# Use fixed strings when possible
grep -F "literal text" huge.log  # Faster than regex

# Limit context retrieval
grep -m 10 "pattern" huge.log  # Stop after 10 matches

# Use specific file types
grep -r --include="*.log" "error" /var/log/  # Skip binaries

# Parallel grep for many files
find . -name "*.log" -print0 | xargs -0 -P 4 grep "pattern"

Common Mistakes

Forgetting quotes:

# Bad: shell interprets *
grep error* log.txt

# Good
grep "error*" log.txt
grep 'error*' log.txt

Regex vs literal:

# This matches "1.0" but also "1x0", "100", etc.
grep "1.0" file.txt

# For literal dot, escape it
grep "1\.0" file.txt

# Or use fixed string
grep -F "1.0" file.txt

Grepping binary files:

# Binary files can hang or produce garbage
grep "pattern" *

# Text files only
grep -I "pattern" *

# Or specify explicitly
grep --binary-files=without-match "pattern" *

Quick Reference

Flag    Purpose
-i      Case-insensitive
-n      Show line numbers
-c      Count matches
-l      List matching files
-L      List non-matching files
-r      Recursive
-v      Invert match
-o      Only matching part
-E      Extended regex
-P      Perl regex
-F      Fixed string (literal)
-A N    N lines after
-B N    N lines before
-C N    N lines context
-m N    Max N matches

grep is the foundation. Master it, and tools like awk, sed, and find become even more powerful in combination.


Computing Arts is CLI fluency for working developers. More at computingarts.com.