You have a list of files. You need to process each one. The naive approach:

for file in $(cat files.txt); do
    process "$file"
done

This works until it doesn’t — filenames with spaces break it, and it’s sequential. Enter xargs.

The Basics

xargs reads input and converts it into arguments for a command:

# Delete files listed in a file
cat files.txt | xargs rm

# Same thing, more efficient
xargs rm < files.txt

Without xargs, you’d need a loop. With xargs, one line.
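You can watch the conversion happen with seq as a stand-in input source:

```shell
# xargs collects the five input lines and hands them to a single
# echo invocation as five arguments
seq 1 5 | xargs echo
# 1 2 3 4 5
```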

Handling Special Characters

Filenames have spaces? Newlines? Use null delimiters:

# Find + xargs with null separator
find . -name "*.log" -print0 | xargs -0 rm

# Read null-terminated input
cat files.txt | tr '\n' '\0' | xargs -0 process

The -print0/-0 pair handles any filename, including ones with spaces or newlines. Use it with find whenever the results feed another command.
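A quick check in a throwaway directory (created with mktemp -d) shows why:

```shell
# Scratch-directory demo: a filename with a space passes through
# the null-delimited pipeline intact
dir=$(mktemp -d)
touch "$dir/old report.log"
find "$dir" -name "*.log" -print0 | xargs -0 rm
ls -A "$dir"    # prints nothing: the file is gone despite the space
```

The same input through a plain `find | xargs rm` would split "old report.log" into two bogus arguments.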

Controlling Argument Placement

By default, xargs appends arguments at the end. Use -I to place them anywhere:

# Rename files with a prefix (printf is safer than parsing ls output)
printf '%s\n' *.txt | xargs -I {} mv {} backup_{}

# Copy files to a directory
find . -name "*.conf" | xargs -I {} cp {} /backup/configs/

# Use with curl
cat urls.txt | xargs -I {} curl -O {}

The {} is a placeholder — each input line replaces it.
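-I also changes how input is read: one line per command, with the whole line treated as a single argument, so embedded spaces survive:

```shell
# With -I, each line becomes exactly one argument and one invocation
printf 'first item\nsecond item\n' | xargs -I {} echo "got: {}"
# got: first item
# got: second item
```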

Batch Size Control

Process arguments in batches with -n:

# Delete 10 files at a time
find . -name "*.tmp" -print0 | xargs -0 -n 10 rm

# Echo shows the batching
echo {1..20} | xargs -n 5 echo
# Output:
# 1 2 3 4 5
# 6 7 8 9 10
# 11 12 13 14 15
# 16 17 18 19 20

Useful when commands have argument limits or you want progress visibility.

Parallel Execution

This is where xargs shines. Use -P for parallel processes:

# Process 4 files simultaneously; build the output name from the input path
find . -name "*.jpg" -print0 | xargs -0 -P 4 -I {} sh -c 'convert "$1" -resize 50% "${1%.jpg}_resized.jpg"' _ {}

# Download URLs in parallel
cat urls.txt | xargs -P 8 -I {} curl -sO {}

# Compress files with all cores (-n 1 gives each gzip one file;
# otherwise xargs may pack everything into a single invocation)
find . -name "*.log" -print0 | xargs -0 -n 1 -P "$(nproc)" gzip

-P 0 means unlimited parallelism (use with caution).
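To convince yourself the processes really overlap, time four one-second sleeps (the `_` fills `$0` for sh -c; each input number is consumed but unused):

```shell
# Four one-second sleeps across four parallel slots finish in
# roughly one second of wall time, not four
start=$(date +%s)
seq 4 | xargs -n 1 -P 4 sh -c 'sleep 1' _
echo "elapsed: $(( $(date +%s) - start ))s"
```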

Confirmation Before Execution

Ask before each command with -p:

# Confirm each deletion
find . -name "*.bak" | xargs -p rm
# rm ./old.bak?...y

For dry runs, use echo:

# See what would run
find . -name "*.tmp" | xargs echo rm

Combining with Other Tools

With grep

# List the files that contain "TODO" (-l prints filenames, not the matches)
find . -name "*.py" | xargs grep -l "TODO"

# Count TODOs per file
find . -name "*.py" -print0 | xargs -0 grep -c "TODO" | grep -v ":0$"

With sed

# Replace text in multiple files
find . -name "*.txt" | xargs sed -i 's/old/new/g'

With ssh

# Run command on multiple hosts (ssh -n keeps ssh from eating xargs's stdin)
echo "host1 host2 host3" | tr ' ' '\n' | xargs -P 3 -I {} ssh -n {} "uptime"

With docker

# Stop all running containers
docker ps -q | xargs docker stop

# Remove dangling images (-r skips running docker rmi when the list is empty)
docker images -q --filter "dangling=true" | xargs -r docker rmi

Building Complex Pipelines

# Find large files, sort by size, take top 10, show details
# (the size sort holds as long as xargs fits everything in one ls call)
find /var/log -type f -size +10M -print0 | \
    xargs -0 ls -lhS | \
    head -10

# Process CSV: extract column, dedupe, count occurrences
# (pass the value as a positional parameter rather than splicing {} into the script)
cut -d',' -f2 data.csv | sort -u | xargs -I {} sh -c 'printf "%s: " "$1"; grep -c "$1" data.csv' _ {}
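When you only need the counts, a one-pass sort | uniq -c avoids rescanning the file for every distinct value. A sketch with a tiny made-up data.csv:

```shell
# Build a throwaway data.csv, then count column-2 values in one pass
f=$(mktemp)
printf 'a,x\nb,y\na,x\nc,y\n' > "$f"
cut -d',' -f2 "$f" | sort | uniq -c
#   2 x
#   2 y
```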

Handling Command Failures

By default, xargs keeps running the remaining commands after one fails; its final exit status is the only signal. To stop at the first failure instead:

# Stop at the first failure: xargs aborts when a command exits 255
find . -name "*.sh" -print0 | xargs -0 -n 1 sh -c 'shellcheck "$@" || exit 255' _

Or capture exit codes:

# Process and track failures (positional argument avoids quoting issues)
find . -name "*.test" -print0 | xargs -0 -I {} sh -c './run_test "$1" || echo "FAILED: $1"' _ {}
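Even without per-file tracking, xargs's own exit status reveals trouble: it exits 123 when any invocation returned a status between 1 and 125. A minimal demonstration:

```shell
# The invocation for input "2" fails (test 2 -ne 2 is false),
# so xargs reports 123 overall
seq 3 | xargs -n 1 sh -c 'test "$1" -ne 2' _
echo "xargs exit: $?"
# xargs exit: 123
```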

Performance Tips

Reduce Process Spawning

# Bad: spawns 'echo' for each file
find . -name "*.txt" | xargs -n 1 echo "Processing:"

# Good: batches into fewer commands
find . -name "*.txt" | xargs echo "Processing:"
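The difference is easy to measure, since each echo invocation emits one line:

```shell
# wc -l counts how many echo processes xargs spawned
seq 1000 | xargs -n 1 echo | wc -l   # 1000 invocations
seq 1000 | xargs echo | wc -l        # 1 (everything fits one command line)
```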

Limit Line Length

Some systems have argument length limits. Use -s to set max command length:

find . -name "*.log" | xargs -s 10000 rm
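To see the actual ceiling on your system (ARG_MAX is the kernel's per-exec limit; --show-limits is GNU xargs only):

```shell
# Kernel limit on total argument bytes for a single exec
getconf ARG_MAX
# GNU xargs reports the limits it will actually apply
xargs --show-limits < /dev/null
```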

Use Built-in Parallelism

Many commands have their own parallel options:

# gzip has no parallel mode of its own; pigz is a drop-in
# parallel replacement
find . -name "*.log" -print0 | xargs -0 pigz -p 4

Common Patterns

Backup Before Modify

find . -name "*.conf" -print0 | xargs -0 -I {} sh -c 'cp "$1" "$1.bak" && process "$1"' _ {}

Process With Index

# nl numbers each line; -n 2 splits index and path into $1 and $2
# (whitespace splitting, so this assumes space-free filenames)
find . -name "*.jpg" | nl | xargs -n 2 sh -c 'mv "$2" "image_$1.jpg"' _

Conditional Execution

# Only convert when the target doesn't exist yet
find . -name "*.md" | xargs -I {} sh -c '[ -f "${1%.md}.html" ] || pandoc "$1" -o "${1%.md}.html"' _ {}

Quick Reference

Flag    Purpose                          Example
-0      Null-delimited input             find -print0 | xargs -0
-I {}   Placeholder for arguments        xargs -I {} mv {} /dest/
-n N    N arguments per command          xargs -n 2
-P N    N parallel processes             xargs -P 4
-p      Prompt before execution          xargs -p rm
-t      Print commands before running    xargs -t
-r      Don’t run if input is empty      xargs -r rm

When Not to Use xargs

  • Simple cases: find . -name "*.tmp" -delete beats find | xargs rm
  • Complex logic: Use a proper script instead of sh -c chains
  • Already parallel tools: parallel (GNU Parallel) is more powerful for complex parallelism

xargs is for the sweet spot: batch operations where a loop is too slow and a full script is overkill.


Computing Arts is CLI craft for the modern practitioner. More at computingarts.com.