Cron is deceptively simple. Five fields, a command, done. Until your job runs twice simultaneously, silently fails for a week, or fills your disk with output nobody reads.
Here’s how to write cron jobs that actually work in production.
## The Basics Done Right

```bash
# Bad: no logging, no error handling
0 * * * * /opt/scripts/backup.sh

# Better: redirect output, capture errors
0 * * * * /opt/scripts/backup.sh >> /var/log/backup.log 2>&1

# Best: quiet on success, loud on failure, via chronic
0 * * * * chronic /opt/scripts/backup.sh
```
chronic (from moreutils) only outputs when the command fails. Perfect for cron — silent success, loud failure.
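If moreutils isn't available, chronic's core behavior is easy to approximate in a few lines of bash. This is a sketch of the idea, not a drop-in replacement (chronic has extra options this ignores):

```shell
#!/usr/bin/env bash
# Minimal approximation of chronic: capture all output, replay it only
# when the wrapped command exits non-zero.
quiet_run() {
    local out rc
    out=$("$@" 2>&1); rc=$?
    if [ "$rc" -ne 0 ] && [ -n "$out" ]; then
        printf '%s\n' "$out"
    fi
    return "$rc"
}

quiet_run true && echo "silent success"    # true prints nothing, exits 0
quiet_run false || echo "loud failure"     # non-zero exit is surfaced
```

Wrap the job's entry point in `quiet_run` and cron's MAILTO only fires when there is actually something to read.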
## Use Absolute Paths Everywhere
Cron runs with a minimal environment. PATH is usually just /usr/bin:/bin.
```bash
#!/usr/bin/env bash
# Inside your script: set PATH explicitly
export PATH="/usr/local/bin:/usr/bin:/bin"

# Or use absolute paths for each command
/usr/bin/python3 /opt/scripts/process.py
/usr/local/bin/aws s3 sync ...
```
Never assume a command is in PATH. Either set PATH explicitly or use absolute paths.
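One defensive pattern (a sketch; the command list and PATH value are whatever your job actually needs) is to verify every external command resolves before doing any work:

```shell
#!/usr/bin/env bash
# Fail fast if a required command isn't on the PATH cron provides,
# instead of dying halfway through the job.
set -euo pipefail
export PATH="/usr/local/bin:/usr/bin:/bin"

for cmd in date gzip tar; do
    if ! command -v "$cmd" >/dev/null; then
        echo "required command not found: $cmd" >&2
        exit 1
    fi
done
echo "all required commands found"
```

A missing dependency then fails loudly on the first line of the log rather than as a confusing mid-job error.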
## Lock Files Prevent Overlaps
If your job takes longer than its interval, you’ll have multiple instances fighting each other:
```bash
#!/usr/bin/env bash
LOCK_FILE="/var/run/backup.lock"

# Use flock on fd 200 for atomic locking
exec 200>"$LOCK_FILE"
flock -n 200 || { echo "Already running"; exit 1; }

# Your actual work here
/opt/scripts/backup.sh

# The lock is released automatically when the script exits
```
Or use flock directly in crontab:
```bash
0 * * * * flock -n /var/run/backup.lock /opt/scripts/backup.sh
```
The -n flag means “don’t wait” — if locked, exit immediately.
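You can watch those semantics in a terminal. A small demo (assuming util-linux flock; the /tmp path is only for the demo):

```shell
#!/usr/bin/env bash
# Hold the lock via fd 200, then show that a second, independent attempt
# with -n fails immediately instead of waiting.
LOCK=/tmp/flock-demo.lock
exec 200>"$LOCK"
flock -n 200 && echo "first attempt: acquired"

# A fresh flock invocation opens the file again, creating a new open file
# description, so it contends with the lock we already hold:
flock -n "$LOCK" -c 'echo never printed' || echo "second attempt: busy"
```

Swap `-n` for `-w 30` and the second attempt would instead wait up to 30 seconds before giving up.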
## Proper Error Handling

```bash
#!/usr/bin/env bash
set -euo pipefail

LOG_FILE="/var/log/myjob.log"
ALERT_EMAIL="ops@example.com"
TEMP_FILE="$(mktemp)"

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" >> "$LOG_FILE"
}

alert() {
    log "ALERT: $*"
    echo "$*" | mail -s "Cron Alert: $(hostname)" "$ALERT_EMAIL"
}

cleanup() {
    local exit_code=$?
    if [[ $exit_code -ne 0 ]]; then
        alert "Job failed with exit code $exit_code"
    fi
    rm -f "$TEMP_FILE"
}
trap cleanup EXIT

log "Starting job"
# Your work here
log "Job completed"
```
Key points:
- set -euo pipefail catches most silent failures
- Trap on EXIT for cleanup regardless of how the script ends
- Alert on non-zero exit codes
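The pipefail part is easy to underestimate. This self-contained snippet shows the failure mode it closes:

```shell
#!/usr/bin/env bash
# Without pipefail, a pipeline reports only the LAST command's status,
# so a failing producer on the left is invisible to set -e.
if false | cat; then
    echo "without pipefail: failure hidden"
fi

set -o pipefail
if ! false | cat; then
    echo "with pipefail: failure detected"
fi
```

Both branches print, demonstrating that the same failing pipeline looks "successful" until pipefail is on.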
## Environment Variables
Cron doesn’t load your shell profile. Define what you need:
```bash
# Set at the top of the crontab
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin
MAILTO=ops@example.com
HOME=/home/deploy

# Or source an environment file on the job line itself
0 * * * * . /home/deploy/.env && /opt/scripts/job.sh
```
For secrets, source an env file with restrictive permissions rather than putting credentials directly in the crontab, where they are easier to leak (via crontab -l, backups of the spool directory, or anyone shoulder-surfing an edit).
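A minimal sketch of that pattern (the path, variable name, and token are illustrative):

```shell
#!/usr/bin/env bash
# Create the secrets file owner-readable only, then source it in the job.
set -euo pipefail
ENV_FILE=/tmp/job.env            # in production: somewhere like /etc/myjob/env

umask 077                        # new files get mode 600
printf 'API_TOKEN=%s\n' "example-token" > "$ENV_FILE"

# At the top of the cron script:
set -a                           # auto-export everything the file defines
. "$ENV_FILE"
set +a
echo "token loaded (${#API_TOKEN} chars)"
```

The `set -a` / `set +a` bracket means child processes of the job see the variables without each line needing an explicit export.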
## Logging That's Actually Useful

```bash
#!/usr/bin/env bash
exec > >(tee -a /var/log/myjob.log) 2>&1

echo "=== $(date '+%Y-%m-%d %H:%M:%S') Starting ==="
echo "Environment: $(hostname), user: $(whoami)"
echo "Working directory: $(pwd)"

# Your work with natural echo statements
echo "Processing files..."
process_files

echo "=== $(date '+%Y-%m-%d %H:%M:%S') Completed ==="
```
The exec > >(tee ...) trick redirects all output to both stdout and a log file. Every command’s output gets captured.
## Log Rotation
Without rotation, logs grow forever:
```
# /etc/logrotate.d/myjob
/var/log/myjob.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    create 644 root root
}
```
Or rotate manually inside the script:
```bash
# Rotate if over 10 MB (stat -f%z is the BSD form, stat -c%s the GNU form)
if [[ -f "$LOG_FILE" && $(stat -f%z "$LOG_FILE" 2>/dev/null || stat -c%s "$LOG_FILE") -gt 10485760 ]]; then
    mv "$LOG_FILE" "$LOG_FILE.1"
    gzip "$LOG_FILE.1"
fi
```
## Random Delays (Avoid Thundering Herds)
If 100 servers all run a job at midnight, your central service gets hammered:
```bash
# Add a random delay of up to 5 minutes
# (the backslash escapes %, which cron would otherwise treat as a newline)
0 0 * * * sleep $((RANDOM \% 300)) && /opt/scripts/sync.sh
```
Or use systemd timers with RandomizedDelaySec:
```ini
[Timer]
OnCalendar=daily
RandomizedDelaySec=300
```
## Monitoring Cron Jobs
Silence is the enemy. If a job fails silently, you won’t know until it’s too late.
Option 1: Dead man’s switch
Services like Healthchecks.io or Cronitor give you a URL to ping on success:
```bash
/opt/scripts/backup.sh && curl -fsS -m 10 --retry 5 https://hc-ping.com/your-uuid
```
If the ping doesn’t arrive on schedule, you get alerted.
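Healthchecks.io also accepts pings on a /fail suffix, so you can report a failure immediately instead of waiting out the missed-ping grace period. A sketch (the UUID is a placeholder and run_job stands in for the real script):

```shell
#!/usr/bin/env bash
# Choose the success or failure ping URL based on the job's exit status.
URL="https://hc-ping.com/your-uuid"
run_job() { false; }             # stand-in: pretend the job failed

if run_job; then
    PING="$URL"
else
    PING="$URL/fail"
fi
echo "would ping: $PING"
# curl -fsS -m 10 --retry 5 "$PING" > /dev/null
```

With a real job, uncomment the curl line; the dead man's switch then distinguishes "ran and failed" from "never ran at all".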
Option 2: Metrics
Push job results to Prometheus/InfluxDB:
```bash
END_TIME=$(date +%s)
DURATION=$((END_TIME - START_TIME))
echo "cron_job_duration_seconds{job=\"backup\"} $DURATION" \
  | curl --data-binary @- http://pushgateway:9091/metrics/job/cron
```
Option 3: Structured logs to aggregator
```bash
echo '{"job":"backup","status":"success","duration_s":'"$DURATION"'}' | logger -t cron-jobs
```
Then alert on missing success logs or elevated error rates.
## Use Systemd Timers for Complex Jobs
For anything beyond simple scripts, systemd timers beat cron:
```ini
# /etc/systemd/system/backup.timer
[Unit]
Description=Daily backup timer

[Timer]
OnCalendar=*-*-* 02:00:00
RandomizedDelaySec=300
Persistent=true

[Install]
WantedBy=timers.target
```
```ini
# /etc/systemd/system/backup.service
[Unit]
Description=Backup job

[Service]
Type=oneshot
ExecStart=/opt/scripts/backup.sh
User=backup
StandardOutput=journal
StandardError=journal
```
Benefits over cron:
- Dependencies (After=, Requires=)
- Resource limits (MemoryMax=, CPUQuota=)
- Better logging (journald integration)
- Persistent=true runs missed jobs after reboot
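Once the unit files are in place, day-to-day operation is standard systemctl and journalctl (shown as a fragment; the unit names match the files above):

```shell
systemctl daemon-reload
systemctl enable --now backup.timer
systemctl list-timers backup.timer      # shows next and last trigger times
journalctl -u backup.service -n 50      # recent job output from journald
```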
## Quick Checklist

Before deploying any cron job:
- Absolute paths, or PATH set explicitly
- Output logged somewhere that gets rotated
- Locking (flock) if runs can overlap
- Alerting on failure (chronic, MAILTO, or a dead man's switch)
- A random or staggered delay if it runs on many hosts at once
- A test run under cron's minimal environment, not just your interactive shell
Cron jobs are infrastructure. Treat them with the same rigor as your application code.