Cron is deceptively simple. Five fields, a command, done. Until your job runs twice simultaneously, silently fails for a week, or fills your disk with output nobody reads.
Here’s how to write cron jobs that actually work in production.
## The Basics Done Right

```bash
# Bad: no logging, no error handling
0 * * * * /opt/scripts/backup.sh

# Better: redirect output, capture errors
0 * * * * /opt/scripts/backup.sh >> /var/log/backup.log 2>&1

# Best: quiet on success, loud on failure, via chronic
0 * * * * chronic /opt/scripts/backup.sh
```
chronic (from moreutils) only outputs when the command fails. Perfect for cron — silent success, loud failure.
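If moreutils isn't available, chronic's core behavior is easy to approximate in a few lines of bash. This is a sketch of the idea, not a drop-in replacement (chronic has extra options this ignores):

```shell
#!/usr/bin/env bash
# Minimal approximation of chronic: capture all output, replay it only
# when the wrapped command exits non-zero.
quiet_run() {
    local out rc
    out=$("$@" 2>&1); rc=$?
    if [ "$rc" -ne 0 ] && [ -n "$out" ]; then
        printf '%s\n' "$out"
    fi
    return "$rc"
}

quiet_run true && echo "silent success"    # true prints nothing, exits 0
quiet_run false || echo "loud failure"     # non-zero exit is surfaced
```

Wrap the job's entry point in `quiet_run` and cron's MAILTO only fires when there is actually something to read.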
## Use Absolute Paths Everywhere
Cron runs with a minimal environment. PATH is usually just /usr/bin:/bin.
```bash
#!/usr/bin/env bash
# Inside your script: set PATH explicitly
export PATH="/usr/local/bin:/usr/bin:/bin"

# Or use absolute paths for each command
/usr/bin/python3 /opt/scripts/process.py
/usr/local/bin/aws s3 sync ...
```
Never assume a command is in PATH. Either set PATH explicitly or use absolute paths.
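One defensive pattern (a sketch; the command list and PATH value are whatever your job actually needs) is to verify every external command resolves before doing any work:

```shell
#!/usr/bin/env bash
# Fail fast if a required command isn't on the PATH cron provides,
# instead of dying halfway through the job.
set -euo pipefail
export PATH="/usr/local/bin:/usr/bin:/bin"

for cmd in date gzip tar; do
    if ! command -v "$cmd" >/dev/null; then
        echo "required command not found: $cmd" >&2
        exit 1
    fi
done
echo "all required commands found"
```

A missing dependency then fails loudly on the first line of the log rather than as a confusing mid-job error.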
## Lock Files Prevent Overlaps
If your job takes longer than its interval, you’ll have multiple instances fighting each other:
```bash
#!/usr/bin/env bash
LOCK_FILE="/var/run/backup.lock"

# Use flock on fd 200 for atomic locking
exec 200>"$LOCK_FILE"
flock -n 200 || { echo "Already running"; exit 1; }

# Your actual work here
/opt/scripts/backup.sh

# The lock is released automatically when the script exits
```
Or use flock directly in crontab:
```bash
0 * * * * flock -n /var/run/backup.lock /opt/scripts/backup.sh
```
The -n flag means “don’t wait” — if locked, exit immediately.
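You can watch those semantics in a terminal. A small demo (assuming util-linux flock; the /tmp path is only for the demo):

```shell
#!/usr/bin/env bash
# Hold the lock via fd 200, then show that a second, independent attempt
# with -n fails immediately instead of waiting.
LOCK=/tmp/flock-demo.lock
exec 200>"$LOCK"
flock -n 200 && echo "first attempt: acquired"

# A fresh flock invocation opens the file again, creating a new open file
# description, so it contends with the lock we already hold:
flock -n "$LOCK" -c 'echo never printed' || echo "second attempt: busy"
```

Swap `-n` for `-w 30` and the second attempt would instead wait up to 30 seconds before giving up.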
## Proper Error Handling

```bash
#!/usr/bin/env bash
set -euo pipefail

LOG_FILE="/var/log/myjob.log"
ALERT_EMAIL="ops@example.com"
TEMP_FILE="$(mktemp)"

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" >> "$LOG_FILE"
}

alert() {
    log "ALERT: $*"
    echo "$*" | mail -s "Cron Alert: $(hostname)" "$ALERT_EMAIL"
}

cleanup() {
    local exit_code=$?
    if [[ $exit_code -ne 0 ]]; then
        alert "Job failed with exit code $exit_code"
    fi
    rm -f "$TEMP_FILE"
}
trap cleanup EXIT

log "Starting job"
# Your work here
log "Job completed"
```
Key points:
- set -euo pipefail catches most silent failures
- Trap on EXIT for cleanup regardless of how the script ends
- Alert on non-zero exit codes
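The pipefail part is easy to underestimate. This self-contained snippet shows the failure mode it closes:

```shell
#!/usr/bin/env bash
# Without pipefail, a pipeline reports only the LAST command's status,
# so a failing producer on the left is invisible to set -e.
if false | cat; then
    echo "without pipefail: failure hidden"
fi

set -o pipefail
if ! false | cat; then
    echo "with pipefail: failure detected"
fi
```

Both branches print, demonstrating that the same failing pipeline looks "successful" until pipefail is on.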
## Environment Variables
Cron doesn’t load your shell profile. Define what you need:
```bash
# Set at the top of the crontab
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin
MAILTO=ops@example.com
HOME=/home/deploy

# Or source an environment file on the job line itself
0 * * * * . /home/deploy/.env && /opt/scripts/job.sh
```
For secrets, source an env file with restrictive permissions rather than putting credentials directly in the crontab, where they are easier to leak (via crontab -l, backups of the spool directory, or anyone shoulder-surfing an edit).
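A minimal sketch of that pattern (the path, variable name, and token are illustrative):

```shell
#!/usr/bin/env bash
# Create the secrets file owner-readable only, then source it in the job.
set -euo pipefail
ENV_FILE=/tmp/job.env            # in production: somewhere like /etc/myjob/env

umask 077                        # new files get mode 600
printf 'API_TOKEN=%s\n' "example-token" > "$ENV_FILE"

# At the top of the cron script:
set -a                           # auto-export everything the file defines
. "$ENV_FILE"
set +a
echo "token loaded (${#API_TOKEN} chars)"
```

The `set -a` / `set +a` bracket means child processes of the job see the variables without each line needing an explicit export.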
## Logging That's Actually Useful

```bash
#!/usr/bin/env bash
exec > >(tee -a /var/log/myjob.log) 2>&1

echo "=== $(date '+%Y-%m-%d %H:%M:%S') Starting ==="
echo "Environment: $(hostname), user: $(whoami)"
echo "Working directory: $(pwd)"

# Your work with natural echo statements
echo "Processing files..."
process_files

echo "=== $(date '+%Y-%m-%d %H:%M:%S') Completed ==="
```
The exec > >(tee ...) trick redirects all output to both stdout and a log file. Every command’s output gets captured.
## Log Rotation
Without rotation, logs grow forever:
```
# /etc/logrotate.d/myjob
/var/log/myjob.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    create 644 root root
}
```
Or rotate manually inside the script:
```bash
# Rotate if over 10 MB (stat -f%z is the BSD form, stat -c%s the GNU form)
if [[ -f "$LOG_FILE" && $(stat -f%z "$LOG_FILE" 2>/dev/null || stat -c%s "$LOG_FILE") -gt 10485760 ]]; then
    mv "$LOG_FILE" "$LOG_FILE.1"
    gzip "$LOG_FILE.1"
fi
```
## Random Delays (Avoid Thundering Herds)
If 100 servers all run a job at midnight, your central service gets hammered:
```bash
# Add a random delay of up to 5 minutes
# (the backslash escapes %, which cron would otherwise treat as a newline)
0 0 * * * sleep $((RANDOM \% 300)) && /opt/scripts/sync.sh
```
Or use systemd timers with RandomizedDelaySec:
```ini
[Timer]
OnCalendar=daily
RandomizedDelaySec=300
```
## Monitoring Cron Jobs
Silence is the enemy. If a job fails silently, you won’t know until it’s too late.
Option 1: Dead man’s switch
Services like Healthchecks.io or Cronitor give you a URL to ping on success:
```bash
/opt/scripts/backup.sh && curl -fsS -m 10 --retry 5 https://hc-ping.com/your-uuid
```
If the ping doesn’t arrive on schedule, you get alerted.
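Healthchecks.io also accepts pings on a /fail suffix, so you can report a failure immediately instead of waiting out the missed-ping grace period. A sketch (the UUID is a placeholder and run_job stands in for the real script):

```shell
#!/usr/bin/env bash
# Choose the success or failure ping URL based on the job's exit status.
URL="https://hc-ping.com/your-uuid"
run_job() { false; }             # stand-in: pretend the job failed

if run_job; then
    PING="$URL"
else
    PING="$URL/fail"
fi
echo "would ping: $PING"
# curl -fsS -m 10 --retry 5 "$PING" > /dev/null
```

With a real job, uncomment the curl line; the dead man's switch then distinguishes "ran and failed" from "never ran at all".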
Option 2: Metrics
Push job results to Prometheus/InfluxDB:
```bash
END_TIME=$(date +%s)
DURATION=$((END_TIME - START_TIME))
echo "cron_job_duration_seconds{job=\"backup\"} $DURATION" \
  | curl --data-binary @- http://pushgateway:9091/metrics/job/cron
```
Option 3: Structured logs to aggregator
```bash
echo '{"job":"backup","status":"success","duration_s":'"$DURATION"'}' | logger -t cron-jobs
```
Then alert on missing success logs or elevated error rates.
## Use Systemd Timers for Complex Jobs
For anything beyond simple scripts, systemd timers beat cron:
```ini
# /etc/systemd/system/backup.timer
[Unit]
Description=Daily backup timer

[Timer]
OnCalendar=*-*-* 02:00:00
RandomizedDelaySec=300
Persistent=true

[Install]
WantedBy=timers.target
```
```ini
# /etc/systemd/system/backup.service
[Unit]
Description=Backup job

[Service]
Type=oneshot
ExecStart=/opt/scripts/backup.sh
User=backup
StandardOutput=journal
StandardError=journal
```
Benefits over cron:
- Dependencies (After=, Requires=)
- Resource limits (MemoryMax=, CPUQuota=)
- Better logging (journald integration)
- Persistent=true runs missed jobs after reboot
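Once the unit files are in place, day-to-day operation is standard systemctl and journalctl (shown as a fragment; the unit names match the files above):

```shell
systemctl daemon-reload
systemctl enable --now backup.timer
systemctl list-timers backup.timer      # shows next and last trigger times
journalctl -u backup.service -n 50      # recent job output from journald
```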
## Quick Checklist

Before deploying any cron job:
- Absolute paths, or PATH set explicitly
- Output logged somewhere that gets rotated
- Locking (flock) if runs can overlap
- Alerting on failure (chronic, MAILTO, or a dead man's switch)
- A random or staggered delay if it runs on many hosts at once
- A test run under cron's minimal environment, not just your interactive shell
Cron jobs are infrastructure. Treat them with the same rigor as your application code.