Shell Scripting Patterns That Prevent 3 AM Pages

Every ops engineer has a story about a shell script that worked perfectly — until it didn’t. Usually at 3 AM. Usually in production. These patterns won’t make your scripts bulletproof, but they’ll stop the most common failures. Start Every Script Right 1 2 3 #!/usr/bin/env bash set -euo pipefail IFS=$'\n\t' This preamble should be muscle memory: set -e: Exit on any error (non-zero return code) set -u: Exit on undefined variables set -o pipefail: Catch errors in pipes (not just the last command) IFS=$'\n\t': Safer word splitting (no spaces) Without these, a typo like rm -rf $UNSET_VAR/ could wipe your root filesystem. ...

February 27, 2026 · 5 min · 1001 words · Rob Washington

Bash Scripting Patterns That Prevent Disasters

Bash scripts have a reputation for being fragile. They don’t have to be. Here are the patterns that separate scripts that work from scripts that work reliably. Start Every Script Right 1 2 3 #!/usr/bin/env bash set -euo pipefail IFS=$'\n\t' What each does: set -e - Exit on any command failure set -u - Error on undefined variables set -o pipefail - Pipelines fail if any command fails IFS=$'\n\t' - Safer word splitting (no space splitting) Error Handling Basic Trap 1 2 3 4 5 6 7 8 9 10 11 12 #!/usr/bin/env bash set -euo pipefail cleanup() { echo "Cleaning up..." rm -f "$TEMP_FILE" } trap cleanup EXIT TEMP_FILE=$(mktemp) # Script continues... # cleanup runs automatically on exit, error, or interrupt Detailed Error Reporting 1 2 3 4 5 6 7 8 9 10 11 12 #!/usr/bin/env bash set -euo pipefail error_handler() { local line=$1 local exit_code=$2 echo "Error on line $line: exit code $exit_code" >&2 exit "$exit_code" } trap 'error_handler $LINENO $?' ERR # Now errors report their line number Log and Exit 1 2 3 4 5 6 7 die() { echo "ERROR: $*" >&2 exit 1 } # Usage [[ -f "$CONFIG_FILE" ]] || die "Config file not found: $CONFIG_FILE" Argument Parsing Simple Positional 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 #!/usr/bin/env bash set -euo pipefail usage() { echo "Usage: $0 <environment> <version>" echo " environment: staging|production" echo " version: semver (e.g., 1.2.3)" exit 1 } [[ $# -eq 2 ]] || usage ENVIRONMENT=$1 VERSION=$2 [[ "$ENVIRONMENT" =~ ^(staging|production)$ ]] || die "Invalid environment" [[ "$VERSION" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]] || die "Invalid version format" Flags with getopts 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 #!/usr/bin/env bash set -euo pipefail VERBOSE=false DRY_RUN=false OUTPUT="" usage() { cat <<EOF Usage: $0 [options] <file> Options: -v Verbose output -n Dry run (don't make changes) -o FILE Output file -h Show this help EOF exit 1 } while getopts "vno:h" opt; do case $opt in v) VERBOSE=true ;; n) DRY_RUN=true ;; o) OUTPUT=$OPTARG ;; h) usage ;; *) usage ;; esac done shift $((OPTIND - 1)) [[ $# -eq 1 ]] || usage FILE=$1 Long Options (Manual Parsing) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 #!/usr/bin/env bash set -euo pipefail VERBOSE=false CONFIG="" while [[ $# -gt 0 ]]; do case $1 in -v|--verbose) VERBOSE=true shift ;; -c|--config) CONFIG=$2 shift 2 ;; -h|--help) usage ;; --) shift break ;; -*) die "Unknown option: $1" ;; *) break ;; esac done Variable Safety Default Values 1 2 3 4 5 6 7 8 # Default if unset NAME=${NAME:-"default"} # Default if unset or empty NAME=${NAME:="default"} # Error if unset : "${REQUIRED_VAR:?'REQUIRED_VAR must be set'}" Safe Variable Expansion 1 2 3 4 5 6 7 8 # Always quote variables rm "$FILE" # Good rm $FILE # Bad - breaks on spaces # Check before using if [[ -n "${VAR:-}" ]]; then echo "$VAR" fi Arrays 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 FILES=("file1.txt" "file with spaces.txt" "file3.txt") # Iterate safely for file in "${FILES[@]}"; do echo "Processing: $file" done # Pass to commands cp "${FILES[@]}" /destination/ # Length echo "Count: ${#FILES[@]}" # Append FILES+=("another.txt") File Operations Safe Temporary Files 1 2 3 4 5 TEMP_DIR=$(mktemp -d) trap 'rm -rf "$TEMP_DIR"' EXIT TEMP_FILE=$(mktemp) trap 'rm -f "$TEMP_FILE"' EXIT Check Before Acting 1 2 3 4 5 6 7 8 9 10 11 12 13 14 # File exists [[ -f "$FILE" ]] || die "File not found: $FILE" # Directory exists [[ -d "$DIR" ]] || die "Directory not found: $DIR" # File is readable [[ -r "$FILE" ]] || die "Cannot read: $FILE" # File is writable [[ -w "$FILE" ]] || die "Cannot write: $FILE" # File is executable [[ -x "$FILE" ]] || die "Cannot execute: $FILE" Safe File Writing 1 2 3 4 5 6 7 8 9 10 # Atomic write with temp file write_config() { local content=$1 local dest=$2 local temp temp=$(mktemp) echo "$content" > "$temp" mv "$temp" "$dest" # Atomic on same filesystem } Command Execution Check Command Exists 1 2 3 4 5 6 7 require_command() { command -v "$1" >/dev/null 2>&1 || die "Required command not found: $1" } require_command jq require_command aws require_command docker Capture Output and Exit Code 1 2 3 4 5 6 7 8 9 10 11 # Capture output output=$(some_command 2>&1) # Capture exit code without exiting (despite set -e) exit_code=0 output=$(some_command 2>&1) || exit_code=$? if [[ $exit_code -ne 0 ]]; then echo "Command failed with code $exit_code" echo "Output: $output" fi Retry Logic 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 retry() { local max_attempts=$1 local delay=$2 shift 2 local cmd=("$@") local attempt=1 while [[ $attempt -le $max_attempts ]]; do if "${cmd[@]}"; then return 0 fi echo "Attempt $attempt/$max_attempts failed. Retrying in ${delay}s..." sleep "$delay" ((attempt++)) done return 1 } # Usage retry 3 5 curl -f https://api.example.com/health Logging 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 LOG_FILE="/var/log/myscript.log" log() { local level=$1 shift echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$level] $*" | tee -a "$LOG_FILE" } info() { log INFO "$@"; } warn() { log WARN "$@"; } error() { log ERROR "$@" >&2; } # Usage info "Starting deployment" warn "Deprecated config option used" error "Failed to connect to database" Confirmation Prompts 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 confirm() { local prompt=${1:-"Continue?"} local response read -r -p "$prompt [y/N] " response [[ "$response" =~ ^[Yy]$ ]] } # Usage if confirm "Delete all files in $DIR?"; then rm -rf "$DIR"/* fi # With default yes confirm_yes() { local prompt=${1:-"Continue?"} local response read -r -p "$prompt [Y/n] " response [[ ! "$response" =~ ^[Nn]$ ]] } Parallel Execution 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 # Simple background jobs for server in server1 server2 server3; do deploy_to "$server" & done wait # Wait for all background jobs # With job limiting MAX_JOBS=4 job_count=0 for item in "${ITEMS[@]}"; do process_item "$item" & ((job_count++)) if [[ $job_count -ge $MAX_JOBS ]]; then wait -n # Wait for any one job to complete ((job_count--)) fi done wait # Wait for remaining jobs Complete Script Template 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 #!/usr/bin/env bash # # Description: What this script does # Usage: ./script.sh [options] <args> # set -euo pipefail IFS=$'\n\t' # Constants readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" readonly SCRIPT_NAME="$(basename "${BASH_SOURCE[0]}")" # Defaults VERBOSE=false DRY_RUN=false # Logging log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*"; } info() { log "INFO: $*"; } warn() { log "WARN: $*" >&2; } error() { log "ERROR: $*" >&2; } die() { error "$@"; exit 1; } # Cleanup cleanup() { # Add cleanup tasks here : } trap cleanup EXIT # Usage usage() { cat <<EOF Usage: $SCRIPT_NAME [options] <argument> Description: What this script does in more detail. Options: -v, --verbose Enable verbose output -n, --dry-run Show what would be done -h, --help Show this help message Examples: $SCRIPT_NAME -v input.txt $SCRIPT_NAME --dry-run config.yml EOF exit "${1:-0}" } # Parse arguments parse_args() { while [[ $# -gt 0 ]]; do case $1 in -v|--verbose) VERBOSE=true; shift ;; -n|--dry-run) DRY_RUN=true; shift ;; -h|--help) usage 0 ;; --) shift; break ;; -*) die "Unknown option: $1" ;; *) break ;; esac done [[ $# -ge 1 ]] || die "Missing required argument" ARGUMENT=$1 } # Main logic main() { parse_args "$@" info "Starting with argument: $ARGUMENT" if $VERBOSE; then info "Verbose mode enabled" fi if $DRY_RUN; then info "Dry run - no changes will be made" fi # Your script logic here info "Done" } main "$@" Bash scripts don’t need to be fragile. set -euo pipefail catches most accidents. Proper argument parsing makes scripts usable. Traps ensure cleanup happens. These patterns transform one-off hacks into reliable automation. Use the template, adapt as needed, and stop being afraid of your own scripts. ...

February 26, 2026 · 7 min · 1491 words · Rob Washington

Bash Scripting Patterns: Write Scripts That Don't Embarrass You

Bash scripts have a reputation for being fragile, unreadable hacks. They don’t have to be. These patterns will help you write scripts that are maintainable, debuggable, and safe to run in production. The Essentials: Start Every Script Right 1 2 3 4 #!/usr/bin/env bash set -euo pipefail IFS=$'\n\t' What these do: set -e — Exit on any error set -u — Error on undefined variables set -o pipefail — Catch errors in pipelines IFS=$'\n\t' — Safer word splitting (no spaces) Script Template 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 #!/usr/bin/env bash set -euo pipefail # Script metadata readonly SCRIPT_NAME="$(basename "$0")" readonly SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" readonly SCRIPT_VERSION="1.0.0" # Default values VERBOSE=false DRY_RUN=false CONFIG_FILE="" # Colors (if terminal supports them) if [[ -t 1 ]]; then readonly RED='\033[0;31m' readonly GREEN='\033[0;32m' readonly YELLOW='\033[0;33m' readonly NC='\033[0m' # No Color else readonly RED='' GREEN='' YELLOW='' NC='' fi # Logging functions log_info() { echo -e "${GREEN}[INFO]${NC} $*"; } log_warn() { echo -e "${YELLOW}[WARN]${NC} $*" >&2; } log_error() { echo -e "${RED}[ERROR]${NC} $*" >&2; } log_debug() { [[ "$VERBOSE" == true ]] && echo -e "[DEBUG] $*" || true; } # Cleanup function cleanup() { local exit_code=$? # Add cleanup tasks here log_debug "Cleaning up..." exit $exit_code } trap cleanup EXIT # Usage usage() { cat <<EOF Usage: $SCRIPT_NAME [OPTIONS] <argument> Description of what this script does. Options: -h, --help Show this help message -v, --verbose Enable verbose output -n, --dry-run Show what would be done -c, --config FILE Path to config file --version Show version Examples: $SCRIPT_NAME -v input.txt $SCRIPT_NAME --config /etc/myapp.conf data/ EOF } # Parse arguments parse_args() { while [[ $# -gt 0 ]]; do case $1 in -h|--help) usage exit 0 ;; -v|--verbose) VERBOSE=true shift ;; -n|--dry-run) DRY_RUN=true shift ;; -c|--config) CONFIG_FILE="$2" shift 2 ;; --version) echo "$SCRIPT_NAME version $SCRIPT_VERSION" exit 0 ;; -*) log_error "Unknown option: $1" usage exit 1 ;; *) ARGS+=("$1") shift ;; esac done } # Validation validate() { if [[ ${#ARGS[@]} -eq 0 ]]; then log_error "Missing required argument" usage exit 1 fi if [[ -n "$CONFIG_FILE" && ! -f "$CONFIG_FILE" ]]; then log_error "Config file not found: $CONFIG_FILE" exit 1 fi } # Main logic main() { log_info "Starting $SCRIPT_NAME" log_debug "Arguments: ${ARGS[*]}" for arg in "${ARGS[@]}"; do if [[ "$DRY_RUN" == true ]]; then log_info "[DRY RUN] Would process: $arg" else log_info "Processing: $arg" # Actual work here fi done log_info "Done" } # Entry point ARGS=() parse_args "$@" validate main Error Handling Check Command Existence 1 2 3 4 5 6 7 8 9 10 require_command() { if ! command -v "$1" &> /dev/null; then log_error "Required command not found: $1" exit 1 fi } require_command curl require_command jq require_command docker Retry Pattern 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 retry() { local max_attempts=$1 local delay=$2 shift 2 local cmd=("$@") local attempt=1 while (( attempt <= max_attempts )); do if "${cmd[@]}"; then return 0 fi log_warn "Attempt $attempt/$max_attempts failed, retrying in ${delay}s..." sleep "$delay" ((attempt++)) done log_error "Command failed after $max_attempts attempts: ${cmd[*]}" return 1 } # Usage retry 3 5 curl -sf https://api.example.com/health Timeout Pattern 1 2 3 4 5 6 7 8 9 10 11 12 13 14 with_timeout() { local timeout=$1 shift timeout "$timeout" "$@" || { local exit_code=$? if [[ $exit_code -eq 124 ]]; then log_error "Command timed out after ${timeout}s" fi return $exit_code } } # Usage with_timeout 30 curl -s https://api.example.com Safe File Operations Temporary Files 1 2 3 4 5 6 7 # Create temp file that's automatically cleaned up TEMP_FILE=$(mktemp) trap 'rm -f "$TEMP_FILE"' EXIT # Or temp directory TEMP_DIR=$(mktemp -d) trap 'rm -rf "$TEMP_DIR"' EXIT Safe Write (Atomic) 1 2 3 4 5 6 7 8 9 10 11 12 safe_write() { local target=$1 local content=$2 local temp_file temp_file=$(mktemp) echo "$content" > "$temp_file" mv "$temp_file" "$target" } # Usage safe_write /etc/myapp/config.json '{"key": "value"}' Backup Before Modify 1 2 3 4 5 6 7 8 9 backup_file() { local file=$1 local backup="${file}.bak.$(date +%Y%m%d_%H%M%S)" if [[ -f "$file" ]]; then cp "$file" "$backup" log_info "Backup created: $backup" fi } Input Validation File Checks 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 validate_file() { local file=$1 [[ -e "$file" ]] || { log_error "File not found: $file"; return 1; } [[ -f "$file" ]] || { log_error "Not a regular file: $file"; return 1; } [[ -r "$file" ]] || { log_error "File not readable: $file"; return 1; } } validate_directory() { local dir=$1 [[ -e "$dir" ]] || { log_error "Directory not found: $dir"; return 1; } [[ -d "$dir" ]] || { log_error "Not a directory: $dir"; return 1; } [[ -w "$dir" ]] || { log_error "Directory not writable: $dir"; return 1; } } Input Sanitization 1 2 3 4 5 6 7 8 sanitize_filename() { local input=$1 # Remove path components and dangerous characters echo "${input##*/}" | tr -cd '[:alnum:]._-' } # Usage safe_name=$(sanitize_filename "$user_input") Configuration Config File Loading 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 load_config() { local config_file=$1 if [[ -f "$config_file" ]]; then log_debug "Loading config: $config_file" # shellcheck source=/dev/null source "$config_file" else log_warn "Config file not found, using defaults: $config_file" fi } # Config file format (config.sh): # DB_HOST="localhost" # DB_PORT=5432 # ENABLE_FEATURE=true Environment Variable Defaults 1 2 3 4 5 6 : "${DB_HOST:=localhost}" : "${DB_PORT:=5432}" : "${LOG_LEVEL:=info}" # Or with validation DB_HOST="${DB_HOST:?DB_HOST environment variable required}" Parallel Execution Using xargs 1 2 3 4 process_files() { find . -name "*.txt" -print0 | \ xargs -0 -P 4 -I {} bash -c 'process_single_file "$1"' _ {} } Using Background Jobs 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 parallel_tasks() { local pids=() for item in "${items[@]}"; do process_item "$item" & pids+=($!) done # Wait for all and check results local failed=0 for pid in "${pids[@]}"; do if ! wait "$pid"; then ((failed++)) fi done [[ $failed -eq 0 ]] || log_error "$failed tasks failed" return $failed } Locking Prevent Concurrent Runs 1 2 3 4 5 6 7 8 9 10 11 LOCK_FILE="/var/run/${SCRIPT_NAME}.lock" acquire_lock() { if ! mkdir "$LOCK_FILE" 2>/dev/null; then log_error "Another instance is running (lock: $LOCK_FILE)" exit 1 fi trap 'rm -rf "$LOCK_FILE"' EXIT } acquire_lock With flock 1 2 3 4 5 6 7 8 9 10 11 12 LOCK_FD=200 LOCK_FILE="/var/run/${SCRIPT_NAME}.lock" acquire_lock() { eval "exec $LOCK_FD>$LOCK_FILE" if ! flock -n $LOCK_FD; then log_error "Another instance is running" exit 1 fi } acquire_lock Testing Assertions 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 assert_equals() { local expected=$1 actual=$2 message=${3:-"Assertion failed"} if [[ "$expected" != "$actual" ]]; then log_error "$message: expected '$expected', got '$actual'" return 1 fi } assert_file_exists() { local file=$1 if [[ ! -f "$file" ]]; then log_error "File should exist: $file" return 1 fi } # Usage assert_equals "200" "$status_code" "HTTP status check" assert_file_exists "/tmp/output.txt" Test Mode 1 2 3 4 5 6 7 if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then # Script is being run directly main "$@" else # Script is being sourced (for testing) log_debug "Script sourced, not executing main" fi Debugging Debug Mode 1 [[ "${DEBUG:-}" == true ]] && set -x Trace Function Calls 1 2 3 4 5 6 7 8 trace() { log_debug "TRACE: ${FUNCNAME[1]}(${*})" } my_function() { trace "$@" # function body } Common Pitfalls Quote Everything 1 2 3 4 5 # Bad if [ $var == "value" ]; then # Good if [[ "$var" == "value" ]]; then Use Arrays for Lists 1 2 3 4 5 6 7 # Bad files="file1.txt file2.txt file with spaces.txt" for f in $files; do echo "$f"; done # Good files=("file1.txt" "file2.txt" "file with spaces.txt") for f in "${files[@]}"; do echo "$f"; done Check Command Success 1 2 3 4 5 6 7 8 9 10 11 # Bad output=$(some_command) echo "$output" # Good if output=$(some_command 2>&1); then echo "$output" else log_error "Command failed: $output" exit 1 fi Good bash scripts are defensive, verbose when needed, and fail fast. Start with the template, add what you need, and resist the urge to be clever. ...

February 25, 2026 · 8 min · 1525 words · Rob Washington