Ansible’s learning curve is gentle until it isn’t. Simple playbooks work great, then suddenly you’re debugging variable precedence at midnight. Here are patterns that keep automation maintainable as it scales.

Directory Structure That Scales

Forget the flat playbook approach. Use roles from day one:

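ansible/
├── inventory/
│   ├── production/
│   │   ├── hosts.yml
│   │   └── group_vars/
│   │       ├── all.yml
│   │       ├── webservers.yml
│   │       └── databases.yml
│   └── staging/
├── roles/
│   ├── common/
│   ├── nginx/
│   ├── postgresql/
│   └── app/
├── playbooks/
│   ├── site.yml
│   ├── webservers.yml
│   └── databases.yml
└── ansible.cfg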

Each role follows the standard structure:

roles/nginx/
├── tasks/
│   └── main.yml
├── handlers/
│   └── main.yml
├── templates/
│   └── nginx.conf.j2
├── defaults/
│   └── main.yml      # Low-priority defaults
├── vars/
│   └── main.yml      # High-priority variables
└── meta/
    └── main.yml      # Dependencies
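
To make the role layout concrete, here is a rough sketch of a playbook that applies roles to a group. The file name playbooks/webservers.yml, the webservers group, and the become setting are assumptions for illustration, not a prescribed setup.

```yaml
# playbooks/webservers.yml (illustrative sketch)
- name: Configure web servers
  hosts: webservers        # assumed inventory group
  become: true             # assumption: the roles need root privileges
  roles:
    - common               # baseline role, applied first
    - nginx                # runs roles/nginx/tasks/main.yml
```

Roles run in the order listed, so shared setup belongs in the first entry.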

Variable Precedence (The Short Version)

Ansible defines 22 levels of variable precedence. You don’t need to memorize them. In practice these five, listed from lowest to highest precedence, cover almost everything:

  1. defaults/main.yml — role defaults, easily overridden
  2. group_vars/ — environment/group-specific
  3. host_vars/ — single-host overrides
  4. vars/main.yml — role internals, rarely overridden
  5. Extra vars (-e) — nuclear option, overrides everything

# roles/nginx/defaults/main.yml
nginx_worker_processes: auto
nginx_worker_connections: 1024

# inventory/production/group_vars/webservers.yml
nginx_worker_connections: 4096  # Override for prod
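
If you are ever unsure which value wins, a host-level override plus a debug task makes it visible. This is a minimal sketch: the host name web01 and the value 8192 are made up for illustration.

```yaml
# inventory/production/host_vars/web01.yml (hypothetical host)
nginx_worker_connections: 8192   # beats group_vars, but only on web01
---
# In any play that targets the webservers group
- name: Show the value each host ends up with
  debug:
    var: nginx_worker_connections
```

With the files above, web01 should report 8192 while the other production web servers report 4096.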

Idempotency: The Core Principle

Every task should be safe to run twice. If it’s not, you’re doing it wrong.

# Bad - runs every time
- name: Add line to config
  shell: echo "option=value" >> /etc/app.conf

# Good - only changes if needed
- name: Ensure option in config
  lineinfile:
    path: /etc/app.conf
    line: "option=value"
    state: present
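
Sometimes no module fits and a command is unavoidable. In that case the command module's creates argument can usually restore idempotency. A minimal sketch, assuming a hypothetical installer script and install path:

```yaml
- name: Run the installer only if the app is not installed yet
  command:
    cmd: /opt/installer/install.sh --prefix /opt/app   # hypothetical script
    creates: /opt/app/bin/app                          # task is skipped once this file exists
```

On the second run the file already exists, so the task reports ok instead of changed.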

For complex file changes, use templates:

- name: Configure application
  template:
    src: app.conf.j2
    dest: /etc/app.conf
    owner: app
    group: app
    mode: '0644'
  notify: restart app
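
The notify line only works if a handler with exactly that name exists. A sketch of the matching handler, assuming the service is called app:

```yaml
# handlers/main.yml
- name: restart app        # must match the string used in notify
  service:
    name: app              # assumed service name
    state: restarted
```

The handler fires only when the template task reports a change, so an unchanged config never causes a restart.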

Handlers: Don’t Restart Unnecessarily

Handlers run once at the end of the play, even if notified multiple times, and only when a notifying task actually reports a change:

# tasks/main.yml
- name: Update nginx config
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: reload nginx

- name: Update site config
  template:
    src: site.conf.j2
    dest: /etc/nginx/sites-enabled/default
  notify: reload nginx

# handlers/main.yml
- name: reload nginx
  service:
    name: nginx
    state: reloaded

Both tasks notify the handler, but nginx reloads once.
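
When a later task depends on the reload having already happened, pending handlers can be run early with meta: flush_handlers. A minimal sketch; the health-check URL is an assumption:

```yaml
- name: Update nginx config
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: reload nginx

- name: Run pending handlers now instead of at the end of the play
  meta: flush_handlers

- name: Check that nginx answers after the reload
  uri:
    url: http://localhost/     # assumed health-check URL
    status_code: 200
```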

Conditionals and Loops

When to Use when

- name: Install apt packages
  apt:
    name: "{{ packages }}"
    state: present
  when: ansible_os_family == "Debian"

- name: Install yum packages
  yum:
    name: "{{ packages }}"
    state: present
  when: ansible_os_family == "RedHat"
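
Conditions can also be combined. When when is given a list, every entry must be true. A small sketch, with an illustrative version threshold:

```yaml
- name: Install apt packages on Debian-family hosts, major version 11 or newer
  apt:
    name: "{{ packages }}"
    state: present
  when:
    - ansible_os_family == "Debian"
    - ansible_distribution_major_version | int >= 11   # threshold chosen for illustration
```

The int filter matters here because the version fact is a string.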

Loops That Don’t Suck

# Simple loop
- name: Create users
  user:
    name: "{{ item }}"
    state: present
  loop:
    - alice
    - bob
    - charlie

# Loop with dict
- name: Create users with groups
  user:
    name: "{{ item.name }}"
    groups: "{{ item.groups }}"
  loop:
    - { name: alice, groups: admin }
    - { name: bob, groups: developers }
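
In a role, the list usually lives in a variable rather than in the task, and loop_control keeps the output readable. A sketch, assuming a hypothetical app_users variable:

```yaml
# defaults/main.yml (hypothetical variable)
app_users:
  - { name: alice, groups: admin }
  - { name: bob, groups: developers }
---
# tasks/main.yml
- name: Create users with groups
  user:
    name: "{{ item.name }}"
    groups: "{{ item.groups }}"
  loop: "{{ app_users }}"
  loop_control:
    label: "{{ item.name }}"   # print only the name per iteration, not the whole dict
```

Keeping the list in defaults lets each environment override it without touching the task.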

Secrets Management

Never commit plain-text secrets. Use Ansible Vault:

# Create encrypted file
ansible-vault create inventory/production/group_vars/vault.yml

# Edit later
ansible-vault edit inventory/production/group_vars/vault.yml

# Run playbook
ansible-playbook site.yml --ask-vault-pass

Better: use a vault password file (not in git):

echo "supersecret" > ~/.vault_pass
chmod 600 ~/.vault_pass
ansible-playbook site.yml --vault-password-file ~/.vault_pass

Reference vault variables with a prefix:

# vault.yml (encrypted)
vault_db_password: "hunter2"

# group_vars/databases.yml (plain)
db_password: "{{ vault_db_password }}"
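
Encrypting the variable is only half the job, since task output and diffs can still leak it. One common precaution is no_log on tasks that handle secrets. A sketch, assuming a hypothetical template and destination path:

```yaml
- name: Write database credentials file
  template:
    src: db.conf.j2            # hypothetical template that renders db_password
    dest: /etc/app/db.conf     # hypothetical path
    owner: app
    group: app
    mode: '0600'
  no_log: true                 # keep the task's input and output out of logs
```

The trade-off is harder debugging, because a failure in this task shows no details either.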

Block and Rescue: Error Handling

- name: Deploy application
  block:
    - name: Pull latest code
      git:
        repo: "{{ app_repo }}"
        dest: /opt/app
        version: "{{ app_version }}"
    
    - name: Install dependencies
      pip:
        requirements: /opt/app/requirements.txt
    
    - name: Run migrations
      command: python manage.py migrate
      args:
        chdir: /opt/app

  rescue:
    - name: Rollback to previous version
      git:
        repo: "{{ app_repo }}"
        dest: /opt/app
        version: "{{ app_previous_version }}"
    
    - name: Alert on failure
      slack:
        token: "{{ slack_token }}"
        channel: "#deploys"
        msg: "Deploy failed on {{ inventory_hostname }}"

  always:
    - name: Ensure app is running
      service:
        name: app
        state: started
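
One thing to keep in mind: if every rescue task succeeds, Ansible treats the host as recovered and the play carries on. When a failed deploy should still count as a failure, the rescue section can end with an explicit fail. A minimal sketch that also reports the failing task:

```yaml
- name: Deploy with explicit failure reporting
  block:
    - name: Run migrations
      command:
        cmd: python manage.py migrate
        chdir: /opt/app

  rescue:
    - name: Report which task failed
      debug:
        msg: "Task '{{ ansible_failed_task.name }}' failed: {{ ansible_failed_result.msg | default('no message') }}"

    - name: Mark the host as failed after cleanup
      fail:
        msg: "Deploy failed on {{ inventory_hostname }}"
```

ansible_failed_task and ansible_failed_result are only set inside rescue.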

Tags for Targeted Runs

- name: Install packages
  apt:
    name: nginx
  tags: [packages, nginx]

- name: Configure nginx
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  tags: [config, nginx]

# Run only config tasks
ansible-playbook site.yml --tags config

# Skip slow tasks
ansible-playbook site.yml --skip-tags packages
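
Two special tags are worth knowing: never keeps a task out of normal runs, and always keeps it in targeted ones. A sketch with illustrative tasks; the debug tag name is an assumption:

```yaml
- name: Dump the full nginx configuration for debugging
  command: nginx -T
  changed_when: false
  tags: [never, debug]     # runs only when requested, e.g. --tags debug

- name: Print the target host
  debug:
    msg: "Running on {{ inventory_hostname }}"
  tags: [always]           # runs even with --tags config, unless --skip-tags always
```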

Testing with Check Mode

# Dry run - shows what would change
ansible-playbook site.yml --check --diff

Some tasks need special handling:

- name: Get current version
  command: app --version
  register: app_version_result  # distinct name, so it cannot shadow an app_version variable
  check_mode: false  # Always runs, even in check mode
  changed_when: false  # Never reports as changed
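
The opposite case also comes up: a task that cannot succeed in a dry run because an earlier step was only simulated. The ansible_check_mode variable lets you skip it. A sketch, assuming a service named app that an earlier task would have installed:

```yaml
- name: Start app service
  service:
    name: app
    state: started
  when: not ansible_check_mode   # the service may not exist yet during a dry run
```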

Performance: Don’t Wait Forever

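# ansible.cfg
# Inline comments must start with ";" here, since "#" is only a comment at the start of a line
[defaults]
forks = 20              ; Parallel hosts
gathering = smart       ; Gather facts only when none are cached
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts

[ssh_connection]
pipelining = True       ; Fewer SSH operations per task
control_path = /tmp/ansible-%%h-%%r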
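
Fact gathering itself is often a noticeable share of the run time. For plays that use no facts at all, it can be switched off per play. A sketch, assuming a hypothetical static file that needs no templating:

```yaml
- name: Push a static file without gathering facts
  hosts: webservers
  gather_facts: false            # skip the setup step for this play
  tasks:
    - name: Copy maintenance page
      copy:
        src: maintenance.html    # hypothetical file in the role or playbook files/ dir
        dest: /var/www/html/maintenance.html
        mode: '0644'
```

This only suits plays where no task or template references ansible_* facts.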

For large inventories, target a subset of hosts with --limit and roll out in batches with serial:

- hosts: webservers
  serial: "25%"  # Deploy to 25% at a time
  tasks:
    - include_role:
        name: app
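
Batching pairs well with a failure budget, so a bad release stops before it reaches every host. A sketch using max_fail_percentage; the 10% threshold is illustrative:

```yaml
- name: Rolling deploy with a failure budget
  hosts: webservers
  serial: "25%"
  max_fail_percentage: 10   # abort the play if more than 10% of a batch fails
  tasks:
    - name: Deploy the app role
      include_role:
        name: app
```

Combined with --limit, this allows a cautious first rollout to a handful of hosts before going wider.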

The Golden Rules

  1. Roles for everything — even “simple” tasks grow
  2. Variables in the right place — defaults < group_vars < host_vars
  3. Idempotent always — safe to run twice
  4. Vault for secrets — never plain text
  5. Tags for speed — targeted runs save time
  6. Check mode first — verify before applying

Ansible’s power is in its simplicity. Keep playbooks readable, keep roles focused, and automation becomes maintainable instead of a liability.