Ansible’s learning curve is gentle until it isn’t. Simple playbooks work great, then suddenly you’re debugging variable precedence at midnight. Here are patterns that keep automation maintainable as it scales.
Directory Structure That Scales

Forget the flat playbook approach. Use roles from day one:
    ansible/
    ├── inventory/
    │   ├── production/
    │   │   ├── hosts.yml
    │   │   └── group_vars/
    │   │       ├── all.yml
    │   │       ├── webservers.yml
    │   │       └── databases.yml
    │   └── staging/
    │       └── ...
    ├── roles/
    │   ├── common/
    │   ├── nginx/
    │   ├── postgresql/
    │   └── app/
    ├── playbooks/
    │   ├── site.yml
    │   ├── webservers.yml
    │   └── databases.yml
    └── ansible.cfg

Each role follows the standard structure:
    roles/nginx/
    ├── tasks/
    │   └── main.yml
    ├── handlers/
    │   └── main.yml
    ├── templates/
    │   └── nginx.conf.j2
    ├── defaults/
    │   └── main.yml    # Low-priority defaults
    ├── vars/
    │   └── main.yml    # High-priority variables
    └── meta/
        └── main.yml    # Dependencies

Variable Precedence (The Short Version)

Ansible has 22 levels of variable precedence. You don’t need to memorize them. Just follow this order, from lowest precedence to highest:
1. defaults/main.yml: role defaults, easily overridden
2. group_vars/: environment- or group-specific values
3. host_vars/: single-host overrides
4. vars/main.yml: role internals, rarely overridden
5. Extra vars (-e): the nuclear option, overrides everything

    # roles/nginx/defaults/main.yml
    nginx_worker_processes: auto
    nginx_worker_connections: 1024

    # inventory/production/group_vars/webservers.yml
    nginx_worker_connections: 4096  # Override for prod
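When you are unsure which value wins on a given host, a throwaway play that prints the resolved variable is often quicker than reasoning through the precedence list. The sketch below is an illustration, not part of the layout above: the play name is made up, and it assumes the `webservers` group from the inventory.

```yaml
# Hypothetical check play: prints the value each web server ends up with
- name: Show resolved nginx settings
  hosts: webservers
  gather_facts: false
  tasks:
    - name: Print the effective worker_connections value
      ansible.builtin.debug:
        var: nginx_worker_connections
```

Role defaults are only loaded once the role is part of the play, so this play shows inventory-level values (group_vars and host_vars). Put the same debug task inside the role's tasks if you also want to see the defaults.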
Idempotency: The Core Principle

Every task should be safe to run twice. If it’s not, you’re doing it wrong.

    # Bad - runs every time
    - name: Add line to config
      shell: echo "option=value" >> /etc/app.conf

    # Good - only changes if needed
    - name: Ensure option in config
      lineinfile:
        path: /etc/app.conf
        line: "option=value"
        state: present
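Sometimes no module fits and you have to run a command. The `creates` argument of the command module is one way to keep such a task idempotent: Ansible skips the command when the named file already exists. A minimal sketch, assuming a hypothetical init script that writes a marker file when it finishes (both paths are made up for illustration):

```yaml
# Hypothetical script and marker file, for illustration only
- name: Initialise the application database once
  ansible.builtin.command:
    cmd: /opt/app/bin/init-db
    creates: /var/lib/app/.initialized   # task is skipped when this file exists
```

This only works if the command really creates the marker. If it does not, the task runs on every pass, just like the shell example above.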
For complex file changes, use templates:
    - name: Configure application
      template:
        src: app.conf.j2
        dest: /etc/app.conf
        owner: app
        group: app
        mode: '0644'
      notify: restart app
Handlers: Don’t Restart Unnecessarily

Handlers run once, at the end of the play, even if they are notified several times:

    # tasks/main.yml
    - name: Update nginx config
      template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: reload nginx

    - name: Update site config
      template:
        src: site.conf.j2
        dest: /etc/nginx/sites-enabled/default
      notify: reload nginx

    # handlers/main.yml
    - name: reload nginx
      service:
        name: nginx
        state: reloaded
Both tasks notify the handler, but nginx reloads once.
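Occasionally a later task needs the reload to have happened already, for example a smoke test. The `meta: flush_handlers` task runs all pending handlers at that point instead of waiting for the end of the play. A sketch, where the health-check URL is an assumption:

```yaml
- name: Update nginx config
  ansible.builtin.template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: reload nginx

- name: Run pending handlers now instead of at the end of the play
  ansible.builtin.meta: flush_handlers

- name: Check that nginx answers after the reload
  ansible.builtin.uri:
    url: http://localhost/   # assumed endpoint, adjust to your site
    status_code: 200
```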
Conditionals and Loops

When to Use when

    - name: Install apt packages
      apt:
        name: "{{ packages }}"
        state: present
      when: ansible_os_family == "Debian"

    - name: Install yum packages
      yum:
        name: "{{ packages }}"
        state: present
      when: ansible_os_family == "RedHat"
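When a task depends on more than one condition, `when` also accepts a list, and every entry must be true for the task to run. A sketch that narrows the apt task to a specific distribution and release; the version threshold is an arbitrary assumption:

```yaml
- name: Install apt packages on Debian 11 or newer
  ansible.builtin.apt:
    name: "{{ packages }}"
    state: present
  when:
    - ansible_os_family == "Debian"
    - ansible_distribution == "Debian"                  # excludes Ubuntu and other derivatives
    - ansible_distribution_major_version | int >= 11    # assumed minimum release
```

The `int` filter matters here: the version fact is a string, and comparing it to a number without the cast would not do what you expect.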
Loops That Don’t Suck

    # Simple loop
    - name: Create users
      user:
        name: "{{ item }}"
        state: present
      loop:
        - alice
        - bob
        - charlie

    # Loop with dict
    - name: Create users with groups
      user:
        name: "{{ item.name }}"
        groups: "{{ item.groups }}"
      loop:
        - { name: alice, groups: admin }
        - { name: bob, groups: developers }
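Once the list grows, it usually reads better as a variable, and `loop_control.label` keeps the output short by printing one field instead of the whole dict. A sketch, assuming a hypothetical `app_users` variable (defined inline here so the play is self-contained; in practice it would live in group_vars):

```yaml
- name: Manage user accounts
  hosts: all
  become: true
  vars:
    app_users:             # hypothetical variable name
      - { name: alice, groups: admin }
      - { name: bob, groups: developers }
  tasks:
    - name: Create users with groups
      ansible.builtin.user:
        name: "{{ item.name }}"
        groups: "{{ item.groups }}"
        append: true       # add to the listed groups without removing others
      loop: "{{ app_users }}"
      loop_control:
        label: "{{ item.name }}"   # log only the name, not the whole dict
```

Note that `append: true` is a choice made for this sketch. Without it, the user module replaces the user's supplementary groups with the ones listed.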
Secrets Management

Never commit secrets in plain text. Use Ansible Vault:

    # Create encrypted file
    ansible-vault create inventory/production/group_vars/vault.yml

    # Edit later
    ansible-vault edit inventory/production/group_vars/vault.yml

    # Run playbook
    ansible-playbook site.yml --ask-vault-pass
Better: use a vault password file (not in git):
    echo "supersecret" > ~/.vault_pass
    chmod 600 ~/.vault_pass
    ansible-playbook site.yml --vault-password-file ~/.vault_pass
Reference vault variables with a prefix:
    # vault.yml (encrypted)
    vault_db_password: "hunter2"

    # group_vars/databases.yml (plain)
    db_password: "{{ vault_db_password }}"
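Encrypting the variable protects it at rest, but the decrypted value can still show up in task output. Setting `no_log: true` on tasks that handle the secret keeps it out of the logs. A sketch, where the template name and destination path are hypothetical:

```yaml
- name: Render database credentials file
  ansible.builtin.template:
    src: db.conf.j2          # hypothetical template that references db_password
    dest: /etc/app/db.conf   # hypothetical path
    owner: app
    group: app
    mode: '0600'
  no_log: true               # censor the task result, including --diff output
```

The trade-off is that a failing `no_log` task is harder to debug, so it tends to be worth limiting to the few tasks that actually touch secrets.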
Block and Rescue: Error Handling

    - name: Deploy application
      block:
        - name: Pull latest code
          git:
            repo: "{{ app_repo }}"
            dest: /opt/app
            version: "{{ app_version }}"

        - name: Install dependencies
          pip:
            requirements: /opt/app/requirements.txt

        - name: Run migrations
          command: python manage.py migrate
          args:
            chdir: /opt/app

      rescue:
        - name: Rollback to previous version
          git:
            repo: "{{ app_repo }}"
            dest: /opt/app
            version: "{{ app_previous_version }}"

        - name: Alert on failure
          slack:
            token: "{{ slack_token }}"
            channel: "#deploys"
            msg: "Deploy failed on {{ inventory_hostname }}"

      always:
        - name: Ensure app is running
          service:
            name: app
            state: started
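One thing to keep in mind: when every task in `rescue` succeeds, Ansible treats the failure as handled and the play carries on as if nothing went wrong. If a rolled-back deploy should still count as a failed run, end the rescue section with an explicit `fail`. Inside `rescue`, the variables `ansible_failed_task` and `ansible_failed_result` describe what broke. A trimmed-down sketch of the same idea:

```yaml
- name: Deploy with an explicit failure after rollback
  block:
    - name: Run migrations
      ansible.builtin.command:
        cmd: python manage.py migrate
        chdir: /opt/app
  rescue:
    - name: Report which task failed
      ansible.builtin.debug:
        msg: >-
          Task '{{ ansible_failed_task.name }}' failed:
          {{ ansible_failed_result.msg | default('no message') }}

    - name: Mark the host as failed once cleanup is done
      ansible.builtin.fail:
        msg: "Deploy failed on {{ inventory_hostname }}; rollback attempted"
```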
Tag tasks so you can run or skip a subset of a play:

    - name: Install packages
      apt:
        name: nginx
      tags: [packages, nginx]

    - name: Configure nginx
      template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      tags: [config, nginx]
Select or skip tags on the command line:

    # Run only config tasks
    ansible-playbook site.yml --tags config

    # Skip slow tasks
    ansible-playbook site.yml --skip-tags packages
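Two built-in tags change the default behavior and pair well with targeted runs: `always` makes a task run no matter which tags were requested, and `never` keeps a task out of normal runs until one of its other tags is asked for. A sketch, where the cache path and the `purge` tag name are made up for illustration:

```yaml
- name: Collect service state
  ansible.builtin.service_facts:
  tags: [always]            # still runs with --tags config

- name: Purge the application cache
  ansible.builtin.file:
    path: /var/cache/app    # hypothetical path
    state: absent
  tags: [never, purge]      # runs only with --tags purge
```

A task tagged `always` can still be excluded explicitly with `--skip-tags always`.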
Testing with Check Mode

    # Dry run - shows what would change
    ansible-playbook site.yml --check --diff
Some tasks need special handling:
    - name: Get current version
      command: app --version
      register: app_version
      check_mode: false      # Always runs, even in check mode
      changed_when: false    # Never reports as changed
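The opposite case comes up too: a task that only makes sense after a real run. The command module is skipped in check mode, so anything that consumes its registered output has nothing to work with. The magic variable `ansible_check_mode` lets you guard those follow-up tasks. A sketch, reusing the migration command from the deploy example:

```yaml
- name: Run database migrations
  ansible.builtin.command:
    cmd: python manage.py migrate
    chdir: /opt/app
  register: migrate_result

- name: Show migration output (skipped in dry runs, where there is none)
  ansible.builtin.debug:
    var: migrate_result.stdout_lines
  when: not ansible_check_mode
```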
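A few ansible.cfg settings speed up runs against many hosts. The comments sit on their own lines because ansible.cfg only accepts an inline comment when it starts with a semicolon:

    # ansible.cfg
    [defaults]
    # Parallel hosts
    forks = 20
    # Cache facts
    gathering = smart
    fact_caching = jsonfile
    fact_caching_connection = /tmp/ansible_facts

    [ssh_connection]
    # Fewer SSH connections
    pipelining = True
    control_path = /tmp/ansible-%%h-%%r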
For large inventories, use --limit and batching:
    - hosts: webservers
      serial: "25%"  # Deploy to 25% at a time
      tasks:
        - include_role:
            name: app
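`serial` also accepts a list, which makes a canary-style rollout possible, and `max_fail_percentage` stops the play when too many hosts in a batch fail. A sketch of the same play with both; the batch sizes and the 10% threshold are assumptions to tune for your fleet:

```yaml
- hosts: webservers
  serial:
    - 1          # canary: one host first
    - "25%"      # then a quarter of the group per batch
  max_fail_percentage: 10   # assumed threshold: abort when more than 10% of a batch fails
  tasks:
    - name: Apply the app role
      ansible.builtin.include_role:
        name: app
```

With a one-host first batch, a failed canary is 100% of that batch, so the rollout stops before it reaches the rest of the group.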
The Golden Rules

1. Roles for everything: even “simple” tasks grow
2. Variables in the right place: defaults < group_vars < host_vars
3. Idempotent always: safe to run twice
4. Vault for secrets: never plain text
5. Tags for speed: targeted runs save time
6. Check mode first: verify before applying

Ansible’s power is in its simplicity. Keep playbooks readable, keep roles focused, and automation becomes maintainable instead of a liability.