Ansible playbooks get messy fast. Roles fix this by packaging related tasks, variables, and handlers into reusable units. But bad roles are worse than no roles — they hide complexity without managing it.
Here’s how to design roles that actually help.
The Basic Structure

```
roles/
  my_role/
    tasks/
      main.yml      # Entry point
    handlers/
      main.yml      # Event-triggered tasks
    defaults/
      main.yml      # Default variables (lowest precedence)
    vars/
      main.yml      # Role variables (higher precedence)
    templates/
      config.j2     # Jinja2 templates
    files/
      script.sh     # Static files
    meta/
      main.yml      # Role metadata and dependencies
```

Only create directories you need. An empty handlers/ is noise.
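To make the layout concrete, here is a minimal sketch of a tasks/main.yml that uses the templates/ and files/ directories. The file names config.j2 and script.sh come from the layout above; the destination paths and file modes are assumptions for illustration only.

```yaml
# tasks/main.yml (illustrative sketch; destination paths are hypothetical)
---
- name: Render configuration from templates/config.j2
  ansible.builtin.template:
    src: config.j2                    # resolved relative to the role's templates/
    dest: /etc/my_role.conf           # hypothetical destination
    mode: "0644"

- name: Install helper script from files/script.sh
  ansible.builtin.copy:
    src: script.sh                    # resolved relative to the role's files/
    dest: /usr/local/bin/my_role.sh   # hypothetical destination
    mode: "0755"
```

Inside a role, src needs no directory prefix: Ansible searches the role's own templates/ and files/ first.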
Pattern 1: Validate Before Acting

Fail fast with clear error messages:

```yaml
# tasks/main.yml
- name: Validate required variables
  ansible.builtin.assert:
    that:
      - app_name is defined
      - app_name | length > 0
      - app_port is defined
      - app_port | int > 0
      - app_port | int < 65536
    fail_msg: |
      Missing or invalid required variables:
      - app_name: {{ app_name | default('UNDEFINED') }}
      - app_port: {{ app_port | default('UNDEFINED') }}
    success_msg: "All required variables validated"

# 'environment' is a reserved Ansible keyword, so the variable is prefixed
- name: Validate environment
  ansible.builtin.assert:
    that:
      - app_environment in ['dev', 'staging', 'prod']
    fail_msg: "app_environment must be one of: dev, staging, prod"
```
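Since ansible-core 2.11, a role can also declare its inputs in meta/argument_specs.yml, and Ansible checks them before the role's tasks run. The following is a minimal sketch under that assumption; the descriptions are illustrative, and app_environment is a hypothetical variable name for the environment setting.

```yaml
# meta/argument_specs.yml (sketch; descriptions are illustrative)
---
argument_specs:
  main:                        # spec for the tasks/main.yml entry point
    short_description: Deploy and configure the application
    options:
      app_name:
        type: str
        required: true
        description: Name of the application to deploy
      app_port:
        type: int
        default: 8080
        description: TCP port the application listens on
      app_environment:         # hypothetical variable name
        type: str
        choices: [dev, staging, prod]
        description: Target environment
```

Argument specs cover presence, type, and allowed choices. Range checks such as the port bounds still need an assert task.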
Pattern 2: Sensible Defaults with Override Points

```yaml
# defaults/main.yml
---
# Application settings
app_name: ""                                  # REQUIRED - no default
app_port: 8080                                # Optional - sensible default
app_workers: "{{ ansible_processor_vcpus }}"  # Dynamic default

# Feature flags
app_enable_ssl: true
app_enable_metrics: true

# Paths (overridable but rarely needed)
app_base_dir: "/opt/{{ app_name }}"
app_config_dir: "{{ app_base_dir }}/config"
app_log_dir: "/var/log/{{ app_name }}"

# Timeouts
app_start_timeout: 60
app_health_check_interval: 10
```
Document what’s required vs optional. Use dynamic defaults where sensible.
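As a sketch of what the override points look like from the caller's side, here is a hypothetical playbook that sets the required variable and overrides two defaults. The playbook name, host group, role name, and values are placeholders, not part of the role itself.

```yaml
# site.yml (hypothetical playbook; host group and values are placeholders)
---
- name: Deploy billing service
  hosts: app_servers
  become: true
  roles:
    - role: my_role
      vars:
        app_name: billing           # required, has no usable default
        app_port: 9090              # overrides the 8080 default
        app_enable_metrics: false   # turns off an optional feature
```

Because app_base_dir and app_log_dir are templated from app_name, they resolve to /opt/billing and /var/log/billing without being set explicitly.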
Pattern 3: Idempotent Tasks

Every task should be safe to run multiple times:

```yaml
# ❌ Not idempotent - appends every run
- name: Add config line
  ansible.builtin.shell: echo "setting=value" >> /etc/app.conf

# ✅ Idempotent - only adds if missing
- name: Add config line
  ansible.builtin.lineinfile:
    path: /etc/app.conf
    line: "setting=value"
    state: present

# ❌ Not idempotent - always runs
- name: Initialize database
  ansible.builtin.command: /opt/app/init-db.sh

# ✅ Idempotent - checks first
- name: Check if database initialized
  ansible.builtin.stat:
    path: /opt/app/.db_initialized
  register: db_init_flag

- name: Initialize database
  ansible.builtin.command: /opt/app/init-db.sh
  when: not db_init_flag.stat.exists

- name: Mark database initialized
  ansible.builtin.file:
    path: /opt/app/.db_initialized
    state: touch
  when: not db_init_flag.stat.exists
```
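For commands that leave a file behind, the command module's creates argument can stand in for the stat-and-flag sequence. This is a sketch that assumes, hypothetically, that init-db.sh itself writes /opt/app/.db_initialized when it succeeds.

```yaml
# Sketch: assumes init-db.sh creates the marker file itself on success
- name: Initialize database
  ansible.builtin.command:
    cmd: /opt/app/init-db.sh
    creates: /opt/app/.db_initialized   # task is skipped when this file exists
```

If the script does not create a marker of its own, the three-task version is the safer choice.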
Pattern 4: Handler Chains

Handlers run once at the end of the play, in the order they are defined rather than the order they are notified:

```yaml
# handlers/main.yml
---
- name: Validate config
  ansible.builtin.command: /opt/app/validate-config.sh
  listen: "config changed"

- name: Reload app
  ansible.builtin.systemd:
    name: "{{ app_name }}"
    state: reloaded
  listen: "config changed"

- name: Verify app health
  ansible.builtin.uri:
    url: "http://localhost:{{ app_port }}/health"
    status_code: 200
  register: app_health
  until: app_health.status == 200   # older Ansible releases ignore retries without until
  retries: 5
  delay: 2
  listen: "config changed"
```

```yaml
# tasks/main.yml
- name: Deploy configuration
  ansible.builtin.template:
    src: app.conf.j2
    dest: "{{ app_config_dir }}/app.conf"
  notify: "config changed"  # Triggers all handlers listening
```
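When a later task in the same play depends on the reloaded service, the end of the play is too late. A sketch of running pending handlers early with meta: flush_handlers follows; the smoke-test task and its /status path are hypothetical.

```yaml
# tasks/main.yml (sketch)
- name: Deploy configuration
  ansible.builtin.template:
    src: app.conf.j2
    dest: "{{ app_config_dir }}/app.conf"
  notify: "config changed"

- name: Run notified handlers now instead of at the end of the play
  ansible.builtin.meta: flush_handlers

- name: Run smoke test against the reloaded app   # hypothetical follow-up task
  ansible.builtin.uri:
    url: "http://localhost:{{ app_port }}/status"
    status_code: 200
```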
Pattern 5: Conditional Includes

Split complex roles into focused task files:

```yaml
# tasks/main.yml
---
- name: Include OS-specific tasks
  ansible.builtin.include_tasks: "{{ ansible_os_family | lower }}.yml"

- name: Include installation tasks
  ansible.builtin.include_tasks: install.yml
  when: app_state == 'present'

- name: Include removal tasks
  ansible.builtin.include_tasks: remove.yml
  when: app_state == 'absent'

- name: Include SSL configuration
  ansible.builtin.include_tasks: ssl.yml
  when: app_enable_ssl | bool
```

```yaml
# tasks/debian.yml
---
- name: Install dependencies (Debian)
  ansible.builtin.apt:
    name: "{{ app_packages_debian }}"
    state: present
    update_cache: true
```
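An include built from ansible_os_family fails on any OS family that has no matching task file. One way to handle that is a fallback file chosen with the first_found lookup. This sketch assumes a hypothetical tasks/default.yml exists in the role; os_task_files is an illustrative variable name.

```yaml
# tasks/main.yml (sketch; default.yml is a hypothetical fallback file)
- name: Include OS-specific tasks, falling back to default.yml
  ansible.builtin.include_tasks: "{{ lookup('ansible.builtin.first_found', os_task_files) }}"
  vars:
    os_task_files:
      files:
        - "{{ ansible_os_family | lower }}.yml"   # e.g. debian.yml
        - default.yml                             # used when no OS file exists
      paths:
        - "{{ role_path }}/tasks"
```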
Pattern 6: Role Dependencies

Declare dependencies in meta:

```yaml
# meta/main.yml
---
dependencies:
  - role: common
    vars:
      common_timezone: UTC
  - role: firewall
    vars:
      firewall_allowed_ports:
        - "{{ app_port }}/tcp"
    when: app_configure_firewall | default(true)
```

That said, prefer explicit role includes in playbooks over implicit dependencies: they are easier to debug.
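A sketch of the explicit alternative, with the same two supporting roles pulled in from the playbook instead of meta/main.yml. The playbook name, host group, my_role name, and port value are placeholders.

```yaml
# site.yml (sketch; host group, role name, and port value are placeholders)
---
- name: Deploy application with explicit role ordering
  hosts: app_servers
  become: true
  vars:
    app_port: 8080
  tasks:
    - name: Apply baseline configuration
      ansible.builtin.include_role:
        name: common
      vars:
        common_timezone: UTC

    - name: Open the application port
      ansible.builtin.include_role:
        name: firewall
      vars:
        firewall_allowed_ports:
          - "{{ app_port }}/tcp"
      when: app_configure_firewall | default(true)

    - name: Deploy the application
      ansible.builtin.include_role:
        name: my_role
```

The execution order is visible in one file, and each include shows up as its own step in the play output.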
Pattern 7: Output Facts for Chaining

Set facts that other roles or tasks can use:

```yaml
# tasks/main.yml
- name: Get application version
  ansible.builtin.command: /opt/app/bin/app --version
  register: app_version_output
  changed_when: false

- name: Set application facts
  ansible.builtin.set_fact:
    app_installed: true
    app_version: "{{ app_version_output.stdout | regex_search('\\d+\\.\\d+\\.\\d+') }}"
    app_endpoint: "http://{{ ansible_default_ipv4.address }}:{{ app_port }}"

- name: Export facts for other roles
  ansible.builtin.set_fact:
    my_role_result:
      installed: "{{ app_installed }}"
      version: "{{ app_version }}"
      endpoint: "{{ app_endpoint }}"
```
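On the consuming side, a later task or role can read the exported dictionary directly. A minimal sketch follows; the host name app01 in the second task is a hypothetical inventory entry.

```yaml
# Later in the same playbook run (sketch)
- name: Show what was deployed on this host
  ansible.builtin.debug:
    msg: "Version {{ my_role_result.version }} is available at {{ my_role_result.endpoint }}"
  when: my_role_result.installed | bool

- name: Read the endpoint recorded for another host
  ansible.builtin.debug:
    msg: "Backend: {{ hostvars['app01'].my_role_result.endpoint }}"   # 'app01' is hypothetical
```

Facts set with set_fact are per host, so the second form only works after the role has already run against that host.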
Pattern 8: Tags for Selective Execution

```yaml
# tasks/main.yml
# import_tasks is static, so each tag is inherited by every task in the file.
# A tag on include_tasks would apply only to the include statement itself.
- name: Install application
  ansible.builtin.import_tasks: install.yml
  tags:
    - install
    - never  # Only runs when explicitly tagged

- name: Configure application
  ansible.builtin.import_tasks: configure.yml
  tags:
    - configure

- name: Deploy application
  ansible.builtin.import_tasks: deploy.yml
  tags:
    - deploy

- name: Verify application
  ansible.builtin.import_tasks: verify.yml
  tags:
    - verify
    - always  # Runs unless --skip-tags verify
```
Run specific parts: ansible-playbook site.yml --tags configure,verify
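A tag placed on include_tasks applies only to the include statement, not to the tasks it loads. When a file has to stay a dynamic include, the apply keyword can pass the tag down. A sketch reusing the configure.yml file name:

```yaml
# tasks/main.yml (sketch)
- name: Configure application
  ansible.builtin.include_tasks:
    file: configure.yml
    apply:
      tags:
        - configure      # inherited by every task inside configure.yml
  tags:
    - configure          # selects the include statement itself
```

Both tag entries are needed: the outer one makes the include run under --tags configure, and the applied one keeps the included tasks from being filtered out.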
Pattern 9: Block Error Handling

```yaml
- name: Deploy with rollback capability
  block:
    - name: Backup current version
      ansible.builtin.copy:
        src: "{{ app_base_dir }}/"          # trailing slash copies the directory contents
        dest: "{{ app_base_dir }}.backup/"
        remote_src: true

    - name: Deploy new version
      ansible.builtin.unarchive:
        src: "{{ app_artifact }}"
        dest: "{{ app_base_dir }}"

    - name: Verify deployment
      ansible.builtin.uri:
        url: "http://localhost:{{ app_port }}/health"
        status_code: 200
      register: deploy_health
      until: deploy_health.status == 200    # older Ansible releases ignore retries without until
      retries: 3
      delay: 5

  rescue:
    - name: Restore from backup
      ansible.builtin.copy:
        src: "{{ app_base_dir }}.backup/"
        dest: "{{ app_base_dir }}/"
        remote_src: true

    - name: Restart with old version
      ansible.builtin.systemd:
        name: "{{ app_name }}"
        state: restarted

    - name: Fail with message
      ansible.builtin.fail:
        msg: "Deployment failed and was rolled back"

  always:
    - name: Clean up backup
      ansible.builtin.file:
        path: "{{ app_base_dir }}.backup"
        state: absent
```
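Inside rescue, Ansible exposes ansible_failed_task and ansible_failed_result, which can make the failure message more specific than a fixed string. A standalone sketch; the failing command is only a placeholder for a real deployment step.

```yaml
- name: Report which step failed
  block:
    - name: Step that may fail          # placeholder for a real deployment step
      ansible.builtin.command: /bin/false

  rescue:
    - name: Fail with details from the failed task
      ansible.builtin.fail:
        msg: >-
          Task '{{ ansible_failed_task.name }}' failed:
          {{ ansible_failed_result.msg | default('no error message') }}
```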
Anti-Patterns to Avoid

1. God Roles

```yaml
# ❌ One role doing everything
- role: application  # 500 lines, handles install, config, deploy, monitor
```
Split into focused roles: app_install, app_configure, app_deploy.
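A sketch of how the split roles might be composed in a play. The role names are the ones suggested above; the playbook name and host group are placeholders.

```yaml
# site.yml (sketch; host group is a placeholder)
---
- name: Install, configure, and deploy the application
  hosts: app_servers
  become: true
  roles:
    - role: app_install
    - role: app_configure
    - role: app_deploy
```

A play that only ships a new release can then list app_deploy on its own, which a single 500-line role does not allow.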
2. Hardcoded Values
```yaml
# ❌ Hardcoded
- name: Create user
  ansible.builtin.user:
    name: appuser
    uid: 1001

# ✅ Parameterized
- name: Create user
  ansible.builtin.user:
    name: "{{ app_user }}"
    uid: "{{ app_user_uid | default(omit) }}"
```
3. Shell When Modules Exist
```yaml
# ❌ Shell for everything
- name: Install package
  ansible.builtin.shell: apt-get install -y nginx

# ✅ Use the module
- name: Install package
  ansible.builtin.apt:
    name: nginx
    state: present
```
4. Missing changed_when
```yaml
# ❌ Always shows changed
- name: Check status
  ansible.builtin.command: systemctl status app

# ✅ Accurate reporting
- name: Check status
  ansible.builtin.command: systemctl status app
  register: status_result
  changed_when: false
  failed_when: status_result.rc not in [0, 3]
```
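For commands that sometimes change state, changed_when can be derived from the command's output instead of being fixed to false. A sketch only: the migrate subcommand and the output text it is matched against are both assumptions.

```yaml
- name: Run database migrations
  ansible.builtin.command: /opt/app/bin/app migrate   # hypothetical subcommand
  register: migrate_result
  # Assumed output text; report "changed" only when something was applied
  changed_when: "'No migrations to apply' not in migrate_result.stdout"
```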
Testing Roles

Use Molecule for role testing:

```yaml
# molecule/default/molecule.yml
---
driver:
  name: docker
platforms:
  - name: ubuntu
    image: ubuntu:22.04
  - name: rocky
    image: rockylinux:9
provisioner:
  name: ansible
verifier:
  name: ansible
```

```yaml
# molecule/default/verify.yml
---
- name: Verify
  hosts: all
  tasks:
    - name: Check service is running
      ansible.builtin.service_facts:

    - name: Assert service is running
      ansible.builtin.assert:
        that:
          - services['myapp.service'].state == 'running'
```
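Molecule applies the role through a converge playbook before the verifier runs. A minimal sketch of that file; the role name my_role and the variable values are placeholders chosen to match the myapp service checked in verify.yml.

```yaml
# molecule/default/converge.yml (sketch; role name and values are placeholders)
---
- name: Converge
  hosts: all
  roles:
    - role: my_role
      vars:
        app_name: myapp
        app_port: 8080
```

Running molecule test converges the role, repeats the run as an idempotence check, and then executes the verifier.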
The Checklist

1. Validate inputs: fail fast with clear messages
2. Use defaults: make common cases easy
3. Be idempotent: safe to run repeatedly
4. Document variables: what's required, what's optional
5. Use handlers: for restart/reload cascades
6. Tag appropriately: enable selective execution
7. Set output facts: for role chaining
8. Test with Molecule: before it hits production

Good roles disappear into the background. You use them without thinking about them. That's the goal.