YAML looks simple until it isn’t. These gotchas have broken production configs and wasted countless debugging hours. Learn them once, avoid them forever.
The Norway Problem#
| # These are ALL booleans in YAML 1.1
country: NO # false
answer: yes # true
enabled: on # true
disabled: off # false
|
Fix: Always quote strings that could be interpreted as booleans.
| country: "NO"
answer: "yes"
|
YAML 1.2 fixed this, but many parsers (including PyYAML by default) still use 1.1 rules.
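To see the effect in practice, here is a minimal sketch that assumes PyYAML is installed (`pip install pyyaml`). It parses the same key with and without quotes; the variable names are illustrative only.

```python
import yaml  # PyYAML, which resolves scalars with YAML 1.1 rules

# Unquoted NO is resolved as a boolean; the quoted form stays a string.
unquoted = yaml.safe_load("country: NO")
quoted = yaml.safe_load('country: "NO"')

print(unquoted)  # {'country': False}
print(quoted)    # {'country': 'NO'}
```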
Implicit Type Conversion#
| # These aren't strings
version: 1.0 # float: 1.0
port: 8080 # integer: 8080
time: 12:30 # sexagesimal integer in YAML 1.1: 750 (12*60 + 30)
date: 2024-01-15 # date object
hex: 0x1A # integer: 26
octal: 0777 # integer: 511 (octal in YAML 1.1)
|
Fix: Quote values you want as strings.
| version: "1.0"
port: "8080"
time: "12:30"
date: "2024-01-15"
|
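A quick way to check what a parser actually produced is to print the Python type of each value. The sketch below assumes PyYAML; the `quoted_version` key is added here purely for comparison.

```python
import datetime
import yaml

doc = """
version: 1.0
port: 8080
date: 2024-01-15
quoted_version: "1.0"
"""

data = yaml.safe_load(doc)

# Show the type each scalar was resolved to.
for key, value in data.items():
    print(f"{key}: {type(value).__name__} = {value!r}")
```

Only `quoted_version` comes back as `str`; the others are resolved to `float`, `int`, and `date`.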
The Docker Compose Port Trap#
| # WRONG - unquoted 22:22 can be read as the base-60 integer 1342 (22*60 + 22)
ports:
  - 22:22
# RIGHT - quote it
ports:
  - "22:22"
|
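The base-60 arithmetic behind this trap is easy to reproduce. The helper below is a hypothetical illustration of how a YAML 1.1 parser evaluates colon-separated digits. It is not part of any library, and real parsers only apply the rule to scalars that match their integer pattern.

```python
def sexagesimal_to_int(text: str) -> int:
    """Evaluate colon-separated digit groups as a base-60 number."""
    total = 0
    for part in text.split(":"):
        total = total * 60 + int(part)
    return total


print(sexagesimal_to_int("12:30"))  # 750
print(sexagesimal_to_int("22:22"))  # 1342
```

A port mapping written as `22:22` therefore risks reaching the tool as the number 1342 instead of a host:container pair.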
Indentation Chaos#
Spaces vs Tabs#
YAML requires spaces for indentation. A tab used to indent a line is a parse error.
| # BROKEN (tabs)
key:
	value # Tab character = parse error
# CORRECT (spaces)
key:
  value
|
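Because tabs are invisible in most editors, it can help to scan for them before parsing. This is a small standard-library sketch; `find_tab_indented_lines` is a hypothetical helper name, not an existing tool.

```python
def find_tab_indented_lines(text: str) -> list[int]:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Slice off everything after the leading run of spaces and tabs.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad.append(number)
    return bad


sample = "key:\n\tvalue\nother:\n  value\n"
print(find_tab_indented_lines(sample))  # [2]
```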
Inconsistent Indentation#
| # BROKEN - mixed indentation
parent:
  child1: value
   child2: value # Extra space = parse error
# CORRECT
parent:
  child1: value
  child2: value
|
List Indentation#
| # Style 1: List items at same level as key
items:
- one
- two
# Style 2: List items indented
items:
  - one
  - two
# Both valid, but pick one and stick with it
|
Multiline Strings#
Literal Block ( | )#
Preserves newlines exactly:
| script: |
  #!/bin/bash
  echo "Hello"
  echo "World"
|
Result: "#!/bin/bash\necho \"Hello\"\necho \"World\"\n"
Folded Block ( > )#
Folds newlines into spaces:
| description: >
  This is a long description
  that spans multiple lines
  but becomes one paragraph.
|
Result: "This is a long description that spans multiple lines but becomes one paragraph.\n"
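The difference between the two styles is easiest to see by loading both and printing the `repr` of each value. This sketch assumes PyYAML and uses shortened versions of the examples above.

```python
import yaml

doc = (
    "script: |\n"
    "  #!/bin/bash\n"
    '  echo "Hello"\n'
    "description: >\n"
    "  This is a long description\n"
    "  that spans multiple lines.\n"
)

data = yaml.safe_load(doc)

# Literal style keeps the line break; folded style turns it into a space.
print(repr(data["script"]))
print(repr(data["description"]))
```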
Chomping Indicators#
| # Default: single trailing newline
text: |
  content
# Strip all trailing newlines
text: |-
  content
# Keep all trailing newlines
text: |+
  content
|
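The three chomping modes only differ when the block ends with blank lines, so the sketch below (assuming PyYAML) adds one trailing blank line to each document and compares the results.

```python
import yaml

# Each document has one content line followed by one blank line.
clip = yaml.safe_load("text: |\n  content\n\n")["text"]
strip = yaml.safe_load("text: |-\n  content\n\n")["text"]
keep = yaml.safe_load("text: |+\n  content\n\n")["text"]

print(repr(clip))   # 'content\n'
print(repr(strip))  # 'content'
print(repr(keep))   # 'content\n\n'
```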
Indentation Indicator#
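When the first line of a block scalar starts with extra spaces, the parser cannot infer the intended indentation on its own:
| # BROKEN - indentation is auto-detected from the first line (4 spaces),
# so the less-indented second line typically fails to parse
code: |
    indented first line
  second line
# FIXED - the indicator declares a 2-space content indent
code: |2
    indented first line
  second line
|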
Null Values#
| # All of these are null
empty:
nothing: null
tilde: ~
|
Be explicit if you mean an empty string: write empty: "" instead of leaving the value blank.
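The sketch below, assuming PyYAML, loads the three null spellings alongside an explicit empty string. The `blank` key is added here for illustration.

```python
import yaml

doc = 'empty:\nnothing: null\ntilde: ~\nblank: ""\n'
data = yaml.safe_load(doc)

print(data)
# {'empty': None, 'nothing': None, 'tilde': None, 'blank': ''}
```

Code that calls string methods on these values will fail on `None` but work on `''`, which is why the distinction matters.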
Special Characters in Strings#
| # These need quoting
colon: "key: value"
hash: "not a # comment"
bracket: "[not a list]"
brace: "{not a map}"
at: "@username"
ampersand: "&anchor"
asterisk: "*reference"
|
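Forgetting the quotes does not always raise an error; sometimes the value is silently truncated. A minimal PyYAML sketch of the hash case:

```python
import yaml

# Without quotes, " #" starts a comment and the rest of the line is dropped.
unquoted = yaml.safe_load("hash: not a # comment")
quoted = yaml.safe_load('hash: "not a # comment"')

print(unquoted)  # {'hash': 'not a'}
print(quoted)    # {'hash': 'not a # comment'}
```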
Anchors and Aliases#
| # Define anchor
defaults: &defaults
  timeout: 30
  retries: 3
# Use alias
production:
  <<: *defaults
  timeout: 60 # Override
development:
  <<: *defaults
  retries: 1
|
Gotcha: Merge Order#
| base: &base
  a: 1
  b: 2
derived:
  <<: *base
  b: 3 # This wins (b=3)
# But if reversed:
derived:
  b: 3
  <<: *base # Base values don't override (b=3 still)
|
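The merge behaviour can be checked directly. This sketch assumes PyYAML, whose safe loader supports the `<<` merge key; the keys `merge_first` and `merge_last` are renamed from `derived` so both variants fit in one document.

```python
import yaml

doc = """
base: &base
  a: 1
  b: 2
merge_first:
  <<: *base
  b: 3
merge_last:
  b: 3
  <<: *base
"""

data = yaml.safe_load(doc)

# Explicit keys win over merged keys regardless of where << appears.
print(data["merge_first"] == data["merge_last"])  # True
print(data["merge_last"]["b"])                    # 3
```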
Empty Collections#
| # Empty list
items: []
# Empty map
config: {}
# NOT the same as null
items: # This is null, not empty list
|
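The null-versus-empty distinction tends to surface as a `TypeError` when code iterates over the value. One common guard, sketched here with PyYAML and an illustrative `empty_items` key, is to normalise null to an empty list after loading.

```python
import yaml

data = yaml.safe_load("items:\nempty_items: []\n")
print(data)  # {'items': None, 'empty_items': []}

# Iterating over None would raise TypeError, so fall back to an empty list.
items = data["items"] or []
print(len(items))  # 0
```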
Numeric Strings#
| # Phone numbers can become integers (and lose leading zeros)
phone: 0123456789 # Integer 123456789 under YAML 1.2 rules; parser-dependent
# Version numbers become floats
version: 1.10 # Becomes 1.1
# Zip codes
zip: 01234 # Becomes 668 (octal in 1.1) or 1234
# Always quote these
phone: "0123456789"
version: "1.10"
zip: "01234"
|
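The numbers in those comments follow from ordinary float and octal conversion, which plain Python can reproduce without a YAML parser:

```python
# A float cannot keep the trailing zero of a version string.
print(float("1.10"))     # 1.1

# The same digits read as octal (YAML 1.1) versus decimal.
print(int("01234", 8))   # 668
print(int("01234", 10))  # 1234
```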
Boolean Ambiguity#
All of these are boolean true in YAML 1.1:
| - true
- True
- TRUE
- yes
- Yes
- YES
- on
- On
- ON
|
And these are false:
| - false
- False
- FALSE
- no
- No
- NO
- off
- Off
- OFF
|
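When generating YAML by hand or reviewing config values, a small check for these spellings can flag strings that need quotes. The helper below is a hypothetical standard-library sketch covering only the forms listed above.

```python
_WORDS = ("true", "false", "yes", "no", "on", "off")

# Only lowercase, Capitalized, and UPPERCASE spellings are listed as booleans.
YAML11_BOOL_FORMS = {
    form for word in _WORDS for form in (word, word.capitalize(), word.upper())
}


def looks_like_yaml11_bool(value: str) -> bool:
    """Return True if an unquoted scalar would be read as a YAML 1.1 boolean."""
    return value in YAML11_BOOL_FORMS


print(looks_like_yaml11_bool("NO"))      # True
print(looks_like_yaml11_bool("Norway"))  # False
```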
Comments#
| # Full line comment
key: value # Inline comment
# But NOT inside quotes
key: "value # not a comment"
|
JSON is Valid YAML#
| # This works in YAML
{"key": "value", "list": [1, 2, 3]}
# Mixed styles (valid but ugly)
items:
- {"inline": "json"}
- normal: yaml
|
Validation#
| # Python
python -c "import yaml; yaml.safe_load(open('config.yaml'))"
# yamllint
yamllint config.yaml
# yq (like jq for YAML)
yq eval '.' config.yaml
|
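The Python one-liner can be expanded into a small function that reports where parsing failed. This is a sketch assuming PyYAML; `validate_yaml` is a hypothetical helper name, and the `problem_mark` attribute is read defensively because not every error carries one.

```python
from typing import Optional

import yaml


def validate_yaml(text: str) -> Optional[str]:
    """Return None if the text parses, otherwise a short error description."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        mark = getattr(exc, "problem_mark", None)
        if mark is not None:
            # Marks are zero-based; report one-based positions.
            return f"line {mark.line + 1}, column {mark.column + 1}: {exc}"
        return str(exc)
    return None


print(validate_yaml("key: value"))           # None
print(validate_yaml("items: [1, 2") is None)  # False
```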
yamllint Configuration#
| # .yamllint
extends: default
rules:
  line-length:
    max: 120
  truthy:
    allowed-values: ['true', 'false']
  indentation:
    spaces: 2
|
Safe Loading#
| import yaml
# DANGEROUS - allows arbitrary code execution
data = yaml.load(file, Loader=yaml.Loader)
# SAFE - use this
data = yaml.safe_load(file)
# Or explicitly
data = yaml.load(file, Loader=yaml.SafeLoader)
|
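To illustrate why the safe loader matters, the sketch below feeds it a document carrying a Python-specific tag. With PyYAML's safe loader the tag is refused instead of being executed; the payload string is a made-up example.

```python
import yaml

# A tag that the unsafe loader would turn into a call to os.system.
malicious = "!!python/object/apply:os.system ['echo pwned']"

try:
    yaml.safe_load(malicious)
    rejected = False
except yaml.YAMLError:
    # The safe loader has no constructor for python/* tags.
    rejected = True

print(rejected)  # True
```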
Quick Reference: When to Quote#
Always quote:
- Country codes (NO, AT)
- Version numbers (1.0, 2.10)
- Time values (12:30)
- Values starting with special chars
- Phone numbers, zip codes
- Anything that looks like a number but isn’t
- Values containing colons or hashes
| # Safe defaults
country: "NO"
version: "1.0"
time: "12:30"
phone: "555-1234"
path: "/some/path"
regex: "^[a-z]+$"
|
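If the YAML is generated by a program, another option is to let the library decide on quoting. The sketch below assumes PyYAML and checks that ambiguous strings survive a dump-and-load round trip; the `settings` dictionary is illustrative.

```python
import yaml

settings = {
    "country": "NO",
    "version": "1.10",
    "zip": "01234",
    "time": "12:30",
}

# safe_dump adds quotes where a plain scalar would be read as another type.
text = yaml.safe_dump(settings)
print(text)

round_tripped = yaml.safe_load(text)
print(round_tripped == settings)  # True
```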
YAML’s “human-readable” design created a minefield of implicit behaviors. The fix is simple: when in doubt, quote it. Your future self debugging a production outage at 2 AM will thank you.
Use yamllint in CI, prefer explicit types, and remember: if it looks like it might be magic, it probably is.