YAML: The Complete Guide for 2026

February 11, 2026

YAML is everywhere in modern development. Kubernetes manifests, Docker Compose files, GitHub Actions workflows, Ansible playbooks, CI/CD pipelines — if you work with infrastructure or DevOps, you read and write YAML daily. Yet YAML is also one of the most misunderstood formats. Its deceptive simplicity hides a complex specification with subtle gotchas that have caused production outages and security vulnerabilities.

This guide covers YAML from the ground up: syntax fundamentals, data types, advanced features like anchors and multi-line strings, real-world usage patterns, security pitfalls, and the differences between YAML versions. Whether you are writing your first Kubernetes manifest or debugging a tricky indentation error, this is the reference you need.

What Is YAML and Its History

YAML stands for "YAML Ain't Markup Language", a recursive acronym; it originally stood for "Yet Another Markup Language". It was created by Clark Evans, Ingy döt Net, and Oren Ben-Kiki, and the first draft specification appeared in 2001. YAML 1.0 arrived in 2004, YAML 1.1 in 2005, and YAML 1.2 in 2009. The 1.2 line is still current: its latest revision, 1.2.2, was published in 2021 and clarifies the text without changing the language.

The design goal was a data serialization format that is human-readable first and machine-parseable second. Unlike XML and JSON, which prioritize unambiguous machine parsing, YAML prioritizes how the document looks to a human editor. Indentation replaces braces. Quotes are optional for most strings. Comments are first-class. The result is a format that looks clean on screen but carries a complex specification underneath.

YAML vs JSON vs TOML

Before diving into syntax, it helps to understand where YAML fits relative to the other popular configuration and data formats.

Feature	YAML	JSON	TOML
Comments	Yes (#)	No	Yes (#)
Multi-line strings	Yes (| and >)	No	Yes (triple quotes)
Anchors/references	Yes (& and *)	No	No
Indentation-sensitive	Yes	No	No
Multiple documents	Yes (---)	No	No
Implicit type coercion	Yes (gotcha)	No	No
Best for	Complex config, IaC	APIs, data exchange	App configuration

Choose YAML when you need deep nesting, anchors for DRY configuration, multi-document files, or when the ecosystem requires it (Kubernetes, Ansible). Choose JSON for APIs and machine-to-machine data. Choose TOML for flat-to-moderate configuration where simplicity and unambiguous parsing matter most.

Basic Syntax: Scalars and Data Types

Every YAML document is built from three primitives: scalars (single values), sequences (lists), and mappings (key-value pairs). Let us start with scalars.

Strings

Strings are the most common value type and can be written several ways:

# Unquoted (plain) strings — no quotes needed for most text
name: John Doe
city: San Francisco

# Single-quoted — no escape sequences, literal backslashes
path: 'C:\Users\dev\config'
regex: '\d+\.\d+'

# Double-quoted — supports escape sequences like \n, \t, \"
greeting: "Hello,\nWorld!"
tab_separated: "col1\tcol2\tcol3"

The rule of thumb: use plain strings when there is no ambiguity, single quotes when you need literal backslashes, and double quotes when you need escape sequences or when the string starts with a special character.

Numbers, Booleans, and Null

# Integers
port: 8080
negative: -42
octal: 0o755        # YAML 1.2 octal notation
hex: 0xFF           # hexadecimal

# Floats
pi: 3.14159
scientific: 6.626e-34
infinity: .inf
not_a_number: .nan

# Booleans (YAML 1.2: only true/false)
debug: true
verbose: false

# Null (multiple representations)
value1: null
value2: ~
value3:              # empty value is also null

Pay special attention to booleans. In YAML 1.1, the words yes, no, on, off, y, n are all interpreted as booleans. YAML 1.2 restricts booleans to only true and false, but many parsers still use 1.1 rules. When in doubt, quote your strings.
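
The difference between the two rule sets can be modeled in a few lines of Python. This is a simplified sketch of boolean resolution only, not a real parser: the function name resolve_bool and its version parameter are hypothetical, and the word lists are taken from the YAML 1.1 boolean type and the YAML 1.2 core schema. Individual parsers may recognize a slightly different set.

```python
# Words the YAML 1.1 boolean type recognizes (assumption: spec word list, not any one parser)
YAML11_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML11_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}

# Words the YAML 1.2 core schema recognizes
YAML12_TRUE = {"true", "True", "TRUE"}
YAML12_FALSE = {"false", "False", "FALSE"}


def resolve_bool(plain_scalar, version="1.2"):
    """Return True/False if the unquoted scalar is a boolean, else the string itself."""
    if version == "1.1":
        true_words, false_words = YAML11_TRUE, YAML11_FALSE
    else:
        true_words, false_words = YAML12_TRUE, YAML12_FALSE
    if plain_scalar in true_words:
        return True
    if plain_scalar in false_words:
        return False
    return plain_scalar


print(resolve_bool("NO", version="1.1"))  # False
print(resolve_bool("NO", version="1.2"))  # NO
```

Quoting the value in the YAML source bypasses this lookup entirely, which is why quoting is the portable fix.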

Collections: Sequences and Mappings

Sequences (Lists)

Sequences are ordered lists of values, indicated by a dash and space:

# Block style sequence
fruits:
  - apple
  - banana
  - cherry

# Nested sequences
matrix:
  - [1, 2, 3]
  - [4, 5, 6]
  - [7, 8, 9]

# Sequence of mappings
employees:
  - name: Alice
    role: engineer
  - name: Bob
    role: designer
  - name: Carol
    role: manager

Mappings (Dictionaries)

Mappings are unordered collections of key-value pairs:

# Simple mapping
server:
  host: 0.0.0.0
  port: 8080
  workers: 4

# Nested mappings
database:
  primary:
    host: db-primary.example.com
    port: 5432
  replica:
    host: db-replica.example.com
    port: 5432

Block Style vs Flow Style

YAML offers two notation styles. Block style uses indentation and newlines. Flow style uses braces and brackets, similar to JSON:

# Block style (preferred for configuration files)
server:
  host: 0.0.0.0
  port: 8080
  features:
    - logging
    - metrics
    - tracing

# Flow style (compact, JSON-like)
server: {host: 0.0.0.0, port: 8080, features: [logging, metrics, tracing]}

# You can mix styles — flow inside block is common
endpoints:
  - {path: /api/users, method: GET}
  - {path: /api/users, method: POST}
  - {path: /api/health, method: GET}

Block style is more readable for configuration files. Flow style is useful for short, self-contained values that would waste vertical space in block style. Many Kubernetes examples use flow style for label selectors and small inline objects.

Multi-Line Strings: Literal and Folded Blocks

YAML's multi-line string handling is one of its most powerful and most confusing features. There are two block scalar styles, each with modifiers for chomping trailing newlines.

Literal Block Scalar (|)

Preserves newlines exactly as written. Each line break in the YAML becomes a line break in the parsed string:

# Literal block — preserves line breaks
script: |
  #!/bin/bash
  echo "Starting deployment"
  kubectl apply -f manifests/
  echo "Deployment complete"

# The parsed value is:
# "#!/bin/bash\necho \"Starting deployment\"\nkubectl apply -f manifests/\necho \"Deployment complete\"\n"

Folded Block Scalar (>)

Folds newlines into spaces, turning a paragraph into a single long line. Empty lines become actual newlines:

# Folded block — newlines become spaces
description: >
  This is a long description that spans
  multiple lines in the YAML file but will
  be parsed as a single paragraph with
  spaces replacing the line breaks.

# The parsed value is:
# "This is a long description that spans multiple lines..."
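
To make the folding rule concrete, here is a rough Python model of it. It is a sketch under stated assumptions, not the full algorithm from the specification: the function name fold is hypothetical, the input is the block content with its indentation already removed, default (clip) chomping is assumed, and more-indented lines and leading blank lines, which real parsers treat specially, are ignored.

```python
def fold(block):
    """Approximate YAML '>' folding for already-dedented block content."""
    lines = block.split("\n")
    while lines and lines[-1] == "":
        lines.pop()                      # trailing blank lines are handled by chomping

    pieces = []
    blank_run = 0                        # number of empty lines seen since the last text line
    for line in lines:
        if line == "":
            blank_run += 1
            continue
        if pieces:
            # A single line break folds to a space; N empty lines become N newlines
            pieces.append("\n" * blank_run if blank_run else " ")
        pieces.append(line)
        blank_run = 0

    if not pieces:
        return ""
    return "".join(pieces) + "\n"        # clip: exactly one trailing newline


print(repr(fold("one\ntwo\n\nthree\n")))  # 'one two\nthree\n'
```

Note that one empty line in the source yields one newline in the result, not two: the line break that ends the previous text line is consumed by the fold.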

Chomping Indicators

Control what happens with the trailing newline after the block content:

# Clip (default): single trailing newline
clip: |
  text here

# Strip (-): no trailing newline
strip: |-
  text here

# Keep (+): preserve all trailing newlines
keep: |+
  text here


# (both blank lines are kept: the value is "text here\n\n\n")

The |- (literal + strip) combination is especially common in Kubernetes ConfigMaps and CI/CD pipelines where you want exact content without a trailing newline.
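
The three chomping modes differ only in how they treat the line breaks at the end of the block. A minimal Python sketch of that behavior follows; the function name chomp and its indicator argument are hypothetical, and the input is assumed to be the raw block content including every trailing line break.

```python
def chomp(raw_block, indicator=""):
    """Apply YAML chomping to raw block content.

    indicator: "" for clip (default), "-" for strip, "+" for keep.
    """
    body = raw_block.rstrip("\n")
    if indicator == "-":
        return body                      # strip: no trailing newline at all
    if indicator == "+":
        return raw_block                 # keep: every trailing newline survives
    return body + "\n" if body else ""   # clip: exactly one trailing newline


raw = "text here\n\n\n"
print(repr(chomp(raw)))       # 'text here\n'
print(repr(chomp(raw, "-")))  # 'text here'
print(repr(chomp(raw, "+")))  # 'text here\n\n\n'
```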

Anchors and Aliases

Anchors (&) and aliases (*) let you define a value once and reuse it throughout the document. This is YAML's mechanism for DRY (Don't Repeat Yourself) configuration:

# Define an anchor with &
defaults: &default_settings
  timeout: 30
  retries: 3
  log_level: info

# Reference it with *
development:
  <<: *default_settings
  log_level: debug          # Override one value

staging:
  <<: *default_settings
  timeout: 60               # Override one value

production:
  <<: *default_settings
  retries: 5
  timeout: 120

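The << key is the merge key, which merges the anchored mapping into the current mapping. Keys written explicitly in the current mapping override the merged values, wherever they appear relative to the << line. The merge key comes from the YAML 1.1 type repository rather than the YAML 1.2 core schema, so support depends on the parser, although most popular parsers accept it. This pattern is used extensively in CI/CD configurations to avoid repeating common job settings.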
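
The merge semantics can be sketched with plain Python dictionaries. This models what a parser does after it has resolved the aliases; the function name resolve_merge is hypothetical, only one level of merging is handled, and the precedence rules follow the YAML 1.1 merge key description (explicit keys win, and in a list of sources earlier entries win over later ones).

```python
def resolve_merge(mapping):
    """Return a new dict with the '<<' merge key expanded (one level only)."""
    sources = mapping.get("<<", [])
    if isinstance(sources, dict):            # single alias, as in  <<: *default_settings
        sources = [sources]

    merged = {}
    for source in sources:                   # list of aliases, as in  <<: [*a, *b]
        for key, value in source.items():
            merged.setdefault(key, value)    # earlier sources win over later ones

    # Keys written explicitly in the mapping always override merged keys
    merged.update({k: v for k, v in mapping.items() if k != "<<"})
    return merged


default_settings = {"timeout": 30, "retries": 3, "log_level": "info"}
development = resolve_merge({"<<": default_settings, "log_level": "debug"})
print(development)  # {'timeout': 30, 'retries': 3, 'log_level': 'debug'}
```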

# Anchors on individual values
max_connections: &max_conn 100

database:
  pool_size: *max_conn      # Resolves to 100
  max_overflow: *max_conn   # Also 100

Important: Anchors and aliases are a document-level feature. They cannot reference values across separate YAML documents (separated by ---), and they cannot reference values in other files.

Tags and Custom Types

YAML tags explicitly declare the type of a value, overriding the parser's automatic type detection:

# Force a value to be a specific type
explicit_string: !!str 123       # String "123", not integer
explicit_int: !!int "456"        # Integer 456, not string
explicit_float: !!float "3.14"   # Float 3.14
explicit_bool: !!bool "true"     # Boolean true
explicit_null: !!null ""         # Null

# Useful for the Norway problem
country_code: !!str NO           # String "NO", not boolean false
version: !!str 1.0               # String "1.0", not float

Tags are essential when you need to prevent YAML's implicit type resolution from misinterpreting your data. The !!str tag is the most commonly used, typically to force numeric-looking values to remain as strings.

Multiple Documents in One File

YAML supports multiple documents in a single file, separated by --- (document start) and optionally terminated by ... (document end):

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DATABASE_URL: postgres://localhost/mydb
---
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myapp:latest

This is a fundamental pattern in Kubernetes, where a single YAML file often contains multiple related resources. The --- separator tells the parser to start a fresh document. You can pass such a file to kubectl apply -f, and each document is processed as a separate resource.
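
As a rough illustration of how a stream breaks into documents, the following Python sketch splits on bare marker lines. It is deliberately naive and the function name split_documents is hypothetical: it assumes markers appear alone on a line at column 0, drops empty documents, and ignores directives and content on the same line as ---. A real YAML library should be used for anything beyond a quick count.

```python
def split_documents(stream):
    """Split a YAML stream into raw document strings on bare '---' and '...' lines."""
    documents, current = [], []

    def flush():
        # Keep the collected lines only if they contain something besides whitespace
        if any(line.strip() for line in current):
            documents.append("\n".join(current))
        current.clear()

    for line in stream.splitlines():
        if line.rstrip() in ("---", "..."):
            flush()                      # a marker ends the document collected so far
        else:
            current.append(line)
    flush()                              # the last document needs no closing marker
    return documents


stream = "---\nkind: ConfigMap\n---\nkind: Service\n...\n"
print(split_documents(stream))  # ['kind: ConfigMap', 'kind: Service']
```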

YAML in Practice

Kubernetes Manifests

Kubernetes is probably the most visible consumer of YAML. Every resource, from a Deployment to a ConfigMap, is typically declared in a manifest like this one:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  labels:
    app: api
    tier: backend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myregistry/api:v2.1.0
          ports:
            - containerPort: 8080
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: url
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 30

Docker Compose

services:
  web:
    build: .
    ports:
      - "8080:8080"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=postgres://db:5432/myapp
    depends_on:
      db:
        condition: service_healthy
    deploy:
      replicas: 2
      resources:
        limits:
          memory: 512M

  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: myapp
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  pgdata:

GitHub Actions

name: CI Pipeline
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [18, 20, 22]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: npm
      - run: npm ci
      - run: npm test
      - run: npm run build

Ansible Playbooks

---
- name: Configure web servers
  hosts: webservers
  become: true
  vars:
    http_port: 80
    max_clients: 200
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
        update_cache: true

    - name: Copy nginx config
      template:
        src: templates/nginx.conf.j2
        dest: /etc/nginx/sites-available/default
      notify: restart nginx

    - name: Ensure nginx is running
      service:
        name: nginx
        state: started
        enabled: true

  handlers:
    - name: restart nginx
      service:
        name: nginx
        state: restarted

Common Gotchas and Pitfalls

The Norway Problem

In YAML 1.1, the following bare words are all interpreted as booleans:

# YAML 1.1 boolean values (all of these are booleans, NOT strings):
# true, True, TRUE, yes, Yes, YES, on, On, ON, y, Y
# false, False, FALSE, no, No, NO, off, Off, OFF, n, N

# This means country codes break:
countries:
  - GB    # string "GB"
  - US    # string "US"
  - NO    # BOOLEAN false (not the string "NO")!
  - FR    # string "FR"

# Fix: quote the values
countries:
  - "GB"
  - "US"
  - "NO"  # Now correctly a string
  - "FR"

Indentation Errors

YAML uses spaces only, never tabs. Inconsistent indentation is the most common source of YAML parse errors:

# WRONG — mixing indent levels
server:
  host: localhost
    port: 8080       # Error: unexpected indentation

# WRONG — tabs instead of spaces
server:
	host: localhost    # Error: tabs are not allowed

# CORRECT — consistent 2-space indentation
server:
  host: localhost
  port: 8080
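
A small pre-commit style check can catch the mechanical cases before a parser ever sees the file. The sketch below is an assumption-laden lint, not a validator: the function name find_indent_problems is hypothetical, the 2-space grid is a style convention rather than a YAML requirement, and block scalar content may trigger false positives. It flags tabs and off-grid indentation only; a logically over-indented key still needs a real parser to detect.

```python
def find_indent_problems(text, indent=2):
    """Return (line_number, message) pairs for tab or off-grid indentation."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip(" \t")
        if not stripped or stripped.startswith("#"):
            continue                                 # skip blank lines and comments
        leading = line[: len(line) - len(stripped)]  # the indentation characters only
        if "\t" in leading:
            problems.append((number, "tab in indentation"))
        elif len(leading) % indent != 0:
            problems.append((number, "indentation is not a multiple of %d spaces" % indent))
    return problems


sample = "server:\n\thost: localhost\n   port: 8080\n"
for number, message in find_indent_problems(sample):
    print(number, message)
# 2 tab in indentation
# 3 indentation is not a multiple of 2 spaces
```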

Unquoted Strings That Look Like Other Types

# Without quotes, several of these are NOT strings:
version: 1.0          # Float 1.0, not string "1.0"
version: 1.2.3        # String "1.2.3" (not a valid number)
time: 12:30           # Sexagesimal number 750 in YAML 1.1!
zipcode: 00501        # Octal 321 in YAML 1.1, integer 501 in the YAML 1.2 core schema

# Always quote values that should remain strings:
version: "1.0"
time: "12:30"
zipcode: "00501"
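
The integer cases above can be reproduced with a few regular expressions. This is a simplified model, not a parser: the function name resolve_int is hypothetical, signs, underscores, hexadecimal, and binary forms are left out, and the patterns are adapted from the YAML 1.1 integer type and the YAML 1.2 core schema.

```python
import re


def resolve_int(plain_scalar, version="1.2"):
    """Return an int if the unquoted scalar resolves to one, else the string itself."""
    s = plain_scalar
    if version == "1.1":
        if re.fullmatch(r"0[0-7]+", s):
            return int(s, 8)                         # leading zero means octal
        if re.fullmatch(r"[1-9][0-9]*(:[0-5]?[0-9])+", s):
            total = 0
            for part in s.split(":"):                # base-60 digits, most significant first
                total = total * 60 + int(part)
            return total
        if re.fullmatch(r"0|[1-9][0-9]*", s):
            return int(s)
    else:
        if re.fullmatch(r"0o[0-7]+", s):
            return int(s[2:], 8)                     # explicit 0o prefix for octal
        if re.fullmatch(r"[0-9]+", s):
            return int(s)                            # leading zeros are plain decimal
    return s


print(resolve_int("00501", version="1.1"))  # 321
print(resolve_int("00501", version="1.2"))  # 501
print(resolve_int("12:30", version="1.1"))  # 750
print(resolve_int("12:30", version="1.2"))  # 12:30
```

Either way the leading zeros of a ZIP code are lost unless the value is quoted, so quoting remains the safe choice regardless of the YAML version.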

Colon in Values

# A colon followed by a space starts a mapping value
# This breaks:
message: Error: something went wrong   # Parse error: mapping values are not allowed here

# Fix: quote the entire value
message: "Error: something went wrong"

# Or use a block scalar
message: |
  Error: something went wrong
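
When YAML is generated from code or templates, a conservative quoting helper avoids this class of error. The sketch below is a heuristic under stated assumptions, not the specification's plain-scalar rules: the name quote_if_needed and the INDICATORS set are hypothetical, it may quote more than strictly necessary, and it covers only syntactic hazards, not type-lookalike values such as NO or 1.0. It relies on the fact that a JSON string literal is also a valid YAML double-quoted scalar.

```python
import json

# Characters that can change meaning at the start of a plain scalar (conservative set)
INDICATORS = set("-?:,[]{}#&*!|>'\"%@`")


def quote_if_needed(value):
    """Return value unchanged if it looks safe as a plain scalar, else double-quote it."""
    risky = (
        value == ""
        or value != value.strip()        # leading or trailing whitespace would be lost
        or value[0] in INDICATORS
        or ": " in value                 # would be read as a nested mapping
        or value.endswith(":")
        or " #" in value                 # would start a comment
    )
    return json.dumps(value) if risky else value


print("message: " + quote_if_needed("Error: something went wrong"))
# message: "Error: something went wrong"
print("host: " + quote_if_needed("localhost"))
# host: localhost
```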

YAML Security: safe_load vs load

YAML has a well-documented security vulnerability in many parser implementations. The full YAML specification allows tags that can instantiate arbitrary objects, which means loading untrusted YAML can execute arbitrary code.

# DANGEROUS — this executes a system command in Python's PyYAML
!!python/object/apply:os.system
  args: ['rm -rf /']

# Another attack vector
!!python/object/apply:subprocess.check_output
  args: [['cat', '/etc/passwd']]

The fix is simple but critical: always use safe loading functions.

# Python — ALWAYS use safe_load
import yaml

# DANGEROUS: never use with untrusted input (yaml.Loader is equally unsafe)
# data = yaml.load(content, Loader=yaml.UnsafeLoader)

# SAFE — restricts to basic YAML types
data = yaml.safe_load(content)

# For writing
output = yaml.safe_dump(data)

// Node.js — js-yaml defaults to safe mode
const yaml = require('js-yaml');
const data = yaml.load(content);           // Safe by default since js-yaml v4
// yaml.load(content, { schema: yaml.DEFAULT_SCHEMA })  // Explicitly safe

# Ruby — use safe_load
require 'yaml'
data = YAML.safe_load(content)             # Safe
# data = YAML.load(content)               # Dangerous in older Ruby versions

YAML 1.2 vs 1.1 Differences

YAML 1.2 (2009) cleaned up several problematic behaviors from 1.1. The most important changes:

Booleans: YAML 1.2 only recognizes true and false as booleans. The 1.1 values yes, no, on, off, y, n are treated as plain strings. This fixes the Norway problem.

Octals: YAML 1.2 uses 0o777 for octal (matching modern language conventions). YAML 1.1 used 0777 (leading zero), which was confusing since most people think of leading-zero numbers as decimal.

JSON compatibility: YAML 1.2 is a strict superset of JSON. Every valid JSON document is also valid YAML 1.2. This was not guaranteed in 1.1.

Sexagesimal numbers: YAML 1.1 interprets 12:30 as the number 750 (12 * 60 + 30). YAML 1.2 treats it as the string "12:30".

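Parser reality: Although YAML 1.2 was published in 2009, many widely used parsers still default to 1.1 behavior. Python's PyYAML uses 1.1 rules. In Python, ruamel.yaml defaults to YAML 1.2, and strictyaml sidesteps the issue by parsing a restricted subset with no implicit typing. Go's gopkg.in/yaml.v3 and Node.js's js-yaml v4 target YAML 1.2, though each may keep some 1.1 compatibility behavior, so test the specific values you care about.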

YAML Processing in Different Languages

Python

import yaml  # PyYAML — YAML 1.1
# pip install pyyaml

# Read YAML
with open("config.yaml") as f:
    config = yaml.safe_load(f)

# Write YAML
with open("output.yaml", "w") as f:
    yaml.safe_dump(config, f, default_flow_style=False)

# For YAML 1.2 compliance, use ruamel.yaml:
# pip install ruamel.yaml
from ruamel.yaml import YAML
ry = YAML()
with open("config.yaml") as f:
    config = ry.load(f)

Node.js

// npm install js-yaml
const yaml = require('js-yaml');
const fs = require('fs');

// Read YAML
const config = yaml.load(fs.readFileSync('config.yaml', 'utf8'));

// Write YAML
const output = yaml.dump(config, { indent: 2, lineWidth: 120 });
fs.writeFileSync('output.yaml', output);

Go

// go get gopkg.in/yaml.v3
package main

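import (
    "log"
    "os"

    "gopkg.in/yaml.v3"
)

type Config struct {
    Server struct {
        Host string `yaml:"host"`
        Port int    `yaml:"port"`
    } `yaml:"server"`
}

func main() {
    // Read the whole file; stop if it cannot be opened.
    data, err := os.ReadFile("config.yaml")
    if err != nil {
        log.Fatalf("reading config: %v", err)
    }

    // Decode into the typed struct; Unmarshal reports syntax and type errors.
    var config Config
    if err := yaml.Unmarshal(data, &config); err != nil {
        log.Fatalf("parsing config: %v", err)
    }
    log.Printf("server: %s:%d", config.Server.Host, config.Server.Port)
}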

Ruby

require 'yaml'
require 'date'  # Date is referenced in permitted_classes below

# Read YAML (safe_load for untrusted input)
config = YAML.safe_load(File.read('config.yaml'))

# With permitted classes for custom types
config = YAML.safe_load(
  File.read('config.yaml'),
  permitted_classes: [Date, Time]
)

# Write YAML
File.write('output.yaml', config.to_yaml)

Frequently Asked Questions

What is YAML and what does it stand for?

YAML stands for "YAML Ain't Markup Language" (a recursive acronym). It is a human-readable data serialization format used for configuration files, data exchange, and infrastructure-as-code. YAML uses indentation to represent structure, supports comments, and is the standard format for Kubernetes, Docker Compose, GitHub Actions, and Ansible.

What is the difference between YAML and JSON?

YAML supports comments, multi-line strings, anchors/aliases for reuse, and uses indentation instead of braces. JSON uses braces and brackets, requires quoted keys, has no comments, and is more compact for machine-to-machine exchange. YAML 1.2 is a superset of JSON, meaning any valid JSON is also valid YAML. YAML is preferred for human-edited configuration; JSON is preferred for APIs and data interchange.

What is the Norway problem in YAML?

The Norway problem is when YAML 1.1 interprets the country code "NO" as boolean false instead of the string "NO". YAML 1.1 treats many bare words as booleans: yes/no, on/off, true/false, y/n. The fix is to always quote strings that could be misinterpreted. YAML 1.2 resolved this by only recognizing true/false as booleans, but many parsers still default to 1.1 behavior.

Why should I use yaml.safe_load instead of yaml.load in Python?

In Python's PyYAML, yaml.load() with an unsafe loader (yaml.Loader or yaml.UnsafeLoader) can execute arbitrary code through YAML tags like !!python/object/apply:os.system. This is a critical security vulnerability when loading YAML from untrusted sources. yaml.safe_load() restricts parsing to standard types (strings, numbers, lists, dicts) and prevents code execution. Always use yaml.safe_load() unless you need custom object deserialization from trusted input only.

What is the difference between YAML 1.1 and YAML 1.2?

YAML 1.2 (2009) improved several areas: booleans are limited to true/false only (no more yes/no/on/off), octals use 0o prefix, sexagesimal numbers like 12:30 are treated as strings, and JSON compatibility is guaranteed. Many parsers like PyYAML still use 1.1 rules. Use ruamel.yaml for Python, gopkg.in/yaml.v3 for Go, or js-yaml v4 for Node.js for YAML 1.2 support.

Conclusion

YAML is the dominant configuration format for cloud-native infrastructure, CI/CD pipelines, and DevOps automation. Its readability makes it excellent for configuration that humans maintain, and its features like anchors, multi-line strings, and multi-document support address real needs in complex configurations.

However, YAML's implicit type coercion, indentation sensitivity, and security implications mean you must approach it with awareness. Quote strings that could be misinterpreted. Always use safe loading functions. Validate your YAML before deploying it. Understand whether your parser uses YAML 1.1 or 1.2 rules.

Master these fundamentals and YAML becomes a reliable, expressive tool rather than a source of mysterious bugs.
