JSON vs YAML vs TOML: A Comprehensive Comparison for Developers
Every software project needs configuration. Whether you are setting up a web server, defining a CI/CD pipeline, or publishing a package, you are choosing a format to store structured settings. The three most common options in 2026 are JSON, YAML, and TOML, and each one makes fundamentally different tradeoffs between simplicity, readability, and expressiveness.
This guide provides a thorough, side-by-side comparison to help you make an informed choice. We will look at syntax, data types, real-world use cases, known pitfalls, and ecosystem support. By the end, you will know exactly which format fits your project and why.
The Same Config in Three Formats
Before diving into theory, let us look at the same configuration expressed in all three formats. This is a realistic application config with a server section, database settings, and a list of allowed origins.
JSON
{
  "app_name": "my-web-app",
  "version": "2.1.0",
  "debug": false,
  "server": {
    "host": "0.0.0.0",
    "port": 8080,
    "workers": 4,
    "tls": {
      "enabled": true,
      "cert": "/etc/ssl/cert.pem",
      "key": "/etc/ssl/key.pem"
    }
  },
  "database": {
    "host": "localhost",
    "port": 5432,
    "name": "myapp_production",
    "pool_size": 20,
    "timeout_ms": 5000
  },
  "allowed_origins": [
    "https://example.com",
    "https://app.example.com",
    "https://staging.example.com"
  ],
  "logging": {
    "level": "info",
    "format": "json",
    "file": "/var/log/myapp/app.log"
  }
}
YAML
# Application configuration
app_name: my-web-app
version: "2.1.0" # Quoted defensively; a two-part version such as 1.0 would be read as a float
debug: false
server:
  host: 0.0.0.0
  port: 8080
  workers: 4
  tls:
    enabled: true
    cert: /etc/ssl/cert.pem
    key: /etc/ssl/key.pem
database:
  host: localhost
  port: 5432
  name: myapp_production
  pool_size: 20
  timeout_ms: 5000
allowed_origins:
  - https://example.com
  - https://app.example.com
  - https://staging.example.com
logging:
  level: info
  format: json
  file: /var/log/myapp/app.log
TOML
# Application configuration
app_name = "my-web-app"
version = "2.1.0"
debug = false
# Top-level keys must appear before the first [table] header
allowed_origins = [
  "https://example.com",
  "https://app.example.com",
  "https://staging.example.com",
]
[server]
host = "0.0.0.0"
port = 8080
workers = 4
[server.tls]
enabled = true
cert = "/etc/ssl/cert.pem"
key = "/etc/ssl/key.pem"
[database]
host = "localhost"
port = 5432
name = "myapp_production"
pool_size = 20
timeout_ms = 5000
[logging]
level = "info"
format = "json"
file = "/var/log/myapp/app.log"
At a glance, all three are readable. But the differences become meaningful at scale and under pressure, when a production deployment depends on getting the syntax exactly right.
Feature-by-Feature Comparison
The following table compares the three formats across the dimensions that matter most for configuration files. A ✓ marks an advantage, a ✗ marks a disadvantage or missing feature, and a ~ marks partial support.
| Feature | JSON | YAML | TOML |
|---|---|---|---|
| Comments | ✗ No | ✓ Yes (#) | ✓ Yes (#) |
| Native date/time type | ✗ No | ~ Implicit | ✓ RFC 3339 |
| Trailing commas | ✗ No | N/A | ✓ Yes (in arrays) |
| Indentation-sensitive | ✓ No | ✗ Yes | ✓ No |
| Implicit type coercion | ✓ None | ✗ Extensive | ✓ None |
| Multi-document support | ✗ No | ✓ Yes (---) | ✗ No |
| Anchors & references | ✗ No | ✓ Yes (& / *) | ✗ No |
| Deep nesting (4+ levels) | ✓ Natural | ✓ Natural | ~ Verbose |
| Spec complexity | Short (RFC 8259) | Very long (80+ pages) | Short (~3K words) |
| Multiline strings | ✗ No (use \n) | ✓ Yes (\| and >) | ✓ Yes (""" and ''') |
| Null value | ✓ null | ✓ null / ~ | ✗ No |
| API data interchange | ✓ Standard | ~ Possible | ✗ Not designed for it |
| Duplicate key handling | ~ Undefined (impl varies) | ~ Invalid per spec; many parsers keep the last value | ✓ Error (rejected) |
| Primary use case | Data exchange, APIs | Infrastructure config, CI/CD | App config, manifests |
JSON: The Universal Data Format
What It Is
JSON (JavaScript Object Notation) was formalized by Douglas Crockford in the early 2000s, based on a subset of JavaScript syntax. It is defined in RFC 8259 and has become the lingua franca of data interchange on the web. Most modern programming languages either include a JSON parser in the standard library or have a de facto standard third-party one.
Syntax Overview
JSON has six data types: strings, numbers, booleans, null, arrays, and objects. Every string must be double-quoted. Every key must be a string. There are no comments, no trailing commas, and no date type.
{
  "string": "hello world",
  "number": 42,
  "float": 3.14,
  "boolean": true,
  "null_value": null,
  "array": [1, 2, 3],
  "nested": {
    "key": "value"
  }
}
Strengths
- Universal support. Every language, every platform, every tool understands JSON. It is the default for REST APIs, GraphQL responses, browser storage, and inter-service communication.
- Unambiguous parsing. The spec is short and precise. Conformant parsers almost always produce the same result from the same input. The known exceptions are duplicate keys (which the spec allows but discourages) and numbers beyond the range or precision a given parser supports.
- Machine-friendly. JSON is trivially generated and consumed by code. Serialization libraries are fast, well-tested, and available everywhere.
- No surprises. There is no implicit type coercion. A value is exactly the type it looks like.
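The duplicate-key exception mentioned above is easy to observe. The following sketch uses Python's standard json module, which keeps the last value by default, and shows one possible way to reject duplicates instead. The reject_duplicates helper is a hypothetical name introduced here for illustration; it is not part of the json module.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises on a repeated key instead of keeping the last one."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


DOC = '{"port": 8080, "port": 9090}'

# Default behaviour in CPython's json module: the last value silently wins.
default_result = json.loads(DOC)

# With the hook installed, the same document is rejected.
try:
    json.loads(DOC, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

print(default_result)
print(strict_error)
```

Other parsers may keep the first value or raise an error, which is why duplicate keys are best avoided entirely.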
Weaknesses
- No comments. This is the single biggest limitation for configuration files. You cannot explain what a setting does, warn about side effects, or leave notes for future developers.
- Verbose syntax. Every key must be quoted. Objects and arrays require delimiters. A moderately complex config file becomes a wall of curly braces and quotation marks.
- No trailing commas. Adding or removing the last item in an array or object requires modifying two lines, creating noisy diffs in version control.
- No multiline strings. Long strings must use escape sequences like \n, making them hard to read.
- No date type. Dates are stored as strings, and every application has to agree on a format convention (usually ISO 8601) and parse manually.
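As a minimal sketch of the date convention described in the last point, the snippet below stores a timestamp as an ISO 8601 string and converts it by hand after parsing. The deployed_at key is a made-up example field, and the code assumes both sides have agreed on ISO 8601 with an explicit UTC offset.

```python
import json
from datetime import datetime, timezone

# "deployed_at" is a hypothetical field; JSON itself only sees a string here.
DOC = '{"deployed_at": "2026-02-11T09:30:00+00:00"}'

raw = json.loads(DOC)
raw_type = type(raw["deployed_at"]).__name__

# The application, not the parser, turns the string into a datetime.
deployed_at = datetime.fromisoformat(raw["deployed_at"])

# Writing it back requires the reverse conversion.
round_trip = json.dumps({"deployed_at": deployed_at.isoformat()})

print(raw_type)
print(deployed_at == datetime(2026, 2, 11, 9, 30, tzinfo=timezone.utc))
print(round_trip)
```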
Where JSON Dominates
- REST and GraphQL API responses
- package.json, tsconfig.json, package-lock.json
- Browser localStorage and sessionStorage
- Inter-service communication (microservices)
- NoSQL databases (MongoDB, CouchDB)
- Any context where the file is generated and consumed by machines
YAML: The Human-Friendly Heavyweight
What It Is
YAML (YAML Ain't Markup Language) was first released in 2001 and is currently at version 1.2.2 (October 2021). It was designed to be a human-readable data serialization format, and it is technically a superset of JSON: every valid JSON document is also valid YAML.
Syntax Overview
YAML uses indentation (spaces, never tabs) to represent nesting. Colons separate keys from values. Dashes denote list items. It supports a rich set of data types including strings, numbers, booleans, null, dates, arrays, and mappings.
# Scalars
string: hello world
quoted_string: "contains: colon"
number: 42
float: 3.14
boolean: true
null_value: null
date: 2026-02-11
# Sequence (array)
fruits:
  - apple
  - banana
  - cherry
# Mapping (object)
server:
  host: 0.0.0.0
  port: 8080
# Multiline strings
description: |
  This is a block scalar.
  Line breaks are preserved.
  Indentation is stripped.
summary: >
  This is a folded scalar.
  Line breaks become spaces.
  Good for long paragraphs.
# Anchors and references
defaults: &default_settings
  timeout: 30
  retries: 3
production:
  <<: *default_settings
  timeout: 60
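To make the anchor example concrete, here is roughly what a parser that supports the merge key (<<) hands back for the defaults and production mappings, written as plain Python dictionaries rather than through a YAML library. Treat it as an illustration of the resulting data, not of any parser's internals. The merge key comes from the YAML 1.1 type repository, so support can vary between parsers.

```python
# The mapping tagged with &default_settings.
defaults = {"timeout": 30, "retries": 3}

# "<<: *default_settings" copies the anchored keys in first;
# keys written explicitly in the mapping (timeout: 60) then override them.
production = {**defaults, "timeout": 60}

print(production)
print(defaults)
```

The anchored mapping itself is left unchanged; only the mapping that merges it gets the override.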
Strengths
- Readable at any depth. Indentation-based nesting looks clean even at 5+ levels deep, making YAML excellent for deeply nested infrastructure definitions.
- Comments. Full-line and inline comments with #, essential for documenting configuration.
- Anchors and aliases. The & and * operators let you define a value once and reference it elsewhere, reducing repetition in large configs.
- Multi-document support. A single YAML file can contain multiple documents separated by ---, useful for Kubernetes manifests and similar use cases.
- Multiline strings. Block scalars (| for literal, > for folded) handle long text gracefully.
- Expressive. YAML can represent any data structure that JSON can, plus additional types and features.
Weaknesses
- Implicit type coercion (the "Norway Problem"). This is YAML's most infamous issue. In YAML 1.1, the value NO is parsed as boolean false. Country codes, version numbers, and other innocent values are silently reinterpreted:
# These are all booleans in YAML 1.1, not strings:
country: NO # false (Norway disappears)
answer: yes # true
enabled: on # true
disabled: off # false
# This is a float, not a version string:
version: 1.0 # 1.0 (float)
# This is an integer, not a string:
zipcode: 010 # 8 (octal interpretation)
- Indentation sensitivity. A single misplaced space can change the structure of the document without causing a parse error, leading to bugs that are extremely difficult to track down.
- Massive specification. The YAML 1.2 spec is over 80 pages. Most developers understand only a subset of YAML, yet the parser handles the full spec. This leads to unexpected behavior with features developers did not know existed.
- Security concerns. Some YAML parsers support arbitrary object instantiation, which has led to remote code execution vulnerabilities. Always use safe loading functions (yaml.safe_load() in Python, for example).
- Tabs vs spaces. YAML requires spaces for indentation. A single tab character in the indentation is a parse error, but many editors mix tabs and spaces silently.
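Because tabs are invisible in most editors, a small pre-commit style check can catch them before a parser does. The function below is a hypothetical helper written for this article using only the standard library. In practice a linter such as yamllint covers this and much more.

```python
def find_tab_indented_lines(text: str) -> list[int]:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is everything stripped off by lstrip(" \t").
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad.append(number)
    return bad


# Line 2 is indented with a tab, line 3 with two spaces.
SAMPLE = "server:\n\thost: localhost\n  port: 8080\n"

tab_lines = find_tab_indented_lines(SAMPLE)
print(tab_lines)
```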
Where YAML Dominates
- Kubernetes manifests and Helm charts
- GitHub Actions, GitLab CI, CircleCI, and most CI/CD platforms
- Ansible playbooks and roles
- Docker Compose files
- OpenAPI / Swagger specifications
- Any context with deep nesting where readability matters
TOML: The Configuration Specialist
What It Is
TOML (Tom's Obvious, Minimal Language) was created by Tom Preston-Werner (co-founder of GitHub) in 2013, with version 1.0.0 released in January 2021. It was designed specifically for configuration files, not general data serialization. The goal: a format that is easy for humans, unambiguous for parsers, and maps cleanly to a hash table.
Syntax Overview
TOML uses key = value pairs organized into [sections] (tables). It supports strings, integers, floats, booleans, dates, arrays, and tables. All types are explicit and unambiguous.
# Scalars
string = "hello world"
number = 42
float = 3.14
boolean = true
date = 2026-02-11
datetime = 2026-02-11T09:30:00Z
# Arrays
fruits = ["apple", "banana", "cherry"]
# Tables (sections)
[server]
host = "0.0.0.0"
port = 8080
# Nested tables
[server.tls]
enabled = true
cert = "/etc/ssl/cert.pem"
# Inline tables
point = { x = 1, y = 2 }
# Array of tables
[[routes]]
path = "/api"
handler = "api_handler"
[[routes]]
path = "/health"
handler = "health_check"
# Multi-line strings
description = """
This is a multi-line string.
Quotes and backslashes work as expected."""
# Literal string (no escaping)
regex = '\d+\.\d+'
Strengths
- No surprises. Every value has an explicit, unambiguous type. There is no implicit coercion. "yes" is always a string, true is always a boolean, 1.0 is always a float.
- Comments. First-class comment support with #.
- Native dates. RFC 3339 date/time types are built into the language, not parsed implicitly from strings.
- Short specification. A developer can read the entire TOML spec in one sitting. There are no hidden features or ambiguous edge cases.
- Not indentation-sensitive. Structure comes from section headers and key names, not whitespace. This eliminates an entire class of invisible bugs.
- Duplicate key rejection. Redefining a key is a parse error, not a silent override. This prevents accidental data loss in large configs.
- Trailing commas in arrays. Smaller diffs when adding items.
Weaknesses
- Verbose deep nesting. At 3+ levels of nesting, TOML requires long dotted table headers like [a.b.c.d]. This becomes unwieldy compared to YAML's natural indentation or JSON's nested braces.
- No null type. If you need to express "this value is explicitly absent," TOML cannot do it. You must omit the key entirely.
- No anchors or references. No way to reuse values without external tooling. If the same connection string appears in three places, you write it three times.
- Smaller ecosystem. While growing rapidly, TOML has fewer parser libraries and less tooling than JSON or YAML in some languages.
- Not suitable for data interchange. TOML was designed for config files, not API responses or data serialization between services.
Where TOML Dominates
- Cargo.toml (Rust package manager)
- pyproject.toml (Python project configuration, PEP 517/518/621)
- hugo.toml (Hugo static site generator)
- netlify.toml (Netlify deployment config)
- taplo.toml, deno.json alternatives, and many new tools
- Any application configuration file with moderate nesting
Head-to-Head: Readability
Readability depends on context. Here is the same nested structure in all three formats to illustrate how each handles increasing complexity.
Shallow config (1-2 levels) — All three are fine
At shallow nesting depths, all three formats are readable and the choice is largely a matter of preference. YAML is the most concise, TOML is explicit, and JSON is familiar.
Deep config (4+ levels) — YAML wins
# YAML handles deep nesting naturally
kubernetes:
  deployment:
    spec:
      template:
        spec:
          containers:
            - name: api
              image: myapp:latest
              ports:
                - containerPort: 8080
              env:
                - name: DB_HOST
                  value: postgres.default.svc
# TOML becomes verbose at this depth
[[kubernetes.deployment.spec.template.spec.containers]]
name = "api"
image = "myapp:latest"
[[kubernetes.deployment.spec.template.spec.containers.ports]]
containerPort = 8080
[[kubernetes.deployment.spec.template.spec.containers.env]]
name = "DB_HOST"
value = "postgres.default.svc"
The YAML version is clearly more readable for deeply nested structures. The TOML version repeats the full path for each section, which becomes tedious. This is by design: TOML optimizes for flat-to-moderate configs, not deeply hierarchical data.
Head-to-Head: Safety and Correctness
Safety in configuration means: "Does the parser produce the data structure I expect?" This is where the three formats diverge most dramatically.
The YAML Trap: Implicit Types
# What a developer writes:
countries:
  - GB
  - IE
  - NO
  - FR
# What YAML 1.1 parsers produce:
# ["GB", "IE", false, "FR"]
# Norway just became a boolean.
# More YAML surprises:
version: 1.0 # float, not string "1.0"
port: 0700 # integer 448 (octal), not 700
value: .inf # infinity, not the string ".inf"
answer: yes # boolean true, not string "yes"
YAML 1.2 fixed some of these issues (only true and false are booleans, not yes/no), but many popular parsers still default to YAML 1.1 behavior. Python's PyYAML, one of the most widely used YAML libraries, defaults to YAML 1.1 semantics.
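The difference between the two versions can be sketched without a YAML library. The function below is a deliberately simplified model of how unquoted (plain) scalars are resolved, using the boolean and integer patterns from the YAML 1.1 type definitions as PyYAML implements them and from the YAML 1.2 core schema. It is not a parser: it ignores .inf, hex, timestamps, null, and exponent floats, and the resolve name is an invention for this example.

```python
import re

# YAML 1.1 (as implemented by PyYAML): many boolean spellings, leading-zero octal.
YAML11_BOOL = re.compile(
    r"^(?:yes|Yes|YES|no|No|NO|true|True|TRUE|false|False|FALSE|on|On|ON|off|Off|OFF)$"
)
YAML11_OCTAL = re.compile(r"^[-+]?0[0-7_]+$")
YAML11_DECIMAL = re.compile(r"^[-+]?(?:0|[1-9][0-9_]*)$")

# YAML 1.2 core schema: only true/false, octal needs the 0o prefix.
YAML12_BOOL = re.compile(r"^(?:true|True|TRUE|false|False|FALSE)$")
YAML12_OCTAL = re.compile(r"^0o[0-7]+$")
YAML12_DECIMAL = re.compile(r"^[-+]?[0-9]+$")

# Simplified float pattern shared by both branches (no exponents, no .inf/.nan).
SIMPLE_FLOAT = re.compile(r"^[-+]?[0-9]+\.[0-9]*$")


def resolve(text: str, version: str = "1.1"):
    """Return the value a plain scalar would resolve to under the simplified rules."""
    if version == "1.1":
        if YAML11_BOOL.match(text):
            return text.lower() in {"yes", "true", "on"}
        if YAML11_OCTAL.match(text):
            return int(text.replace("_", ""), 8)
        if YAML11_DECIMAL.match(text):
            return int(text.replace("_", ""))
    else:
        if YAML12_BOOL.match(text):
            return text.lower() == "true"
        if YAML12_OCTAL.match(text):
            return int(text[2:], 8)
        if YAML12_DECIMAL.match(text):
            return int(text)
    if SIMPLE_FLOAT.match(text):
        return float(text)
    return text  # anything unrecognised stays a string


for scalar in ["NO", "yes", "1.0", "010", "0800"]:
    print(scalar, repr(resolve(scalar, "1.1")), repr(resolve(scalar, "1.2")))
```

Under these rules NO and yes are booleans only in 1.1, 010 is 8 in 1.1 but 10 in 1.2, and 1.0 is a float in both. A value like 0800 is not valid octal, so the 1.1 rules leave it as a string while the 1.2 rules read it as 800.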
JSON and TOML: What You See Is What You Get
In JSON, "NO" is always a string, 42 is always a number, and true is always a boolean. There is no ambiguity. In TOML, the same is true: "NO" is a string, 42 is an integer, and true is a boolean. Both formats trade convenience for safety.
The practical impact is significant. A YAML config file that works correctly in development can silently produce different data in production if the values happen to match one of YAML's implicit type patterns. JSON and TOML eliminate this entire category of bugs.
Ecosystem and Tooling Comparison
JSON Ecosystem
- Parsers: Built into the standard library of many languages (JavaScript, Python, Go, Ruby, PHP). Others, such as Rust, Java, and C++, rely on widely used third-party libraries.
- Schema validation: JSON Schema is a mature, widely-adopted standard. SchemaStore.org provides schemas for hundreds of formats.
- Editors: Universal syntax highlighting, formatting, and validation in every editor and IDE.
- CLI tools: jq is the gold standard for querying and transforming JSON from the command line.
YAML Ecosystem
- Parsers: Available in all major languages. Python: PyYAML, ruamel.yaml. JavaScript: js-yaml. Go: gopkg.in/yaml.v3. Rust: serde_yaml.
- Schema validation: JSON Schema works (YAML is a superset of JSON). Kubernetes uses CRDs and OpenAPI schemas. yamllint provides linting.
- Editors: Strong support in all editors. VS Code's YAML extension (Red Hat) provides schema-driven autocomplete for Kubernetes, Docker Compose, GitHub Actions, and more.
- CLI tools: yq provides jq-like querying for YAML. yamllint catches formatting issues.
TOML Ecosystem
- Parsers: Python 3.11+ has tomllib in stdlib. Rust: toml crate. Go: BurntSushi/toml. JavaScript: smol-toml, @iarna/toml. Java: toml4j.
- Schema validation: taplo supports JSON Schema for TOML. SchemaStore.org has schemas for Cargo.toml, pyproject.toml, and others.
- Editors: VS Code's "Even Better TOML" (taplo) extension provides formatting, validation, and schema-driven autocomplete. JetBrains has built-in TOML support.
- CLI tools: taplo provides formatting, validation, and linting. toml-cli provides command-line querying.
When to Use Each Format: A Decision Guide
Choosing a configuration format is not about which is "best." It is about which one fits your specific constraints. Use this decision framework.
Use JSON when:
- The file is primarily machine-generated and machine-consumed (API responses, lock files, serialized state)
- You need maximum interoperability across languages and platforms
- The file is part of a data pipeline, not a configuration that humans edit
- You are working with web APIs, NoSQL databases, or browser storage
- The toolchain mandates it (TypeScript, ESLint, Prettier, package.json)
Use YAML when:
- The configuration has deep nesting (4+ levels) and readability matters
- You need anchors and references to reduce repetition in large configs
- The ecosystem mandates it (Kubernetes, Docker Compose, GitHub Actions, Ansible)
- You need multi-document support in a single file
- The team is already experienced with YAML and knows to quote ambiguous values
Use TOML when:
- The configuration has moderate nesting (1-3 levels)
- Humans will frequently read and edit the file
- You want explicit typing without implicit coercion surprises
- The file is an application config, package manifest, or tool settings
- The ecosystem supports it (Rust, Python, Hugo, Netlify)
- You value simplicity and safety over maximum expressiveness
Quick Reference Cheat Sheet
| Scenario | Recommended Format |
|---|---|
| REST API response | JSON |
| Package lock file | JSON |
| TypeScript / ESLint / Prettier config | JSON |
| Kubernetes manifest | YAML |
| CI/CD pipeline (GitHub Actions, GitLab CI) | YAML |
| Docker Compose | YAML |
| Ansible playbook | YAML |
| Rust package manifest | TOML |
| Python project config (pyproject.toml) | TOML |
| Application settings file | TOML |
| Static site generator config | TOML |
Common Mistakes to Avoid
Here are the most common mistakes developers make with each format, compiled from years of Stack Overflow questions and production incidents.
JSON Mistakes
// WRONG: trailing comma (parse error)
{
"name": "myapp",
"version": "1.0", <-- trailing comma
}
// WRONG: single quotes (parse error)
{ 'name': 'myapp' }
// WRONG: comments (parse error)
{
// This is not valid JSON
"name": "myapp"
}
// WRONG: unquoted keys (parse error)
{ name: "myapp" }
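Each of these mistakes can be confirmed with a strict parser. The sketch below feeds simplified versions of the four snippets to Python's standard json module and collects the ones it rejects. The labels and the BAD_DOCS name are just for this example.

```python
import json

# Simplified versions of the four invalid snippets above.
BAD_DOCS = {
    "trailing comma": '{"name": "myapp", "version": "1.0",}',
    "single quotes": "{'name': 'myapp'}",
    "comment": '{\n  // This is not valid JSON\n  "name": "myapp"\n}',
    "unquoted key": '{name: "myapp"}',
}

rejected = []
for label, text in BAD_DOCS.items():
    try:
        json.loads(text)
    except json.JSONDecodeError:
        # The strict parser refuses the document instead of guessing.
        rejected.append(label)

print(rejected)
```

Lenient supersets such as JSON5 or JSONC accept some of these, but a file that relies on them is no longer portable JSON.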
YAML Mistakes
# WRONG: using tabs for indentation (parse error)
server:
	host: localhost # Tab character = parse error
# WRONG: forgetting to quote special values
country: NO # Becomes boolean false
version: 1.0 # Becomes float 1.0
port: 07000 # Becomes integer 3584 (octal) in YAML 1.1
# WRONG: inconsistent indentation
server:
  host: localhost
   port: 8080 # 3 spaces instead of 2 = parse error here, unintended nesting in other layouts
# FIX: always quote ambiguous values
country: "NO"
version: "1.0"
port: 7000 # No leading zero
TOML Mistakes
# WRONG: bare string values (parse error)
name = hello # Must be "hello"
# WRONG: redefining a key (parse error)
port = 8080
port = 9090 # Duplicate key rejected
# WRONG: extending inline tables (parse error)
point = { x = 1, y = 2 }
point.z = 3 # Cannot add to inline table
# WRONG: mixing table and array-of-tables
[server]
host = "alpha"
[[server]] # Error: already defined as table
host = "beta"
Migration Tips: Converting Between Formats
Sometimes you need to convert configuration files from one format to another. Here are practical approaches.
JSON to YAML
# Using yq (mikefarah/yq v4 syntax; the YAML equivalent of jq)
yq -P config.json > config.yaml
# Using Python
python3 -c "
import json, yaml, sys
data = json.load(open('config.json'))
yaml.dump(data, sys.stdout, default_flow_style=False)
"
YAML to JSON
# Using yq (mikefarah/yq v4 syntax)
yq -o=json config.yaml > config.json
# Using Python
python3 -c "
import json, yaml, sys
data = yaml.safe_load(open('config.yaml'))
json.dump(data, sys.stdout, indent=2)
"
JSON to TOML / YAML to TOML
# Using Python (requires tomli_w: pip install tomli-w)
python3 -c "
import json, tomli_w, sys
data = json.load(open('config.json'))
tomli_w.dump(data, sys.stdout.buffer)
"
# Using taplo for formatting after conversion
taplo format config.toml
Important caveat: automatic conversion does not preserve comments (JSON has none to begin with). Conversion to TOML will typically fail if the data contains values TOML cannot represent, such as null. Deeply nested structures do convert, but the result is long, repetitive table headers.
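One possible way around the null limitation is to strip null values before handing the data to a TOML writer. The drop_nulls helper below is a hypothetical pre-processing step written for this article, not part of any library. Whether dropping a key is an acceptable stand-in for null depends on how your application treats missing keys.

```python
import json


def drop_nulls(value):
    """Recursively remove None values from dicts and lists."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        # Note: removing list items shifts the positions of later items.
        return [drop_nulls(v) for v in value if v is not None]
    return value


SOURCE = """
{
  "name": "myapp",
  "proxy": null,
  "server": {"host": "localhost", "tls": null},
  "origins": ["https://example.com", null]
}
"""

cleaned = drop_nulls(json.loads(SOURCE))
print(cleaned)
```

The cleaned dictionary can then be passed to a writer such as tomli_w.dump in place of the raw parsed data.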
Frequently Asked Questions
What is the difference between JSON, YAML, and TOML?
JSON is a strict data interchange format with no comments, designed for machine-to-machine communication. YAML is a human-friendly superset of JSON that uses indentation for nesting and supports comments, anchors, and multi-document files. TOML is a minimal configuration format that uses explicit key-value pairs with section headers, supports comments and native date types, and avoids indentation-based nesting. Each format excels in different contexts: JSON for APIs, YAML for complex infrastructure config, and TOML for application configuration files.
Should I use YAML or JSON for configuration files?
If humans will regularly read and edit the file, YAML is generally better than JSON because it supports comments and is less verbose. However, YAML's indentation sensitivity and implicit type coercion can cause subtle bugs. If the configuration is simple (1-2 levels of nesting), consider TOML instead. If the file is machine-generated and machine-consumed (like API responses or package-lock.json), JSON is the better choice due to its strict, unambiguous parsing.
Why does YAML have a reputation for being dangerous?
YAML's implicit type coercion is the primary source of problems. In YAML 1.1, unquoted values like yes, no, on, and off are automatically converted to booleans, so the country code NO (Norway) becomes boolean false. Unquoted version strings like 1.0 become floats, and numbers with a leading zero like 0777 are read as octal. The indentation-sensitive syntax also means a single misplaced space can change the data structure without causing a parse error. Separately, parsers that allow arbitrary object instantiation have led to security vulnerabilities when loading untrusted input. These issues have caused real production incidents.
When should I use TOML instead of YAML?
Use TOML when your configuration has moderate nesting (1-3 levels), you want explicit typing without surprises, and the file will be hand-edited by developers. TOML is ideal for application settings, package manifests (like Cargo.toml and pyproject.toml), and tool configuration. Use YAML instead when you need deep nesting (4+ levels), anchors and references to reduce repetition, or when your ecosystem mandates it (Kubernetes, GitHub Actions, Ansible).
Conclusion
There is no single "best" configuration format. JSON, YAML, and TOML each solve different problems, and the right choice depends on your specific context.
JSON is the right choice when you need universal interoperability and the file is primarily consumed by machines. Its strict syntax and ubiquitous parser support make it the default for APIs, data interchange, and toolchain configuration. Accept its limitations (no comments, verbose syntax) when those limitations do not matter because humans rarely edit the file.
YAML is the right choice when you need expressive power for deeply nested, complex configurations. Kubernetes manifests, CI/CD pipelines, and infrastructure-as-code tools chose YAML because its indentation-based nesting handles 5-10 levels of depth elegantly. Accept the complexity tradeoff (implicit typing, indentation sensitivity) and mitigate it with linting, quoting discipline, and YAML 1.2-compliant parsers.
TOML is the right choice when humans are the primary audience and the configuration is moderately complex. Package manifests, application settings, and tool configuration benefit from TOML's explicit typing, comment support, and short specification. Its adoption by Rust, Python, Hugo, and Netlify validates this positioning. Accept that deep nesting becomes verbose and that you will need external tooling for value reuse.
In practice, most developers use all three formats across different projects. The key insight is not to pick a favorite, but to understand the strengths and limitations of each so you can choose deliberately rather than by default.