YAML ↔ JSON
What Is YAML?
YAML (YAML Ain't Markup Language) is a human-friendly data serialization format. Originally released in 2001, it was designed to be easy to read and write by hand — making it the go-to choice for configuration files across the software industry. Kubernetes manifests, Docker Compose files, Ansible playbooks, GitHub Actions workflows, and Spring Boot configurations all use YAML.
YAML relies on indentation (spaces, never tabs) to denote structure rather than brackets or braces. This makes well-written YAML visually clean and scannable, but it also means that a whitespace error can change a document's structure or make it fail to parse. As of version 1.2, YAML is designed as a superset of JSON, so a valid JSON document can generally be read by a YAML 1.2 parser.
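As a rough check of the superset claim, the sketch below feeds the same JSON text to Python's standard `json` module and to PyYAML's `safe_load`. PyYAML is a third-party package and an assumption here (the text above names no specific library), and the sample document is made up for illustration.

```python
import json

import yaml  # third-party: PyYAML, assumed installed (pip install pyyaml)

# Hypothetical sample document; keys and values are illustrative only.
json_text = '{"name": "web", "replicas": 3, "ports": [80, 443], "debug": false, "owner": null}'

from_json = json.loads(json_text)
from_yaml = yaml.safe_load(json_text)

# Both parsers should build the same Python structure from this text.
print(from_json == from_yaml)
```

PyYAML follows YAML 1.1 rather than 1.2, so a few JSON edge cases can still differ. For example, an exponent float written without a decimal point, such as `1e3`, comes back as a string.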
What Is JSON?
JSON (JavaScript Object Notation) is a lightweight data interchange format defined in RFC 8259 and ECMA-404. Despite its name, JSON is language-independent, and virtually every programming language supports it through its standard library or a mature package. It has become the dominant format for REST APIs, web application communication, and structured data storage.
JSON's syntax is strict and unambiguous: data is represented as objects (key-value pairs in curly braces), arrays (ordered values in square brackets), strings (double-quoted), numbers, booleans (true/false), and null. There are no comments, no trailing commas, and no alternative quoting styles. This rigidity makes JSON trivial to parse programmatically but sometimes tedious to write and maintain by hand.
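To make the strictness concrete, here is a small sketch using only Python's standard `json` module. Each of the three invalid inputs breaks one of the rules listed above; the sample strings are invented for illustration.

```python
import json

# Each snippet breaks one JSON rule: a comment, a trailing comma, single quotes.
invalid_documents = [
    '{"port": 8080 // default port\n}',
    '{"ports": [80, 443,]}',
    "{'port': 8080}",
]

rejected = []
for text in invalid_documents:
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        rejected.append(err.msg)  # keep the parser's explanation

# The strictly valid form parses without complaint.
valid = json.loads('{"port": 8080, "ports": [80, 443]}')
print(len(rejected), valid)
```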
Key Differences Between YAML and JSON
Comments: YAML supports comments with #. JSON has no comment syntax at all. This single difference is a major reason teams choose YAML for configuration files, because inline documentation is invaluable when explaining why a setting exists.
Readability: YAML's indentation-based syntax is generally easier for humans to scan, especially for deeply nested structures. JSON's bracket-heavy syntax is more compact on a single line but harder to read when pretty-printed with deep nesting.
Data types: JSON supports strings, numbers, booleans, null, arrays, and objects. YAML covers all of these, and many parsers also recognize dates and timestamps as native types. YAML also performs implicit type coercion, which can be surprising (see gotchas below).
Multi-line strings: YAML offers block scalars (| for literal, > for folded) that make embedding multi-line text like SQL queries, scripts, or templates natural. In JSON, multi-line content must be escaped as a single string with \n characters.
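The two block scalar styles can be compared with a short sketch. It assumes PyYAML (a third-party package not named in the text above) and uses made-up keys and text.

```python
import json

import yaml  # third-party: PyYAML, assumed installed

yaml_text = """\
query: |
  SELECT id, name
  FROM users
summary: >
  This text is folded
  into a single line.
"""

data = yaml.safe_load(yaml_text)

# Literal style (|) keeps line breaks; folded style (>) joins lines with spaces.
print(repr(data["query"]))
print(repr(data["summary"]))

# In JSON the same content becomes one string with escaped newlines.
as_json = json.dumps(data)
print(as_json)
```

Both styles keep a single trailing newline by default, which shows up as a final `\n` in the JSON output.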
Strictness: JSON's rigid syntax leaves little room for interpretation, so parsers across languages produce largely consistent results. YAML's flexibility introduces ambiguity, and the same document can be read differently by different parser implementations or YAML versions.
When to Use Which
Use YAML for configuration files. When humans need to read, write, and maintain a file regularly — and when comments are valuable — YAML is the better choice. DevOps configurations, CI/CD pipelines, application settings, and infrastructure-as-code templates all benefit from YAML's readability.
Use JSON for data interchange. When machines are the primary consumers (API request/response bodies, message queue payloads, structured logs, and data exports), JSON's strict syntax and universal parser support make it the safer choice. Its lack of ambiguity makes it far less likely that sender and receiver will disagree about the data.
Use JSON for programmatic generation. If your code generates the data (rather than a human writing it), JSON is simpler to produce correctly. Generating valid YAML programmatically requires careful handling of indentation, quoting, and special characters.
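A minimal sketch of programmatic generation with Python's standard `json` module follows. The settings dictionary is hypothetical and deliberately includes quotes, a newline, a `#`, and the string `NO` to show that the serializer handles escaping without any special care from the caller.

```python
import json

# Hypothetical settings built by code rather than written by hand.
settings = {
    "service": "billing",
    "motd": 'He said "hi"\nthen left: #not-a-comment',
    "country": "NO",
    "tags": ["a", "b"],
}

# json.dumps takes care of quoting and escaping.
encoded = json.dumps(settings, indent=2)
print(encoded)

# Reading the text back yields the original structure.
decoded = json.loads(encoded)
```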
Common YAML Gotchas
The Norway problem: In YAML 1.1, the bare value NO is interpreted as boolean false (along with no, off, n, and others). This famously broke country code lists, where Norway (NO) became false. The YAML 1.2 core schema recognizes only true and false as booleans, but many parsers still default to 1.1 behavior. Always quote strings that could be misinterpreted: "NO", "yes", "on", "off".
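The effect is easy to reproduce with a parser that follows YAML 1.1 rules. The sketch below assumes PyYAML, which is one such parser; the country list is illustrative.

```python
import yaml  # third-party: PyYAML, assumed installed; it applies YAML 1.1 rules

unquoted = yaml.safe_load("countries: [GB, NO, SE]")
quoted = yaml.safe_load('countries: [GB, "NO", SE]')

print(unquoted["countries"])  # ['GB', False, 'SE']
print(quoted["countries"])    # ['GB', 'NO', 'SE']
```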
Indentation errors: YAML requires spaces (not tabs) for indentation, and the number of spaces matters. A single extra or missing space can silently change the document's structure, for example by moving a key from a nested mapping up to its parent. Use an editor with YAML support and visible whitespace to catch these issues early.
Implicit typing: Values like 1.0, 0x1A, 1_000, and 2024-01-15 can be parsed as a float, a hexadecimal integer, an integer, and a date respectively. The exact set of recognized forms depends on the parser and the YAML version it implements. If you intend these as strings, wrap them in quotes. This is a common source of bugs when converting between YAML and JSON.
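The following sketch shows how those four values come out of PyYAML (an assumed, third-party parser that follows YAML 1.1) and what happens when the result is serialized to JSON. The key names are invented for illustration.

```python
import datetime
import json

import yaml  # third-party: PyYAML, assumed installed

yaml_text = """\
version: 1.0
mask: 0x1A
limit: 1_000
released: 2024-01-15
quoted_version: "1.0"
"""

data = yaml.safe_load(yaml_text)
for key, value in data.items():
    print(key, type(value).__name__, repr(value))

# json.dumps cannot serialize date objects by default; default=str is one
# simple way to fall back to their string form.
as_json = json.dumps(data, default=str)
print(as_json)
```

Only the quoted value stays a string. The date survives the trip to JSON only because of the `default=str` fallback, and it arrives there as plain text.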
Duplicate keys: The YAML specification requires keys within a mapping to be unique, but many parsers accept duplicates silently and keep the last value. Most linters flag this as an error. Without one, a setting can appear to be configured while actually being overridden later in the file.
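One way to surface duplicates at load time is a loader subclass that checks keys before building each mapping. The sketch below assumes PyYAML; `UniqueKeyLoader` is a hypothetical name, not part of the library, and the check assumes keys are hashable scalars.

```python
import yaml  # third-party: PyYAML, assumed installed

yaml_text = """\
timeout: 30
retries: 3
timeout: 60
"""

# The stock safe loader keeps the last value without any warning.
silently_loaded = yaml.safe_load(yaml_text)
print(silently_loaded)


class UniqueKeyLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that rejects repeated mapping keys."""

    def construct_mapping(self, node, deep=False):
        seen = set()
        for key_node, _value_node in node.value:
            if key_node.tag == "tag:yaml.org,2002:merge":
                continue  # leave "<<" merge keys to the parent class
            key = self.construct_object(key_node, deep=deep)
            if key in seen:
                raise yaml.constructor.ConstructorError(
                    "while constructing a mapping",
                    node.start_mark,
                    f"found duplicate key {key!r}",
                    key_node.start_mark,
                )
            seen.add(key)
        return super().construct_mapping(node, deep=deep)


try:
    yaml.load(yaml_text, Loader=UniqueKeyLoader)
    duplicate_error = None
except yaml.YAMLError as err:
    duplicate_error = str(err)

print(duplicate_error)
```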
YAML Anchors and Aliases
YAML supports anchors (&name) and aliases (*name) for reusing content within a document. Define an anchor on a value, then reference it elsewhere with an alias to avoid repetition:
    defaults: &defaults
      timeout: 30
      retries: 3
    production:
      <<: *defaults
      timeout: 60
The << merge key combined with an alias inserts all key-value pairs from the anchor, which can then be selectively overridden. This is widely used in CI/CD configurations (e.g., GitLab CI) to share settings across jobs. The merge key is a YAML 1.1 extension rather than part of the YAML 1.2 core, so support depends on the parser. Note that anchors and aliases have no equivalent in JSON: when converting YAML with anchors to JSON, the referenced values are expanded inline.
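The expansion can be seen by loading the example above and printing it as JSON. This sketch assumes PyYAML, whose safe loader supports the merge key.

```python
import json

import yaml  # third-party: PyYAML, assumed installed

yaml_text = """\
defaults: &defaults
  timeout: 30
  retries: 3
production:
  <<: *defaults
  timeout: 60
"""

config = yaml.safe_load(yaml_text)

# The alias is gone: production carries its own copy of the merged values,
# with timeout overridden to 60.
expanded = json.dumps(config, indent=2)
print(expanded)
```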
JSON Schema
JSON Schema is a vocabulary for annotating and validating JSON documents. It defines the expected structure, data types, required fields, value constraints, and relationships within a JSON document. Editors and IDEs use JSON Schema to provide autocompletion and validation for configuration files — including YAML files, since most YAML can be represented as JSON.
When working with APIs or configuration formats, check if a JSON Schema is available. Validating your data against the schema before deployment catches structural errors that would otherwise surface as runtime failures.
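As an illustration of validating YAML against a schema before deployment, the sketch below assumes two third-party packages, PyYAML and `jsonschema`. The schema and the two configuration snippets are hypothetical.

```python
import jsonschema  # third-party: jsonschema, assumed installed
import yaml  # third-party: PyYAML, assumed installed

# Hypothetical schema for a small settings file.
schema = {
    "type": "object",
    "properties": {
        "timeout": {"type": "integer", "minimum": 1},
        "retries": {"type": "integer", "minimum": 0},
    },
    "required": ["timeout"],
    "additionalProperties": False,
}

good = yaml.safe_load("timeout: 30\nretries: 3")
bad = yaml.safe_load('timeout: "30"')  # quoted, so it loads as a string

# A valid document passes silently.
jsonschema.validate(instance=good, schema=schema)

# An invalid one raises ValidationError with a readable message.
try:
    jsonschema.validate(instance=bad, schema=schema)
    problem = None
except jsonschema.ValidationError as err:
    problem = err.message

print(problem)
```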
Related Tools
These tools pair well with YAML/JSON conversion:
- JSON Formatter — Pretty-print, minify, and validate JSON output from the converter.
- Text/JSON Diff — Compare two YAML or JSON files to find differences after conversion or editing.
Frequently Asked Questions
Is every JSON document valid YAML?
Yes, for practical purposes. Since YAML 1.2, JSON is an official subset of YAML, and any conforming YAML 1.2 parser can parse a JSON document. The reverse is not true: many YAML documents use features (comments, anchors, block scalars) that have no direct JSON equivalent.
Why does my YAML convert to unexpected JSON values?
YAML's implicit typing is the usual culprit. Bare values like yes, no, on, off, and strings that look like numbers or dates are coerced into booleans, integers, floats, or date objects. Always quote ambiguous values in YAML to ensure they remain strings.
Can YAML represent everything JSON can?
Yes, and more. YAML supports all JSON data types plus additional types like dates, timestamps, binary data, and ordered mappings. The only caveat is that some YAML features (anchors, comments) are lost when converting to JSON, since JSON has no way to represent them.
Which YAML version should I target?
YAML 1.2 (released in 2009, with the 1.2.2 revision published in 2021) is the current specification and fixes many of the implicit typing issues from YAML 1.1. However, some widely-used parsers (notably PyYAML) still default to YAML 1.1 behavior. Check your parser's documentation and explicitly configure 1.2 mode if available, or choose a parser that implements 1.2.