What Is a Text Diff?
A diff (short for "difference") is the result of comparing two pieces of text to find what changed between them. The output highlights added, removed, and modified lines so you can quickly understand the delta without reading both documents in full. Diff tools are foundational in software engineering — every code review, merge conflict, and version control operation relies on them.
At its core, a diff algorithm solves the longest common subsequence (LCS) problem: given two sequences of lines, find the longest subsequence present in both. Everything outside that subsequence is a change. The way those changes are grouped and presented determines the quality of the diff output.
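As a rough illustration of the LCS idea, the sketch below computes one longest common subsequence of two lists of lines with the classic dynamic-programming table. The function name lcs_lines and the sample lines are made up for this example, and real diff tools generally use faster algorithms than this one.

```python
def lcs_lines(a, b):
    """Return one longest common subsequence of two lists of lines."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the top-left corner to recover the common lines.
    common = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # a[i] is not part of the LCS: it was removed
        else:
            j += 1  # b[j] is not part of the LCS: it was added
    return common


old = ["alpha", "beta", "gamma", "delta"]
new = ["alpha", "gamma", "delta", "epsilon"]
print(lcs_lines(old, new))
```

With these sample inputs, "beta" falls outside the common subsequence on the old side and "epsilon" on the new side, so those are the two lines a diff would report as removed and added.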
How Diff Algorithms Work
Line-by-line comparison is the simplest approach. Each line in Text A is compared to each line in Text B. While conceptually straightforward, a naive implementation takes O(n×m) time for texts of n and m lines, which becomes slow for large files.
The Myers diff algorithm, published by Eugene Myers in 1986, is the basis of the default algorithm in Git and GNU diff and of many other diff tools. It finds the shortest edit script (the minimum number of line insertions and deletions needed to transform one text into the other) in O(n×d) time, where n is the combined length of the two texts and d is the size of that edit script. For texts that are mostly similar (the common case), d is small, so the search is fast.
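The greedy search at the heart of the Myers algorithm can be sketched in a few lines of Python. This version only computes d, the size of the shortest edit script. It does not record the script itself, and it omits the refinements real tools add. The function name myers_edit_distance is an assumption for this example.

```python
def myers_edit_distance(a, b):
    """Return the minimum number of insertions plus deletions turning a into b."""
    n, m = len(a), len(b)
    max_d = n + m
    # furthest[k] is the largest x reached so far on diagonal k, where k = x - y.
    furthest = {1: 0}
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            # Extend whichever neighbouring diagonal has progressed further.
            if k == -d or (k != d and furthest[k - 1] < furthest[k + 1]):
                x = furthest[k + 1]       # move down: take a line from b (insertion)
            else:
                x = furthest[k - 1] + 1   # move right: drop a line from a (deletion)
            y = x - k
            # Follow the "snake": matching lines cost nothing.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            furthest[k] = x
            if x >= n and y >= m:
                return d
    return max_d


print(myers_edit_distance(list("ABCABBA"), list("CBABAC")))
```

Because the outer loop stops as soon as d edits are enough, two nearly identical inputs finish after only a few rounds regardless of how long they are.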
Patience diff is a variation that often produces more human-readable output. Instead of optimizing purely for the shortest edit script, it first matches lines that appear exactly once in each text, using those as anchors to align the surrounding content. This tends to avoid the confusing results Myers sometimes produces when functions or blocks are moved or inserted and common lines such as closing braces get matched across unrelated blocks.
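The anchor-finding step described above can be sketched as follows. This is only the first stage: a full patience diff would then recurse into the gaps between anchors. The function name patience_anchors and the sample input are assumptions for illustration.

```python
from bisect import bisect_left
from collections import Counter


def patience_anchors(a, b):
    """Return (index_in_a, index_in_b) pairs for lines usable as anchors."""
    count_a, count_b = Counter(a), Counter(b)
    # Position in b of every line that occurs exactly once in b.
    pos_b = {line: j for j, line in enumerate(b) if count_b[line] == 1}
    # Lines unique in both texts, in order of appearance in a.
    pairs = [(i, pos_b[line]) for i, line in enumerate(a)
             if count_a[line] == 1 and line in pos_b]

    # Longest increasing subsequence of the b-indices keeps anchors in order.
    tail_vals = []             # smallest b-index ending a run of each length
    tail_idx = []              # index into pairs for each entry of tail_vals
    prev = [None] * len(pairs)
    for idx, (_, j) in enumerate(pairs):
        k = bisect_left(tail_vals, j)
        if k == len(tail_vals):
            tail_vals.append(j)
            tail_idx.append(idx)
        else:
            tail_vals[k] = j
            tail_idx[k] = idx
        prev[idx] = tail_idx[k - 1] if k > 0 else None

    # Walk the back-pointers from the end of the longest run.
    anchors = []
    idx = tail_idx[-1] if tail_idx else None
    while idx is not None:
        anchors.append(pairs[idx])
        idx = prev[idx]
    return anchors[::-1]


old = ["int a() {", "  return 1;", "}", "int b() {", "  return 2;", "}"]
new = ["int a() {", "  return 1;", "}", "int c() {", "  return 3;", "}",
       "int b() {", "  return 2;", "}"]
print(patience_anchors(old, new))
```

In this sample the closing brace appears several times, so it is never used as an anchor. The unique function signatures and bodies pin the alignment instead.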
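Unified diff format is the standard textual representation. Two header lines starting with --- and +++ name the old and new files. Each hunk then opens with a header of the form @@ -start,count +start,count @@ giving the line ranges it covers in the old and new text. Within a hunk, lines prefixed with - were removed, lines prefixed with + were added, and lines prefixed with a space are unchanged context.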
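To see the format in practice, Python's standard difflib module can produce a unified diff from two lists of lines. The file names and config lines below are invented for the example.

```python
import difflib

old = ["host: localhost", "port: 8080", "debug: false"]
new = ["host: localhost", "port: 9090", "debug: false", "timeout: 30"]

# lineterm="" because the input lines carry no trailing newlines.
diff_lines = list(difflib.unified_diff(
    old, new,
    fromfile="a/config.yaml",
    tofile="b/config.yaml",
    lineterm="",
))
print("\n".join(diff_lines))
```

The printed diff starts with the two file header lines, followed by a single hunk headed @@ -1,3 +1,4 @@. Inside it, the port line appears once with a - prefix and once with a + prefix, and the new timeout line appears with a + prefix.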
Common Use Cases
Code review — Before merging a pull request, reviewers examine the diff to understand every change. A clean, minimal diff makes review faster and catches bugs earlier.
Configuration comparison — When a service behaves differently between environments, comparing the two config files side by side instantly reveals the discrepancy. This is especially useful for YAML, JSON, and TOML configs where a single changed value can alter behavior.
Merge conflict resolution — Version control systems flag conflicting changes from different branches. Understanding the diff of each branch against the common ancestor helps you craft the correct resolution.
Data validation — Comparing expected vs. actual output in test results, API responses, or database exports helps pinpoint exactly which fields changed. This is far more efficient than manual inspection of large data sets.
Documentation auditing — Tracking changes to legal documents, API specifications, or runbooks ensures that modifications are intentional and reviewed.
JSON Structural Diff vs. Text Diff
When comparing JSON documents, a plain text diff can be misleading. Reordering object keys, changing indentation, or reformatting arrays produces large text diffs even when the data is semantically identical. A structural diff understands the JSON data model and compares values rather than characters.
For example, {"a":1,"b":2} and {"b":2,"a":1} are semantically identical: a structural diff reports no changes, while a text diff flags the line as changed (or, once the documents are pretty-printed, every key line). When debugging API responses or config files, always consider whether you need a text-level or structural-level comparison.
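A structural comparison can be sketched as a recursive walk over the parsed values. The code below is an illustrative sketch, not a description of how this tool is implemented. The function name json_diff, the MISSING marker, and the $.key path notation are all assumptions made for the example.

```python
import json

MISSING = object()  # marker for a key present on only one side


def json_diff(old, new, path="$"):
    """Yield (path, old_value, new_value) for every value that differs."""
    if isinstance(old, dict) and isinstance(new, dict):
        # Key order is irrelevant: visit the union of keys in sorted order.
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}.{key}"
            if key not in old:
                yield (child, MISSING, new[key])
            elif key not in new:
                yield (child, old[key], MISSING)
            else:
                yield from json_diff(old[key], new[key], child)
    elif isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        # Arrays are ordered, so compare element by element.
        for i, (o, n) in enumerate(zip(old, new)):
            yield from json_diff(o, n, f"{path}[{i}]")
    elif type(old) is not type(new) or old != new:
        # The type check stops Python from treating true and 1 as equal;
        # as a side effect, 1 and 1.0 are reported as different.
        yield (path, old, new)


a = json.loads('{"a":1,"b":2}')
b = json.loads('{"b":2,"a":1}')
c = json.loads('{"b":3,"a":1,"c":[1,2]}')

print(list(json_diff(a, b)))
for change in json_diff(a, c):
    print(change[0])
```

Arrays of different lengths are reported here as a single changed value. Aligning their elements would need a sequence diff such as the ones described earlier.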
This tool supports both approaches. Paste raw text for a line-by-line comparison, or paste JSON to see differences in the context of the document structure. For best results with JSON, make sure both inputs are consistently formatted — use the JSON Formatter first.
Tips for Effective Diffing
Normalize formatting first. Inconsistent whitespace, trailing newlines, and different indentation styles create noise in diffs. Run both texts through a formatter before comparing to focus on meaningful changes.
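A minimal normalization pass might look like the sketch below. The function name normalize and the choice to expand tabs to four spaces are assumptions; adjust them to match your project's conventions.

```python
def normalize(text, tab_width=4):
    """Return the text as a list of lines with whitespace noise removed."""
    # splitlines() accepts LF, CRLF, and CR (plus a few rarer separators).
    lines = [line.rstrip().expandtabs(tab_width) for line in text.splitlines()]
    # Drop blank lines at the end so a missing final newline is not a change.
    while lines and lines[-1] == "":
        lines.pop()
    return lines


windows_text = "a  \r\n\tb\r\n\r\n"
unix_text = "a\n    b\n"
print(normalize(windows_text) == normalize(unix_text))
```

Feeding both normalized lists into the comparison keeps the diff focused on content rather than line endings or trailing whitespace.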
Sort keys in structured data. If you're comparing JSON or YAML objects, sorting keys alphabetically eliminates false positives from key reordering.
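For JSON, one way to do this is to parse each document and re-serialize it with sorted keys and fixed indentation before comparing. The helper name canonical_json is an assumption for this sketch.

```python
import json


def canonical_json(text):
    """Re-serialize JSON with sorted keys and two-space indentation."""
    return json.dumps(json.loads(text), sort_keys=True, indent=2,
                      ensure_ascii=False)


left = canonical_json('{"b":2,"a":1}')
right = canonical_json('{"a": 1,\n "b": 2}')
print(left == right)
print(left)
```

Once both inputs are in this canonical form, an ordinary line-by-line diff only shows keys and values that really differ.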
Use context to orient yourself. When reviewing large diffs, the unchanged lines around each hunk provide essential context. If you see a change but don't understand why it matters, look at the surrounding lines to identify which function or section it belongs to.
Break large changes into smaller diffs. A 2,000-line diff is nearly impossible to review effectively. If you're making sweeping changes, split them into logical chunks and compare each one independently.
Ignore irrelevant changes. Timestamps, auto-generated IDs, and build hashes change on every run. Strip or mask these fields before diffing to avoid cluttering the output.
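Masking can be done with a couple of regular expressions before the texts are compared. The patterns below cover ISO 8601 style timestamps and UUIDs only, and the placeholder strings and function name are assumptions. Other volatile fields would need their own patterns.

```python
import re

# e.g. 2024-01-15T10:30:00Z or 2024-01-16T08:05:12.123+02:00
ISO_TIMESTAMP = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?"
)
# e.g. 123e4567-e89b-12d3-a456-426614174000
UUID = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def mask_volatile(text):
    """Replace timestamps and UUIDs with fixed placeholders."""
    text = ISO_TIMESTAMP.sub("<TIMESTAMP>", text)
    return UUID.sub("<UUID>", text)


run1 = "id=123e4567-e89b-12d3-a456-426614174000 at 2024-01-15T10:30:00Z status=ok"
run2 = "id=9b2e0c1a-7f3d-4c8e-a1b2-0123456789ab at 2024-01-16T08:05:12.123+02:00 status=ok"
print(mask_volatile(run1))
print(mask_volatile(run1) == mask_volatile(run2))
```

After masking, the two sample log lines compare as identical, so only genuine differences such as a changed status would show up in the diff.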
Related Tools
Use these tools alongside the diff checker for a smoother workflow:
- JSON Formatter — Pretty-print JSON before diffing to get cleaner, more readable comparisons.
- YAML ↔ JSON Converter — Convert between YAML and JSON so you can diff configs in a consistent format.
Frequently Asked Questions
Does the diff run in my browser or on a server?
The comparison runs entirely in your browser using JavaScript. Your text is never sent to a server, so sensitive configuration files and proprietary code stay on your machine. If you do paste credentials, it is still good practice to rotate them afterward.
What happens with very large files?
The browser-based diff handles most files up to several megabytes. For extremely large files (tens of thousands of lines), performance may degrade. In those cases, consider diffing smaller sections or using a command-line tool like diff or git diff.
Why do I see changes in lines that look identical?
Invisible characters — trailing spaces, tabs vs. spaces, different line endings (LF vs. CRLF) — can cause lines to differ even when they look the same. Copy both lines into a hex viewer or Base64-encode them to reveal hidden differences.
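If you prefer a script to a hex viewer, the sketch below locates the first position where two lines differ and names the characters involved. The function names are assumptions for this example; the character names come from Python's standard unicodedata module.

```python
import unicodedata


def describe(ch):
    """Return a readable label such as 'U+00A0 NO-BREAK SPACE'."""
    if ch is None:
        return "<end of line>"
    # Control characters such as tab have no Unicode name, hence the fallback.
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'unnamed or control character')}"


def first_difference(a, b):
    """Return (index, label_in_a, label_in_b) for the first mismatch, or None."""
    for i in range(max(len(a), len(b))):
        ca = a[i] if i < len(a) else None
        cb = b[i] if i < len(b) else None
        if ca != cb:
            return i, describe(ca), describe(cb)
    return None


print(first_difference("key: value", "key:\u00a0value"))
print(first_difference("abc", "abc "))
```

The first call reports a regular space against a no-break space at index 4, and the second reports a trailing space that one line has and the other lacks.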
Can I compare binary files?
This tool is designed for text-based content. Binary files (images, compiled executables, archives) will not produce meaningful diffs. Use specialized tools like hexdump combined with diff, or format-specific comparison tools for binary data.