URL Encode/Decode

What Is URL Encoding?

URL encoding, formally known as percent-encoding, is a mechanism for representing characters in a URI (Uniform Resource Identifier) that would otherwise have special meaning or are not allowed in certain positions. It works by replacing each unsafe character with a % sign followed by two hexadecimal digits for every byte of the character's UTF-8 encoding, so an ASCII character becomes one %XX triplet and a multi-byte character becomes several.

For example, a space character becomes %20, an ampersand & becomes %26, and a forward slash / becomes %2F. Without this encoding, these characters would be interpreted as structural delimiters within the URL rather than as literal data.
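As a quick illustration of the mappings above, the following sketch runs the three example characters through JavaScript's built-in encodeURIComponent(). The variable names are illustrative only.

```javascript
// Encode the three example characters with the built-in function.
const sampleChars = [" ", "&", "/"];
const encodedChars = sampleChars.map((ch) => encodeURIComponent(ch));
console.log(encodedChars); // [ '%20', '%26', '%2F' ]

// Decoding reverses the mapping.
const decodedChars = encodedChars.map((s) => decodeURIComponent(s));
console.log(decodedChars); // [ ' ', '&', '/' ]
```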

Percent-encoding is defined in RFC 3986 (URIs) and is one of the most fundamental standards of the web. Every time you click a link containing a search query, fill out a form, or make an API call, percent-encoding is happening behind the scenes to ensure your data arrives intact.

Why Percent-Encoding Is Needed

URIs have a strict grammar. Certain characters are reserved because they carry structural meaning: : terminates the scheme, / separates path segments, ? begins the query string, # begins the fragment, and & conventionally separates query parameters.

If you want to include any of these characters as data rather than structure—say, a search query that contains an ampersand like "Tom & Jerry"—you must percent-encode them. Otherwise, the URL parser will misinterpret & as a parameter separator and split your query incorrectly.
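The "Tom & Jerry" case can be reproduced with the standard URL class. This is a minimal sketch; the host example.com and the parameter name q are placeholders, not part of any real API.

```javascript
// Hypothetical search URL; example.com and "q" are placeholders.
const rawQuery = "Tom & Jerry";

// Unencoded: the parser treats "&" as a separator and cuts the value short.
const broken = new URL("https://example.com/search?q=" + rawQuery);
console.log(JSON.stringify(broken.searchParams.get("q"))); // "Tom "

// Encoded: the ampersand survives as data.
const fixed = new URL("https://example.com/search?q=" + encodeURIComponent(rawQuery));
console.log(JSON.stringify(fixed.searchParams.get("q"))); // "Tom & Jerry"
```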

Non-ASCII characters (accented letters, CJK characters, emoji) must also be percent-encoded. The character is first converted to its UTF-8 byte sequence, and then each byte is percent-encoded individually. For example, the euro sign € (U+20AC) is UTF-8 bytes E2 82 AC, which becomes %E2%82%AC.
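The two-step process (UTF-8 bytes first, then one %XX per byte) can be sketched by hand. The helper name percentEncodeAllBytes is made up for this example, and unlike a real encoder it escapes every byte, including unreserved ASCII, purely to make the byte-level mapping visible.

```javascript
// Illustrative helper: percent-encode every UTF-8 byte of a string.
function percentEncodeAllBytes(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 byte sequence
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

console.log(percentEncodeAllBytes("€")); // %E2%82%AC
console.log(encodeURIComponent("€"));    // %E2%82%AC (the built-in agrees)
```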

RFC 3986 Reserved and Unreserved Characters

RFC 3986 divides characters into three categories:

Unreserved characters are safe to use anywhere in a URI without encoding: A–Z a–z 0–9 - . _ ~. They never need to be percent-encoded. Encoding them is technically valid (RFC 3986 treats %41 and A as equivalent), but it is discouraged because it produces needlessly unreadable URLs and some consumers do not normalize the two forms.

Reserved characters have special meaning in URIs: : / ? # [ ] @ ! $ & ' ( ) * + , ; =. When used for their reserved purpose (e.g., ? to start a query), they must not be encoded. When used as data within a component, they must be encoded.

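All other characters (spaces, control characters, non-ASCII characters, and ASCII symbols such as " < > \ ^ ` { | }) must always be percent-encoded before inclusion in a URI. A literal % used as data must itself be encoded as %25.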

The most commonly encoded characters in practice are: space (%20), & (%26), = (%3D), + (%2B), / (%2F), and # (%23).
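To encode a component so that only the RFC 3986 unreserved set is left untouched, one common approach is to post-process encodeURIComponent(), which on its own leaves ! ' ( ) * unencoded. The function name encodeRfc3986Component below is an assumption for this sketch, not a built-in.

```javascript
// Illustrative helper: escape everything except RFC 3986 unreserved characters.
function encodeRfc3986Component(text) {
  return encodeURIComponent(text).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeRfc3986Component("a-b.c_d~e")); // a-b.c_d~e (unreserved, unchanged)
console.log(encodeRfc3986Component(" &=+/#"));    // %20%26%3D%2B%2F%23
console.log(encodeRfc3986Component("!'()*"));     // %21%27%28%29%2A
```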

encodeURI vs encodeURIComponent

JavaScript provides two built-in functions for URL encoding, and confusing them is one of the most common sources of bugs in web development:

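encodeURI() encodes a full URI. It leaves the reserved characters that can be structurally significant (such as : / ? # & = + @) untouched, along with the unreserved characters, and encodes only characters that are not legal anywhere in a URI, such as spaces and non-ASCII characters. It does encode a literal %, so running it on an already-encoded URL double-encodes it. Use it when you have a complete, not-yet-encoded URL and want to make it safe.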

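encodeURIComponent() encodes a single URI component (a query parameter value, a path segment, etc.). It encodes everything except letters, digits, and - _ . ! ~ * ' ( ), so &, =, /, ?, #, and + are all escaped. Note that ! * ' ( ) are reserved in RFC 3986 but are left unencoded; escape them separately if a strict consumer requires it. Use this function when you're building a URL piece by piece and need to encode individual parameter values.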

The classic mistake: using encodeURI() on a query parameter value. If the value contains & or =, they won't be encoded and will be interpreted as parameter delimiters, corrupting your query string. Always use encodeURIComponent() for individual values.
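The difference is easiest to see side by side. In this sketch the value string and the example.com URL are arbitrary placeholders.

```javascript
// A parameter value that contains delimiter characters.
const paramValue = "a&b=c/d";

const viaEncodeURI = encodeURI(paramValue);
const viaComponent = encodeURIComponent(paramValue);
console.log(viaEncodeURI); // a&b=c/d        (delimiters left intact: wrong for a value)
console.log(viaComponent); // a%26b%3Dc%2Fd  (delimiters escaped: safe as a value)

// encodeURI is meant for a whole URL, where the delimiters must survive.
const wholeUrl = encodeURI("https://example.com/a b?x=1&y=2");
console.log(wholeUrl); // https://example.com/a%20b?x=1&y=2
```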

Form Encoding vs Path Encoding

There are two subtly different encoding schemes commonly encountered on the web:

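application/x-www-form-urlencoded is the default encoding for HTML form submissions: it is used for the query string of GET forms (GET being the default method) and for the request body of POST forms unless another enctype is specified. It encodes spaces as + (not %20) and percent-encodes most other special characters. This is also the format produced by JavaScript's URLSearchParams and, by default, PHP's http_build_query(). jQuery's $.param() produced + for spaces before version 3.0; later versions emit %20.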

RFC 3986 percent-encoding encodes spaces as %20 and is used in path segments, fragment identifiers, and modern API query strings. JavaScript's encodeURIComponent() uses this scheme.

The difference between + and %20 for spaces is a frequent source of confusion. When working with form data, use +. When building URL paths or working with RESTful APIs, use %20. Most server frameworks handle both, but some edge cases (especially in signed URLs or HMAC validation) are sensitive to this distinction.
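The two schemes can be compared directly in JavaScript, where URLSearchParams follows the form-encoding rules and encodeURIComponent() follows the percent-encoding rules. The sample strings below are arbitrary.

```javascript
const phrase = "hello world+1";

// Form encoding: space becomes "+", a literal "+" becomes %2B.
const formEncoded = new URLSearchParams({ q: phrase }).toString();
console.log(formEncoded); // q=hello+world%2B1

// Percent-encoding: space becomes %20.
const percentEncoded = encodeURIComponent(phrase);
console.log(percentEncoded); // hello%20world%2B1

// Decoding differs as well: only the form decoder turns "+" into a space.
const formDecoded = new URLSearchParams("q=a+b").get("q");
const percentDecoded = decodeURIComponent("a+b");
console.log(JSON.stringify(formDecoded));    // "a b"
console.log(JSON.stringify(percentDecoded)); // "a+b"
```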

Best Practices & Tips

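Never double-encode. If a string is already percent-encoded, encoding it again turns %20 into %2520. Track whether each value is raw or encoded at the point where it enters your code, and encode exactly once when building the URL. Decoding "just in case" is not a safe fallback: a raw value that contains a literal % (for example, "100%") may fail to decode or be silently altered.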
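A short sketch of how double encoding arises, and of one hazard when decoding input of unknown state. The sample values are arbitrary.

```javascript
const encodedOnce = encodeURIComponent("a b");        // "a%20b"
const encodedTwice = encodeURIComponent(encodedOnce); // the "%" itself is escaped
console.log(encodedTwice); // a%2520b

// A single decode only peels off one layer.
const peeled = decodeURIComponent(encodedTwice);
console.log(peeled); // a%20b

// Decoding a raw string that merely contains "%" can throw.
let blindDecodeFailed = false;
try {
  decodeURIComponent("100%");
} catch (err) {
  blindDecodeFailed = err instanceof URIError;
}
console.log(blindDecodeFailed); // true
```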

Use URL builder APIs. Instead of concatenating strings manually, use your language's URL builder: URL and URLSearchParams in JavaScript, urllib.parse in Python, http_build_query() in PHP. These handle encoding correctly and prevent injection bugs.
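With the JavaScript builder APIs, a query string might be assembled as follows. The host and parameter names are placeholders. Note that URLSearchParams serializes with form encoding, so spaces appear as +.

```javascript
// Placeholder endpoint; only the standard URL API is used.
const endpoint = new URL("https://example.com/search");
endpoint.searchParams.set("q", "Tom & Jerry");
endpoint.searchParams.set("lang", "en");

console.log(endpoint.href);
// https://example.com/search?q=Tom+%26+Jerry&lang=en

// Reading the value back yields the original, unencoded string.
console.log(endpoint.searchParams.get("q")); // Tom & Jerry
```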

Be careful with path segments. A / in a path segment must be encoded as %2F to avoid being treated as a path separator. This matters when file names or identifiers contain slashes.
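One way to keep slashes inside a segment from being read as separators is to encode each segment on its own and only then join them. The helper buildPath and the sample file name are hypothetical.

```javascript
// Illustrative helper: encode each path segment separately, then join with "/".
function buildPath(...segments) {
  return "/" + segments.map((seg) => encodeURIComponent(seg)).join("/");
}

const filePath = buildPath("files", "reports/2024 Q1.pdf");
console.log(filePath); // /files/reports%2F2024%20Q1.pdf
```

Whether a server maps %2F back to a literal slash within one segment depends on its routing configuration, so this is worth verifying against the actual backend.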

Test with edge cases. Characters like +, %, non-ASCII characters, and emoji are the most likely to cause encoding issues. Include them in your test data when building URL-handling code.

Related Tools

URL encoding often comes up alongside other encoding and formatting tasks:

  • Base64 Encode/Decode — When you need to embed binary data in a URL, you typically Base64-encode it first, then percent-encode the result (or use Base64url to avoid the extra step).
  • JSON Formatter — Format and validate JSON payloads that are often passed as URL query parameters in API debugging.
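The Base64url variant mentioned above swaps + and / for - and _ and usually drops the = padding, so the result needs no percent-encoding. A minimal sketch, assuming a runtime that provides the global btoa() (browsers and recent Node.js versions); the helper name toBase64Url is made up for this example.

```javascript
// Illustrative helper: convert raw bytes to unpadded Base64url text.
function toBase64Url(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary)
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

const rawBytes = new Uint8Array([0xfb, 0xff]);
const plainBase64 = btoa(String.fromCharCode(...rawBytes));
console.log(plainBase64);                     // +/8=
console.log(encodeURIComponent(plainBase64)); // %2B%2F8%3D
console.log(toBase64Url(rawBytes));           // -_8
```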

Frequently Asked Questions

Is my data sent to a server?

No. All encoding and decoding happens locally in your browser. Nothing is transmitted over the network.

Should spaces be encoded as %20 or +?

It depends on context. In URL path segments and modern APIs, use %20. In HTML form submissions (application/x-www-form-urlencoded), spaces are represented as +. Most servers accept both in query strings, but be precise when generating signed URLs or HMACs.

What is double encoding and how do I avoid it?

Double encoding occurs when already-encoded text is encoded again, turning %20 into %2520. To avoid it, always work with decoded data and encode only once, at the point where you construct the final URL.

Why do some characters appear encoded in my browser's address bar?

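Browsers often display URLs with decoded Unicode characters for readability but send the encoded form over the network. Characters that would be ambiguous or unsafe to display, such as spaces and reserved delimiters used as data, typically stay in their encoded form. Depending on the browser and its settings, copying from the address bar may give you either form. To see exactly what was sent, check the request URL in your browser's network inspector.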

Can I use URL encoding for security purposes?

No. URL encoding is a transport encoding, not a security measure. It doesn't hide or protect data—it just ensures safe transmission through URL syntax. For security, use proper encryption and authentication.