Web Development, Vue, RFC Standards

Building a Base64 Encode Tool

How we built a browser-native Base64 encoder with full UTF-8 Unicode support, Base64URL mode for JWT and OAuth tokens, and zero network requests — all using TextEncoder and btoa().

Why Base64 Encoding Matters

Base64 is one of those foundational encoding schemes that every developer encounters — often without fully understanding what it does or why it exists. HTTP Basic Authentication credentials travel as Base64 strings. JWT tokens are Base64URL-encoded in three segments. Data URIs embed images directly in HTML using Base64. Email attachments have been Base64-encoded for decades via MIME.

The idea is simple: take arbitrary binary data (bytes with values 0–255) and express it using only 64 printable ASCII characters that are safe to transmit over any medium originally designed for text. The trade-off is a ~33% size increase — every 3 bytes become 4 characters.
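The overhead follows directly from the 3-to-4 ratio. A minimal sketch of the arithmetic, where base64Length is a hypothetical helper name and not part of the tool:

```typescript
// Hypothetical helper: length of padded Base64 output for a given byte count.
// Every group of up to 3 input bytes becomes exactly 4 output characters.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

console.log(base64Length(3));    // 4
console.log(base64Length(5));    // 8
console.log(base64Length(3000)); // 4000, about 33% larger than the input
```

Because of padding, short inputs pay proportionally more: a single byte still produces four characters.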

We built this tool to make Base64 encoding accessible and correct, handling the common pitfall that catches many developers: the UTF-8 problem.

The RFC 4648 Alphabet

RFC 4648 defines the Base64 encoding scheme used across the internet. The standard alphabet is 64 characters:

  • A–Z (26 uppercase letters, values 0–25)
  • a–z (26 lowercase letters, values 26–51)
  • 0–9 (10 digits, values 52–61)
  • + (value 62) and / (value 63)
  • = (padding character, not in the alphabet itself)

The encoding works by grouping the input bytes in sets of 3. Each group of 3 bytes (24 bits) is split into four 6-bit groups, and each 6-bit value (0–63) maps to the corresponding Base64 character. When the input length is not a multiple of 3, one or two = padding characters are appended to the output.

For example, the word Man (3 bytes: 0x4D 0x61 0x6E) encodes to TWFu — four characters, no padding needed.
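To make the grouping concrete, here is an illustrative hand-rolled encoder. It is a sketch for understanding the bit layout only; the names ALPHABET and encodeBytes are assumptions, and the tool itself relies on btoa() rather than code like this.

```typescript
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Illustrative encoder: walks the input 3 bytes at a time.
function encodeBytes(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = bytes.length - i; // 1, 2, or at least 3
    const b0 = bytes[i];
    const b1 = remaining > 1 ? bytes[i + 1] : 0;
    const b2 = remaining > 2 ? bytes[i + 2] : 0;
    // Pack the bytes into one 24-bit number, then slice four 6-bit values.
    const group = (b0 << 16) | (b1 << 8) | b2;
    out += ALPHABET[(group >> 18) & 63];
    out += ALPHABET[(group >> 12) & 63];
    // Missing input bytes are represented by "=" padding.
    out += remaining > 1 ? ALPHABET[(group >> 6) & 63] : "=";
    out += remaining > 2 ? ALPHABET[group & 63] : "=";
  }
  return out;
}

console.log(encodeBytes(new Uint8Array([0x4d, 0x61, 0x6e]))); // "TWFu"
console.log(encodeBytes(new Uint8Array([0x4d, 0x61])));       // "TWE="
console.log(encodeBytes(new Uint8Array([0x4d])));             // "TQ=="
```

The last two calls show the padding rule: two leftover bytes yield one =, and one leftover byte yields two.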

The UTF-8 Problem: Why btoa() Alone Breaks on Emoji

Here is the trap that catches developers using JavaScript's built-in btoa() directly:

btoa("hello"); // ✅ 'aGVsbG8='
btoa("é"); // ⚠️ '6Q==' (the single Latin-1 byte 0xE9; a UTF-8 encoder gives 'w6k=')
btoa("😀"); // ❌ DOMException: The string to be encoded contains characters outside of the Latin1 range.

btoa() expects a "binary string": a string in which every character has a code point between 0 and 255, each standing for one byte. JavaScript strings, however, are sequences of UTF-16 code units. The character é (U+00E9) fits within that range, so btoa() accepts it, but it encodes the single byte 0xE9 rather than the two-byte UTF-8 representation (0xC3 0xA9). The result silently disagrees with what a UTF-8 decoder expects. An emoji like 😀 (U+1F600) is stored as a surrogate pair in UTF-16, so both code units fall outside the range and btoa() throws; in UTF-8 it occupies four bytes (0xF0 0x9F 0x98 0x80).

If you want to Base64-encode the actual UTF-8 byte representation of a string — which is what servers and other languages expect — you need to encode the UTF-8 bytes first, then pass the resulting binary string to btoa().

The TextEncoder Solution

The native TextEncoder API is the clean answer. It converts any JavaScript string to its UTF-8 byte representation as a Uint8Array:

function encodeBase64(text: string): string {
  if (!text) return "";
  const bytes = new TextEncoder().encode(text);
  // Array.from() avoids the stack overflow that spread causes on large inputs;
  // TextEncoder handles all Unicode, including emoji and other characters
  // outside the Basic Multilingual Plane, by emitting UTF-8 bytes
  const binaryString = Array.from(bytes, (byte) =>
    String.fromCharCode(byte),
  ).join("");
  return btoa(binaryString);
}

Two implementation details are worth noting:

Why Array.from() and not spread? The spread pattern String.fromCharCode(...bytes) works but throws a RangeError: Maximum call stack size exceeded for large byte arrays because it passes all array elements as individual function arguments. Array.from() with a mapping function processes elements one at a time and has no such limit.
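If building one single-character string per byte turns out to be slow for very large inputs, a chunked conversion is a common alternative. The sketch below is not what the tool uses; bytesToBinaryString and CHUNK_SIZE are hypothetical names, and the chunk size is an assumed value chosen to stay well below typical engine argument limits.

```typescript
// Assumed chunk size: small enough that spreading one chunk into
// String.fromCharCode stays far below engine argument limits.
const CHUNK_SIZE = 0x8000;

// Alternative sketch: convert bytes to a binary string chunk by chunk.
function bytesToBinaryString(bytes: Uint8Array): string {
  const parts: string[] = [];
  for (let i = 0; i < bytes.length; i += CHUNK_SIZE) {
    // subarray() creates a view on the same buffer, not a copy.
    const chunk = bytes.subarray(i, i + CHUNK_SIZE);
    parts.push(String.fromCharCode(...chunk));
  }
  return parts.join("");
}

const big = new Uint8Array(200000).fill(65); // 200,000 bytes of "A"
console.log(bytesToBinaryString(big).length); // 200000
```

The spread is safe here only because each call receives at most CHUNK_SIZE arguments; the result can be passed to btoa() exactly like the Array.from() version.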

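Why not encodeURIComponent? A common older approach is btoa(unescape(encodeURIComponent(text))). It produces the same output: encodeURIComponent percent-encodes the string's UTF-8 bytes, and unescape turns each %XX sequence back into a single character, leaving a binary string of raw UTF-8 bytes. The drawbacks lie elsewhere: unescape() is a deprecated legacy function, the round trip through percent-encoding obscures the intent, and encodeURIComponent throws a URIError on a lone surrogate. TextEncoder performs the UTF-8 conversion directly.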

With TextEncoder, the output matches what any RFC 4648 encoder produces for the same UTF-8 bytes:

Input   UTF-8 bytes    Base64 output
Man     4D 61 6E       TWFu
é       C3 A9          w6k=
😀      F0 9F 98 80    8J+YgA==

Base64URL Mode for JWT and OAuth Tokens

Standard Base64 uses + and / in its alphabet. These characters are unsafe in URLs: + is interpreted as a space in form-encoded query strings, and / conflicts with path separators. For JWT headers and payloads, OAuth tokens, and URL parameters, RFC 4648 §5 defines Base64URL, a variant that substitutes - for + and _ for /. The RFC allows the = padding to be dropped when the data length is known implicitly, and JWT (RFC 7515) always omits it, so the tool's Base64URL mode strips it as well.

The implementation is three substitutions applied after standard encoding:

if (urlSafeMode) {
  result = result.replace(/\+/g, "-").replace(/\//g, "_").replace(/=/g, "");
}

For example, the sequence >>? encodes as Pj4/ in standard mode, because its final 6-bit group has the value 63, and as Pj4_ in Base64URL mode. Likewise, >>> encodes as Pj4+ in standard mode and Pj4- in Base64URL mode.
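Putting the pieces together, a complete encoder with a URL-safe switch might look like the sketch below. The names toBase64 and encode, and the urlSafe parameter standing in for the tool's toggle, are assumptions for illustration. The padding regex is anchored to the end of the string, which is equivalent here because = only ever appears there.

```typescript
// Standard encoder, restating the TextEncoder approach described earlier.
function toBase64(text: string): string {
  const bytes = new TextEncoder().encode(text);
  const binary = Array.from(bytes, (b) => String.fromCharCode(b)).join("");
  return btoa(binary);
}

// Hypothetical wrapper: urlSafe mirrors the tool's URL-safe toggle.
function encode(text: string, urlSafe: boolean): string {
  const standard = toBase64(text);
  if (!urlSafe) return standard;
  return standard
    .replace(/\+/g, "-") // 62nd character
    .replace(/\//g, "_") // 63rd character
    .replace(/=+$/, ""); // trailing padding
}

console.log(encode(">>?", false)); // "Pj4/"
console.log(encode(">>?", true));  // "Pj4_"
console.log(encode("😀", false));  // "8J+YgA=="
console.log(encode("😀", true));   // "8J-YgA"
```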

The tool's URL-safe toggle triggers an immediate re-encode (no debounce) so the output switches instantaneously when the user flips the checkbox. The input watcher uses a 150ms debounce to avoid unnecessary recomputation on every keystroke.
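The split between debounced input and immediate toggles can be sketched without any framework. The tool itself uses useDebounceFn from @vueuse/core inside Vue watchers; the debounce helper, the state object, and the onInput and onToggle handlers below are dependency-free stand-ins with hypothetical names.

```typescript
// Stand-in for useDebounceFn: runs fn only after `ms` without a new call.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  ms: number,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, ms);
  };
}

// Hypothetical component state.
const state = { input: "", urlSafe: false, output: "" };

function recompute(): void {
  const bytes = new TextEncoder().encode(state.input);
  const b64 = btoa(Array.from(bytes, (b) => String.fromCharCode(b)).join(""));
  state.output = state.urlSafe
    ? b64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "")
    : b64;
}

const recomputeDebounced = debounce(recompute, 150);

// Typing: wait for a 150 ms pause before re-encoding.
function onInput(text: string): void {
  state.input = text;
  recomputeDebounced();
}

// Toggle: re-encode right away so the output switches without delay.
function onToggle(checked: boolean): void {
  state.urlSafe = checked;
  recompute();
}
```

A toggle is a single discrete event, so there is nothing to coalesce and delaying it would only add perceived lag. Keystrokes arrive in bursts, which is where debouncing saves work.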

The "Not Encryption" Misconception

Base64 comes up frequently in discussions about security, often because developers see it in authentication headers and assume it provides some protection. It does not. Base64 is reversible in one function call:

atob("SGVsbG8sIFdvcmxkIQ=="); // 'Hello, World!'

Anyone who can see a Base64-encoded string can decode it instantly. The Authorization: Basic header in HTTP carries credentials encoded only as Base64 — the entire scheme relies on HTTPS for confidentiality. Similarly, JWT payloads are Base64URL-encoded and fully readable by anyone who intercepts the token; the signature provides integrity, not confidentiality.
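To illustrate how little stands between a token and its contents, here is a sketch that reads a JWT-style payload. The token is made up for this example and its third segment is a placeholder rather than a real signature; decodeBase64Url is a hypothetical helper name.

```typescript
// Reverse Base64URL: restore the standard alphabet and padding, then decode
// the bytes as UTF-8. No key or secret is involved at any point.
function decodeBase64Url(segment: string): string {
  const standard = segment.replace(/-/g, "+").replace(/_/g, "/");
  const padded = standard + "=".repeat((4 - (standard.length % 4)) % 4);
  const binary = atob(padded);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// Made-up token for illustration: header.payload.signature
const token = "eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoiYWxpY2UifQ.signature-placeholder";
const payloadSegment = token.split(".")[1];

console.log(decodeBase64Url(payloadSegment)); // {"user":"alice"}
```

Verifying the signature requires the key; reading the payload does not.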

The tool includes a prominent disclaimer directly on the page: "Base64 is an encoding scheme, not encryption. Anyone with the encoded string can decode it instantly."

Privacy-First Architecture: Zero Network Requests

The tool follows the same privacy architecture as the URL Encoder, URL Decoder, and HTML Encoder: all processing happens in the browser using native web APIs. No text you enter leaves your device.

The implementation uses only:

  • TextEncoder (Web API, no import needed)
  • btoa() (global, no import needed)
  • navigator.clipboard.writeText() (Clipboard API, local operation)
  • @vueuse/core's useDebounceFn (client-side utility)

There are no fetch calls, no axios imports, no analytics scripts, no WebSocket connections. You can verify this yourself by opening the browser's Network tab and encoding any text — you will see zero outbound requests.

Key Takeaways

  • btoa() alone is not sufficient for Unicode — you must convert to UTF-8 bytes first using TextEncoder
  • Array.from() is safer than spread for converting byte arrays to binary strings in JavaScript
  • Base64URL is three substitutions after standard encoding: +→-, /→_, remove =
  • Base64 is encoding, not encryption; credentials stay confidential in transit because of HTTPS, not because Base64 protects them
  • Debounce input, immediate-respond to toggles — a pattern that keeps tools responsive without wasteful recomputation
  • TextEncoder is a first-class Web API available in all modern browsers with no polyfill needed

Code Cultivation • © 2026