Base64
Base64 is a group of similar encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. The term Base64 originates from a specific MIME content transfer encoding.

Base64 encoding schemes are commonly used when binary data needs to be stored and transferred over media that are designed to deal with textual data. This ensures that the data remains intact without modification during transport. Base64 is commonly used in a number of applications, including email via MIME and storing complex data in XML.

Design

The particular set of 64 characters chosen to represent the 64 digit values varies between implementations. The general rule is to choose a set of 64 characters that is both part of a subset common to most encodings and printable. This combination leaves the data unlikely to be modified in transit through information systems, such as email, that were traditionally not 8-bit clean. For example, MIME's Base64 implementation uses A–Z, a–z, and 0–9 for the first 62 values. Other variations, usually derived from Base64, share this property but differ in the symbols chosen for the last two values; an example is UTF-7.

The earliest instances of this type of encoding were created for dialup communication between systems running the same OS (e.g. uuencode for UNIX, and BinHex for the TRS-80, later adapted for the Macintosh) and could therefore make more assumptions about what characters were safe to use. For instance, uuencode uses uppercase letters, digits, and many punctuation characters, but no lowercase, since UNIX was sometimes used with terminals that did not support distinct letter case.

Examples

A quote from Thomas Hobbes' Leviathan:
Man is distinguished, not only by his reason, but by this singular passion from other animals, which is a lust of the mind, that by a perseverance of delight in the continued and indefatigable generation of knowledge, exceeds the short vehemence of any carnal pleasure.


This quote, represented as a byte sequence of 8-bit-padded ASCII characters, is encoded in MIME's Base64 scheme as follows:

TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlz
IHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2Yg
dGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGlu
dWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRo
ZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4=



In the above quote, the encoded value of Man is TWFu. Encoded in ASCII, the characters M, a, and n are stored as the bytes 77, 97, and 110, which are the 8-bit binary values 01001101, 01100001, and 01101110. These three bytes are joined together into a 24-bit buffer, producing 010011010110000101101110. Groups of 6 bits (6 bits have a maximum of 64 different binary values) are converted into numbers (in this case, there are 4 numbers in the 24-bit string), which are then converted to their corresponding Base64 characters.
Text content     M        a        n
ASCII            77       97       110
Bit pattern      010011   010110   000101   101110
Index            19       22       5        46
Base64-encoded   T        W        F        u


As this example illustrates, Base64 encoding converts 3 octets into 4 encoded characters.
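The arithmetic above can be traced in a few lines of Python. This is a minimal sketch rather than a full encoder: the name `encode_group` is hypothetical, and the function handles exactly one complete three-byte group using the MIME alphabet.

```python
import string

# The 64-character MIME alphabet: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_group(group: bytes) -> str:
    """Encode exactly three bytes into four Base64 characters."""
    if len(group) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Join the three bytes into one 24-bit buffer, first byte most significant.
    buffer = int.from_bytes(group, "big")
    # Read the buffer six bits at a time, most significant group first.
    indices = [(buffer >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)

print(encode_group(b"Man"))  # TWFu
```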

The Base64 index table:
Value Char   Value Char   Value Char   Value Char
0 A 16 Q 32 g 48 w
1 B 17 R 33 h 49 x
2 C 18 S 34 i 50 y
3 D 19 T 35 j 51 z
4 E 20 U 36 k 52 0
5 F 21 V 37 l 53 1
6 G 22 W 38 m 54 2
7 H 23 X 39 n 55 3
8 I 24 Y 40 o 56 4
9 J 25 Z 41 p 57 5
10 K 26 a 42 q 58 6
11 L 27 b 43 r 59 7
12 M 28 c 44 s 60 8
13 N 29 d 45 t 61 9
14 O 30 e 46 u 62 +
15 P 31 f 47 v 63 /


When the number of bytes to encode is not divisible by 3 (that is, when there are only one or two bytes of input for the last block), the following action is performed:
Add extra bytes with value zero so there are three bytes, and perform the conversion to Base64. If there was only one significant input byte, only the first two Base64 digits are kept, and if there were two significant input bytes, the first three Base64 digits are kept. '=' characters may be added to make the last block contain four Base64 characters.

As a result:
When the last group contains one octet the four least significant bits of the final 6-bit block are set to zero, and when the last group contains two octets the two least significant bits of the final 6-bit block are set to zero.
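The zero-fill and truncate rule for the last block can be sketched as follows. The function name `b64encode_sketch` is an assumption made for illustration; in practice a library routine such as Python's `base64.b64encode` would normally be used instead.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def b64encode_sketch(data: bytes) -> str:
    """Encode bytes as padded Base64 using the MIME alphabet."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)             # 0, 1 or 2 bytes short of a full group
        padded = chunk + b"\x00" * missing   # add zero bytes to fill the group
        buffer = int.from_bytes(padded, "big")
        chars = [ALPHABET[(buffer >> s) & 0x3F] for s in (18, 12, 6, 0)]
        # Keep only the digits that carry input bits, then pad with '='.
        out.append("".join(chars[:4 - missing]) + "=" * missing)
    return "".join(out)

for sample in (b"f", b"fo", b"foo"):
    print(sample, b64encode_sketch(sample))
```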

Padding

The '==' sequence indicates that the last group contained only 1 byte, and '=' indicates that it contained 2 bytes. The example below illustrates how truncating the input of the whole of the above quote changes the output padding:
Input ends with: any carnal pleasure.   Output ends with: YW55IGNhcm5hbCBwbGVhc3VyZS4=
Input ends with: any carnal pleasure    Output ends with: YW55IGNhcm5hbCBwbGVhc3VyZQ==
Input ends with: any carnal pleasur     Output ends with: YW55IGNhcm5hbCBwbGVhc3Vy
Input ends with: any carnal pleasu      Output ends with: YW55IGNhcm5hbCBwbGVhc3U=
Input ends with: any carnal pleas       Output ends with: YW55IGNhcm5hbCBwbGVhcw==
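The effect of truncation on padding can be reproduced with Python's standard `base64` module. In this sketch only the tail of the quote is encoded, which gives the same endings because the text before it fills whole three-byte groups; the helper name `encode_with_pad_count` is hypothetical.

```python
import base64

def encode_with_pad_count(data: bytes) -> tuple[str, int]:
    """Return the Base64 text and the number of trailing '=' characters."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded, len(encoded) - len(encoded.rstrip("="))

tail = b"any carnal pleasure."
for cut in range(5):
    piece = tail[:len(tail) - cut]   # drop 0 to 4 trailing bytes
    encoded, pads = encode_with_pad_count(piece)
    print(piece.decode("ascii"), "->", encoded, f"({pads} padding)")
```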

The same characters will be encoded differently depending on their position within the three-octet group that is encoded to produce the four characters. For example:

The input: pleasure.   Encodes to: cGxlYXN1cmUu
The input: leasure.    Encodes to: bGVhc3VyZS4=
The input: easure.     Encodes to: ZWFzdXJlLg==
The input: asure.      Encodes to: YXN1cmUu
The input: sure.       Encodes to: c3VyZS4=

The number of output bytes per input byte is approximately 4/3 (33% overhead) and converges to that value for a large number of bytes. More specifically, given an input of n bytes, the output will be 4⌈n/3⌉ bytes long (n divided by 3, rounded up, times 4), including padding characters.
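The padded output length follows directly from the rule that every group of up to three input bytes yields four characters. A small sketch, with the hypothetical helper `encoded_length`, compares that formula with the standard library:

```python
import base64

def encoded_length(n: int) -> int:
    """Padded Base64 length for n input bytes: 4 * ceil(n / 3)."""
    return 4 * ((n + 2) // 3)

# Compare the formula with the standard library for a few input sizes.
for n in (0, 1, 2, 3, 4, 20, 1000):
    actual = len(base64.b64encode(bytes(n)))
    print(n, encoded_length(n), actual)
```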

From a theoretical point of view, the padding character is not needed, since the number of missing bytes can be calculated from the number of Base64 digits. Some implementations require the padding character, while others do not use it. One case where padding characters are required is when multiple Base64-encoded files are concatenated. The 2011 DEF-CON Capture the Flag (CTF) qualifiers contained a puzzle with a file of concatenated Base64-encoded files.

Decoding Base64 with padding

When decoding Base64 text, 4 characters are typically converted back to 3 bytes. The only exceptions are when padding characters exist. A single '=' indicates that the 4 characters will decode to only 2 bytes, while '==' indicates that the 4 characters will decode to a single byte. This example illustrates:

Encoded text ends with: YW55IGNhcm5hbCBwbGVhcw==    Padding '==': final group decodes to 1 character: any carnal pleas
Encoded text ends with: YW55IGNhcm5hbCBwbGVhc3U=    Padding '=': final group decodes to 2 characters: any carnal pleasu
Encoded text ends with: YW55IGNhcm5hbCBwbGVhc3Vy    No padding: final group decodes to 3 characters: any carnal pleasur
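The decoding rule can be sketched as below. This is an illustrative decoder under simplifying assumptions: it expects padded input with no line breaks, does not check that padding appears only in the final group, and the name `b64decode_sketch` is hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def b64decode_sketch(text: str) -> bytes:
    """Decode padded Base64 text that uses the MIME alphabet."""
    if len(text) % 4 != 0:
        raise ValueError("padded Base64 length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(text), 4):
        group = text[i:i + 4]
        pads = len(group) - len(group.rstrip("="))   # trailing '=' characters
        if pads > 2:
            raise ValueError("a group can carry at most two '=' characters")
        # Treat padding as zero-valued digits while rebuilding the 24-bit buffer.
        digits = group[:4 - pads] + "A" * pads
        buffer = 0
        for ch in digits:
            buffer = (buffer << 6) | INDEX[ch]
        # Drop one output byte per padding character.
        out += buffer.to_bytes(3, "big")[:3 - pads]
    return bytes(out)

print(b64decode_sketch("YW55IGNhcm5hbCBwbGVhcw=="))
```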

Variants summary table

Implementations may have some constraints on the alphabet used for representing some bit patterns. This notably concerns the last two characters used in the index table for index 62 and 63, and the character used for padding (which may be mandatory in some protocols, or removed in others). The table below summarizes these known variants, and links to the subsections below.
Variant | Char for index 62 | Char for index 63 | Pad char | Fixed encoded line length | Maximum encoded line length | Line separators | Characters outside alphabet | Line checksum
Original Base64 for Privacy-Enhanced Mail (PEM) (RFC 1421, deprecated) | + | / | = (mandatory) | Yes (except last line) | 64 | CR+LF | Forbidden | (none)
Base64 transfer encoding for MIME (RFC 2045) | + | / | = (mandatory) | No (variable) | 76 | CR+LF | Accepted (discarded) | (none)
Standard 'Base64' encoding for RFC 3548 or RFC 4648 | + | / | = (mandatory) | Yes (except last line) | 64 or 76 (only if line separators are specified and needed) | CR+LF (only if specified and needed) | Forbidden | (none)
'Radix-64' encoding for OpenPGP (RFC 4880) | + | / | = (mandatory) | No (variable) | 76 | CR+LF | Forbidden | 24-bit CRC (Radix-64-encoded, including one pad character)
Modified Base64 encoding for UTF-7 (RFC 1642, obsoleted) | + | / | (none) | No (variable) | (none) | (none) | Forbidden | (none)
Modified Base64 for filenames (non standard) | + | - | (none) | No (variable) | (filesystem limit, generally 255) | (none) | Forbidden | (none)
Base64 with URL and Filename Safe Alphabet (RFC 4648 'base64url' encoding) | - | _ | (optional, not recommended; if present, must be URL encoded as %3D) | No (variable) | (application-dependent) | (none) | Forbidden | (none)
Modified Base64 for XML name tokens (Nmtoken) | . | - | (none) | No (variable) | (XML parser-dependent) | (none) | Forbidden | (none)
Modified Base64 for XML identifiers (Name) | _ | : | (none) | No (variable) | (XML parser-dependent) | (none) | Forbidden | (none)
Modified Base64 for Program identifiers (variant 1, non standard) | _ | - | (none) | No (variable) | (language/system-dependent) | (none) | Forbidden | (none)
Modified Base64 for Program identifiers (variant 2, non standard) | . | _ | (none) | No (variable) | (language/system-dependent) | (none) | Forbidden | (none)
Modified Base64 for Regular expressions (non standard) | ! | - | (none) | No (variable) | (application-dependent) | (none) | Forbidden | (none)

Privacy-enhanced mail

The first known standardized use of the encoding now called MIME Base64 was in the Privacy-enhanced Electronic Mail (PEM) protocol, proposed by RFC 989 in 1987. PEM defines a "printable encoding" scheme that uses Base64 encoding to transform an arbitrary sequence of octets to a format that can be expressed in short lines of 6-bit characters, as required by transfer protocols such as SMTP.

The current version of PEM (specified in RFC 1421) uses a 64-character alphabet consisting of upper- and lower-case Roman alphabet characters (A–Z, a–z), the numerals (0–9), and the "+" and "/" symbols. The "=" symbol is also used as a special suffix code. The original specification, RFC 989, additionally used the "*" symbol to delimit encoded but unencrypted data within the output stream.

To convert data to PEM printable encoding, the first byte is placed in the most significant eight bits of a 24-bit buffer, the next in the middle eight, and the third in the least significant eight bits. If there are fewer than three bytes left to encode (or in total), the remaining buffer bits will be zero. The buffer is then used, six bits at a time, most significant first, as indices into the string "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/", and the indicated character is output.

The process is repeated on the remaining data until fewer than four octets remain. If three octets remain, they are processed normally. If fewer than three octets (24 bits) are remaining to encode, the input data is right-padded with zero bits to form an integral multiple of six bits.

After encoding the non-padded data, if two octets of the 24-bit buffer are padded-zeros, two "=" characters are appended to the output; if one octet of the 24-bit buffer is filled with padded-zeros, one "=" character is appended. This signals the decoder that the zero bits added due to padding should be excluded from the reconstructed data. This also guarantees that the encoded output length is a multiple of 4 bytes.

PEM requires that all encoded lines consist of exactly 64 printable characters, with the exception of the last line, which may contain fewer printable characters. Lines are delimited by whitespace characters according to local (platform-specific) conventions.
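The line-length rule can be illustrated with a short sketch that splits the encoded text into 64-character lines, leaving a shorter final line. It covers only the printable-encoding body, not any surrounding PEM framing, and the name `pem_body_lines` is hypothetical.

```python
import base64

def pem_body_lines(data: bytes, width: int = 64) -> list[str]:
    """Base64-encode data and split it into lines of `width` characters."""
    encoded = base64.b64encode(data).decode("ascii")
    # Every line is exactly `width` characters except possibly the last.
    return [encoded[i:i + width] for i in range(0, len(encoded), width)]

lines = pem_body_lines(bytes(range(100)))
print([len(line) for line in lines])
```

Joining the lines with whatever line delimiter the local platform uses, then removing the delimiters again before decoding, recovers the original octets.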

MIME

The MIME (Multipurpose Internet Mail Extensions) specification lists Base64 as one of two binary-to-text encoding schemes (the other being quoted-printable). MIME's Base64 encoding is based on that of the RFC 1421 version of PEM: it uses the same 64-character alphabet and encoding mechanism as PEM, and uses the "=" symbol for output padding in the same way, as described in RFC 1521.

MIME does not specify a fixed length for Base64-encoded lines, but it does specify a maximum line length of 76 characters. Additionally, it specifies that any extra-alphabetic characters must be ignored by a compliant decoder, although most implementations use a CR/LF newline pair to delimit encoded lines.

Thus, the actual length of MIME-compliant Base64-encoded binary data is usually about 137% of the original data length, though for very short messages the overhead can be a lot higher because of the overhead of the headers. Very roughly, the final size of Base64-encoded binary data is equal to 1.37 times the original data size + 814 bytes (for headers). In other words, you can approximate the size of the decoded data with this formula:
bytes = (string_length(encoded_string) - 814) / 1.37
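The 137% figure can be checked with a small calculation: a full 76-character line carries 57 input bytes and, with its CR/LF pair, occupies 78 bytes, giving 78/57, or about 1.37. The sketch below assumes full-width lines and ignores the headers entirely; `mime_base64` is a hypothetical helper, not a library function.

```python
import base64

def mime_base64(data: bytes, width: int = 76) -> bytes:
    """Base64-encode data into CR/LF-terminated lines of at most `width` characters."""
    encoded = base64.b64encode(data)
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return b"".join(line + b"\r\n" for line in lines)

original_size = 57 * 1000            # a whole number of full lines
body = mime_base64(bytes(original_size))
print(len(body) / original_size)     # equals 78 / 57
```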

UTF-7

UTF-7, described first in RFC 1642, which was later superseded by RFC 2152, introduced a system called modified Base64. This data encoding scheme is used to encode UTF-16 as ASCII characters for use in 7-bit transports such as SMTP. It is a variant of the Base64 encoding used in MIME.

The "Modified Base64" alphabet consists of the MIME Base64 alphabet, but does not use the "=" padding character. UTF-7 is intended for use in mail headers (defined in RFC 2047), and the "=" character is reserved in that context as the escape character for "quoted-printable" encoding. Modified Base64 simply omits the padding and ends immediately after the last Base64 digit containing useful bits, leaving up to three unused bits in the last Base64 digit.
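The Base64 step of this scheme can be sketched as follows: the text is converted to big-endian UTF-16, encoded with the MIME alphabet, and the padding is dropped. This covers only the Base64 step, not the rest of UTF-7, and the name `modified_base64` is an assumption for illustration.

```python
import base64

def modified_base64(text: str) -> str:
    """Encode text as UTF-16 (big-endian), then as Base64 without '=' padding."""
    raw = text.encode("utf-16-be")
    return base64.b64encode(raw).decode("ascii").rstrip("=")

print(modified_base64("\u00a3"))   # AKM
print(modified_base64("\u263a"))   # Jjo
```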

OpenPGP

OpenPGP, described in RFC 4880, defines Radix-64 encoding, also known as "ASCII Armor". Radix-64 is identical to the "Base64" encoding described by MIME, with the addition of an optional 24-bit CRC checksum. The checksum is calculated on the input data before encoding; the checksum is then encoded with the same Base64 algorithm and, using an additional "=" symbol as separator, appended to the encoded output data.
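A sketch of the checksum step is shown below. The initial value and generator polynomial are the CRC-24 constants given in RFC 4880; the function names are hypothetical, and the sketch produces only the checksum line, not a complete armored message.

```python
import base64

CRC24_INIT = 0xB704CE    # initial CRC value from RFC 4880
CRC24_POLY = 0x1864CFB   # generator polynomial from RFC 4880

def crc24(data: bytes) -> int:
    """Compute the 24-bit CRC over the raw (unencoded) input data."""
    crc = CRC24_INIT
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= CRC24_POLY
    return crc & 0xFFFFFF

def checksum_line(data: bytes) -> str:
    """Return '=' followed by the Base64 encoding of the three CRC bytes."""
    crc_bytes = crc24(data).to_bytes(3, "big")
    return "=" + base64.b64encode(crc_bytes).decode("ascii")

print(checksum_line(b""))  # =twTO
```

Because the CRC is exactly three bytes, its Base64 form is always four characters with no padding of its own.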

RFC 3548

RFC 3548, entitled The Base16, Base32, and Base64 Data Encodings, is an informational (non-normative) memo that attempts to unify the RFC 1421 and RFC 2045 specifications of Base64 encodings, alternative-alphabet encodings, and the seldom-used Base32 and Base16 encodings.

RFC 3548 forbids implementations from generating messages containing characters outside the encoding alphabet or without padding, unless they are written to a specification that refers to RFC 3548 and specifically requires otherwise; it also declares that decoder implementations must reject data that contains characters outside the encoding alphabet, unless they are written to a specification that refers to RFC 3548 and specifically requires otherwise.

RFC 4648

This RFC obsoletes RFC 3548 and focuses on Base64/32/16:
This document describes the commonly used Base64, Base32, and Base16 encoding schemes. It also discusses the use of line-feeds in encoded data, use of padding in encoded data, use of non-alphabet characters in encoded data, use of different encoding alphabets, and canonical encodings.

Filenames

Another variant called modified Base64 for filename uses '-' instead of '/', because Unix and Windows filenames cannot contain '/'.
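With Python's standard library, such a variant can be approximated by passing the replacement characters for index 62 and 63 through the `altchars` parameter of `base64.b64encode`. This is a sketch of the variant as described in the summary table (no padding); the helper name `filename_base64` is hypothetical.

```python
import base64

def filename_base64(data: bytes) -> str:
    """Encode data with '+' for index 62, '-' for index 63, and no padding."""
    # altchars supplies the characters for index 62 and 63, in that order.
    encoded = base64.b64encode(data, altchars=b"+-")
    return encoded.decode("ascii").rstrip("=")

print(filename_base64(b"\xfb\xff"))  # +-8
```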

URL applications

Base64 encoding can be helpful when fairly lengthy identifying information is used in an HTTP environment. For example, a database persistence framework for Java objects might use Base64 encoding to encode a relatively large unique id (generally 128-bit UUIDs) into a string for use as an HTTP parameter in HTTP forms or HTTP GET URLs. Also, many applications need to encode binary data in a way that is convenient for inclusion in URLs, including in hidden web form fields. Base64 is convenient here because it renders the data compactly and, when the aim is to obscure the nature of the data from a casual human observer, in a relatively unreadable form.

Using standard Base64 in a URL requires encoding of the '+', '/' and '=' characters into special percent-encoded hexadecimal sequences ('+' = '%2B', '/' = '%2F' and '=' = '%3D'), which makes the string unnecessarily longer.

For this reason, a modified Base64 for URL variant exists, where no padding '=' is used, and the '+' and '/' characters of standard Base64 are replaced by '-' and '_' respectively. URL encoders/decoders are then no longer necessary and have no impact on the length of the encoded value, leaving the same encoded form intact for use in relational databases, web forms, and object identifiers in general.
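A minimal sketch of this variant, using Python's `base64.urlsafe_b64encode` and stripping the padding, is shown below with a 128-bit UUID as the identifier. The helper names `to_url_token` and `from_url_token` are hypothetical; the decoder restores the padding before calling the library, since the library expects padded input.

```python
import base64
import uuid

def to_url_token(data: bytes) -> str:
    """Encode with '-' and '_' for index 62 and 63 and drop '=' padding."""
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")

def from_url_token(token: str) -> bytes:
    """Restore the padding implied by the token length, then decode."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding)

ident = uuid.uuid4()
token = to_url_token(ident.bytes)          # 16 bytes become 22 characters
print(token, from_url_token(token) == ident.bytes)
```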

Program identifiers

There are other variants that use '_-' or '._' when the Base64 variant string must be used within valid identifiers for programs.

XML

XML identifiers and name tokens are encoded using two variants:
  • '.-' for use in XML name tokens (Nmtoken), or even
  • '_:' for use in more restricted XML identifiers (Name).

Regular expressions

Another variant called modified Base64 for regexps uses '!-' instead of '*-' to replace the standard Base64 '+/', because both '+' and '*' may be reserved for regular expressions (note that '[]' used in the IRCu variant above would not work in that context).

HTML

The atob and btoa JavaScript methods, defined in the HTML5 draft specification, provide Base64 encoding and decoding functionality to web pages. The atob method is unusual in that it does not ignore whitespace or new lines, throwing an INVALID_CHARACTER_ERR instead. The btoa method outputs padding characters, but these are optional in the input of the atob method.

Other applications

Base64 can be used in a variety of contexts:
  • Base64 can be used to transmit and store text that might otherwise cause delimiter collision
  • Base64 is often used as a quick but insecure shortcut to obscure secrets without incurring the overhead of cryptographic key management. For example, Evolution and Thunderbird use Base64 to obfuscate e-mail passwords.
  • Base64 is used to store a password hash computed with crypt in the /etc/passwd file.
  • Spammers use Base64 to evade basic anti-spamming tools, which often do not decode Base64 and therefore cannot detect keywords in encoded messages.
  • Base64 is heavily used for PHP obfuscation.
  • Base64 is used to encode character strings in LDIF files
  • Base64 is often used to embed binary data in an XML file, using a syntax similar to that used, e.g., for favicons in Firefox's bookmarks.html.
  • Base64 is used to encode binary files such as images within scripts, to avoid depending on external files.
  • The data URI scheme can use Base64 to represent file contents. For instance, background images and fonts can be specified in a CSS stylesheet file as data: URIs, instead of being supplied in separate files.
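As an illustration of the data URI use mentioned in the list above, the following sketch builds a Base64 data: URI from raw bytes. The helper name `make_data_uri` and the media types shown are assumptions chosen for the example.

```python
import base64

def make_data_uri(data: bytes, media_type: str) -> str:
    """Build a data: URI that carries `data` as Base64 text."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{encoded}"

print(make_data_uri(b"Man", "text/plain"))  # data:text/plain;base64,TWFu
```

The resulting string can be placed wherever a URL is accepted, for example inside `url(...)` in a stylesheet.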

Radix 64 applications not compatible with Base64

  • The GEDCOM 5.5 standard for genealogical data interchange uses a concept similar to Base64 to encode multimedia files in its text-line hierarchical file format. The extra characters chosen are '.' and '/', with a different assignment of characters to the 64 6-bit values: ., /, 0–9, A–Z, a–z for values 0–63.
  • Uuencoding uses a system with base 64 for binary data, but with a much different set of characters in the encoding. It uses many punctuation characters, but no lower-case letters.
  • BinHex, which was used for the Mac OS, has an encoding system with 64 as base, but with different characters compared to Base64. It uses punctuation characters, digits, upper and lower case letters, but does not use some visually confusable characters like '7', 'O', 'g' and 'o'.
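To make the uuencode difference noted in the list above concrete, the sketch below lists the character set of traditional uuencoding, which maps a 6-bit value v to the ASCII character with code 32 + v. It is an illustration of the alphabet only, and it assumes the traditional mapping; many implementations write the value 0 as a backquote rather than a space.

```python
import string

# Traditional uuencode digit for value v is the ASCII character 32 + v,
# so the alphabet runs from ' ' (32) to '_' (95).
UU_ALPHABET = "".join(chr(32 + v) for v in range(64))

print(UU_ALPHABET)
print(any(ch in string.ascii_lowercase for ch in UU_ALPHABET))  # False
```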

See also

  • Base32
  • Base16
  • Ascii85
  • Quoted-printable
  • uuencode
  • yEnc
  • 8BITMIME
  • URL

External links
  • RFC 989 and RFC 1421 (Privacy Enhancement for Electronic Internet Mail)
  • RFC 2045 (Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies)
  • RFC 3548 and RFC 4648 (The Base16, Base32, and Base64 Data Encodings)
  • Implementations available for ANSI C, C++, C#, Java, Perl, Python, Ruby
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 