List of XML and HTML character entity references
In SGML, HTML and XML documents, the logical constructs known as character data and attribute values consist of sequences of characters, in which each character can manifest directly (representing itself), or can be represented by a series of characters called a character reference, of which there are two types: a numeric character reference and a character entity reference. This article lists the character entity references that are valid in HTML and XML documents.
Character reference overview
A numeric character reference refers to a character by its Universal Character Set/Unicode code point, and uses the format
&#nnnn;
or
&#xhhhh;
where nnnn is the code point in decimal form, and hhhh is the code point in hexadecimal form. The x must be lowercase in XML documents. The nnnn or hhhh may be any number of digits and may include leading zeros. The hhhh may mix uppercase and lowercase, though uppercase is the usual style.
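As an illustration of the two numeric formats, the following minimal Python sketch builds and decodes numeric character references. The function names (`to_decimal_ref`, `to_hex_ref`, `decode_numeric_refs`) are hypothetical and chosen only for this example; the pattern accepts only a lowercase x, following the XML rule stated above, and does not validate that the code point is legal in any particular markup language.

```python
import re

def to_decimal_ref(ch: str) -> str:
    """Return the decimal numeric character reference for one character."""
    return f"&#{ord(ch)};"

def to_hex_ref(ch: str) -> str:
    """Return the hexadecimal reference (lowercase x, uppercase digits)."""
    return f"&#x{ord(ch):X};"

# Either "&#x" followed by hex digits, or "&#" followed by decimal digits.
# Any number of digits is allowed, including leading zeros.
_NUMERIC_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")

def decode_numeric_refs(text: str) -> str:
    """Replace every numeric character reference in text with its character."""
    def repl(match: re.Match) -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits, 10)
        return chr(code_point)  # raises ValueError for out-of-range code points
    return _NUMERIC_REF.sub(repl, text)

print(to_decimal_ref("é"), to_hex_ref("é"))
print(decode_numeric_refs("&#233; &#xe9; &#x00E9; &#0233;"))
```

All four references in the last line name the same code point (U+00E9), which shows that leading zeros and the case of the hexadecimal digits do not change the result.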
In contrast, a character entity reference refers to a character by the name of an entity which has the desired character as its replacement text. The entity must either be predefined (built into the markup language) or explicitly declared in a Document Type Definition (DTD). The format is the same as for any entity reference:
&name;
where name is the name of the entity. The semicolon is required.
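The name-based lookup can be sketched in Python using the standard library's `html.entities.name2codepoint` mapping, which covers the HTML 4 entity names. The helper `expand_entity_refs` and its regular expression are assumptions made for this example; it leaves unknown names and references without a semicolon untouched, and it is not a substitute for a real SGML, HTML or XML parser.

```python
import re
from html.entities import name2codepoint  # HTML 4 entity name -> code point

# An ampersand, a name, and the required terminating semicolon.
_ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def expand_entity_refs(text: str) -> str:
    """Replace known character entity references with their characters."""
    def repl(match: re.Match) -> str:
        code_point = name2codepoint.get(match.group(1))
        # Unknown names are left exactly as written.
        return chr(code_point) if code_point is not None else match.group(0)
    return _ENTITY_REF.sub(repl, text)

print(expand_entity_refs("caf&eacute; &amp; cr&egrave;me"))
print(expand_entity_refs("&nosuch; and &amp without a semicolon"))
```

Entity names are case-sensitive in this mapping, so `&Eacute;` and `&eacute;` resolve to different characters.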
Predefined entities in XML
The XML specification does not use the term "character entity" or "character entity reference". The XML specification defines five "predefined entities" representing special characters, and requires that all XML processors honor them. The entities can be explicitly declared in a DTD, as well, but if this is done, the replacement text must be the same as the built-in definitions. XML also allows other named entities of any size to be defined on a per-document basis.
The table below lists the five XML predefined entities. The "Name" column gives the entity's name. The "Character" column shows the character. To render the character, the format &name; is used; for example, &amp; renders as &. The "Unicode code point" column cites the character via standard UCS/Unicode "U+" notation, which shows the character's code point in hexadecimal. The decimal equivalent of the code point is then shown in parentheses. The "Standard" column indicates the first version of XML that includes the entity. The "Description" column cites the character via its canonical UCS/Unicode name, in English.
Name | Character | Unicode code point (decimal) | Standard | Description |
---|---|---|---|---|
quot | " | U+0022 (34) | XML 1.0 | double quotation mark |
amp | & | U+0026 (38) | XML 1.0 | ampersand |
apos | ' | U+0027 (39) | XML 1.0 | apostrophe (= apostrophe-quote) |
lt | < | U+003C (60) | XML 1.0 | less-than sign |
gt | > | U+003E (62) | XML 1.0 | greater-than sign |
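The five predefined entities in the table above are what a program typically emits when it writes arbitrary text into an XML document. A minimal Python sketch using the standard library's `xml.sax.saxutils.escape` and `unescape` follows; those functions handle &, < and > on their own, so the two quote characters are passed in explicitly. The wrapper names `escape_xml` and `unescape_xml` are hypothetical.

```python
from xml.sax.saxutils import escape, unescape

# escape() and unescape() cover &, < and > by default; the two quote
# characters are supplied here so that all five predefined entities are used.
_QUOTES = {'"': "&quot;", "'": "&apos;"}
_UNQUOTES = {"&quot;": '"', "&apos;": "'"}

def escape_xml(text: str) -> str:
    """Replace the five special characters with predefined entity references."""
    return escape(text, _QUOTES)

def unescape_xml(text: str) -> str:
    """Inverse of escape_xml for the five predefined entities."""
    return unescape(text, _UNQUOTES)

escaped = escape_xml('a < b & "c"')
print(escaped)
print(unescape_xml(escaped))
```

The ampersand is escaped first and unescaped last, so text that already contains something resembling a reference, such as `&lt;`, survives a round trip unchanged.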
Character entity references in HTML
The HTML 4 DTDs define 252 named entities, references to which act as mnemonic aliases for certain Unicode characters. The HTML 4 specification requires the use of the standard DTDs and does not allow users to define additional entities.
In the table below, the "Standard" column indicates the first version of the HTML DTD that defines the character entity reference. HTML 4.01 did not provide any new character references.
Name | Character | Unicode code point (decimal) | Standard | DTD | Old ISO subset | Description |
---|---|---|---|---|---|---|
quot | " | U+0022 (34) | HTML 2.0 | HTMLspecial | ISOnum | quotation mark (= APL quote) |
amp | & | U+0026 (38) | HTML 2.0 | HTMLspecial | ISOnum | ampersand |
apos | ' | U+0027 (39) | XHTML 1.0 | HTMLspecial | ISOnum | apostrophe (= apostrophe-quote); see below |
lt | < | U+003C (60) | HTML 2.0 | HTMLspecial | ISOnum | less-than sign |
gt | > | U+003E (62) | HTML 2.0 | HTMLspecial | ISOnum | greater-than sign |
nbsp |   | U+00A0 (160) | HTML 3.2 | HTMLlat1 | ISOnum | no-break space (= non-breaking space) |
iexcl | ¡ | U+00A1 (161) | HTML 3.2 | HTMLlat1 | ISOnum | inverted exclamation mark |
cent | ¢ | U+00A2 (162) | HTML 3.2 | HTMLlat1 | ISOnum | cent sign |
pound | £ | U+00A3 (163) | HTML 3.2 | HTMLlat1 | ISOnum | pound sign |
curren | ¤ | U+00A4 (164) | HTML 3.2 | HTMLlat1 | ISOnum | currency sign |
yen | ¥ | U+00A5 (165) | HTML 3.2 | HTMLlat1 | ISOnum | yen sign (= yuan sign) |
brvbar | ¦ | U+00A6 (166) | HTML 3.2 | HTMLlat1 | ISOnum | broken bar (= broken vertical bar) |
sect | § | U+00A7 (167) | HTML 3.2 | HTMLlat1 | ISOnum | section sign |
uml | ¨ | U+00A8 (168) | HTML 3.2 | HTMLlat1 | ISOdia | diaeresis (= spacing diaeresis); see Germanic umlaut |
copy | © | U+00A9 (169) | HTML 3.2 | HTMLlat1 | ISOnum | copyright symbol |
ordf | ª | U+00AA (170) | HTML 3.2 | HTMLlat1 | ISOnum | feminine ordinal indicator |
laquo | « | U+00AB (171) | HTML 3.2 | HTMLlat1 | ISOnum | left-pointing double angle quotation mark (= left pointing guillemet) |
not | ¬ | U+00AC (172) | HTML 3.2 | HTMLlat1 | ISOnum | not sign |
shy | ­ | U+00AD (173) | HTML 3.2 | HTMLlat1 | ISOnum | soft hyphen (= discretionary hyphen) |
reg | ® | U+00AE (174) | HTML 3.2 | HTMLlat1 | ISOnum | registered sign (= registered trademark symbol) |
macr | ¯ | U+00AF (175) | HTML 3.2 | HTMLlat1 | ISOdia | macron (= spacing macron = overline = APL overbar) |
deg | ° | U+00B0 (176) | HTML 3.2 | HTMLlat1 | ISOnum | degree symbol |
plusmn | ± | U+00B1 (177) | HTML 3.2 | HTMLlat1 | ISOnum | plus-minus sign (= plus-or-minus sign) |
sup2 | ² | U+00B2 (178) | HTML 3.2 | HTMLlat1 | ISOnum | superscript two (= superscript digit two = squared) |
sup3 | ³ | U+00B3 (179) | HTML 3.2 | HTMLlat1 | ISOnum | superscript three (= superscript digit three = cubed) |
acute | ´ | U+00B4 (180) | HTML 3.2 | HTMLlat1 | ISOdia | acute accent (= spacing acute) |
micro | µ | U+00B5 (181) | HTML 3.2 | HTMLlat1 | ISOnum | micro sign |
para | ¶ | U+00B6 (182) | HTML 3.2 | HTMLlat1 | ISOnum | pilcrow sign (= paragraph sign) |
middot | · | U+00B7 (183) | HTML 3.2 | HTMLlat1 | ISOnum | middle dot (= Georgian comma = Greek middle dot) |
cedil | ¸ | U+00B8 (184) | HTML 3.2 | HTMLlat1 | ISOdia | cedilla (= spacing cedilla) |
sup1 | ¹ | U+00B9 (185) | HTML 3.2 | HTMLlat1 | ISOnum | superscript one (= superscript digit one) |
ordm | º | U+00BA (186) | HTML 3.2 | HTMLlat1 | ISOnum | masculine ordinal indicator |
raquo | » | U+00BB (187) | HTML 3.2 | HTMLlat1 | ISOnum | right-pointing double angle quotation mark (= right pointing guillemet) |
frac14 | ¼ | U+00BC (188) | HTML 3.2 | HTMLlat1 | ISOnum | vulgar fraction one quarter (= fraction one quarter) |
frac12 | ½ | U+00BD (189) | HTML 3.2 | HTMLlat1 | ISOnum | vulgar fraction one half (= fraction one half) |
frac34 | ¾ | U+00BE (190) | HTML 3.2 | HTMLlat1 | ISOnum | vulgar fraction three quarters (= fraction three quarters) |
iquest | ¿ | U+00BF (191) | HTML 3.2 | HTMLlat1 | ISOnum | inverted question mark (= turned question mark) |
Agrave | À | U+00C0 (192) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter A with grave accent (= Latin capital letter A grave) |
Aacute | Á | U+00C1 (193) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter A with acute accent |
Acirc | Â | U+00C2 (194) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter A with circumflex |
Atilde | Ã | U+00C3 (195) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter A with tilde |
Auml | Ä | U+00C4 (196) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter A with diaeresis |
Aring | Å | U+00C5 (197) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter A with ring above (= Latin capital letter A ring) |
AElig | Æ | U+00C6 (198) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter AE (= Latin capital ligature AE) |
Ccedil | Ç | U+00C7 (199) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter C with cedilla |
Egrave | È | U+00C8 (200) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter E with grave accent |
Eacute | É | U+00C9 (201) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter E with acute accent |
Ecirc | Ê | U+00CA (202) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter E with circumflex |
Euml | Ë | U+00CB (203) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter E with diaeresis |
Igrave | Ì | U+00CC (204) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter I with grave accent |
Iacute | Í | U+00CD (205) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter I with acute accent |
Icirc | Î | U+00CE (206) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter I with circumflex |
Iuml | Ï | U+00CF (207) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter I with diaeresis |
ETH | Ð | U+00D0 (208) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter Eth |
Ntilde | Ñ | U+00D1 (209) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter N with tilde |
Ograve | Ò | U+00D2 (210) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter O with grave accent |
Oacute | Ó | U+00D3 (211) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter O with acute accent |
Ocirc | Ô | U+00D4 (212) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter O with circumflex |
Otilde | Õ | U+00D5 (213) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter O with tilde |
Ouml | Ö | U+00D6 (214) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter O with diaeresis |
times | × | U+00D7 (215) | HTML 3.2 | HTMLlat1 | ISOnum | multiplication sign |
Oslash | Ø | U+00D8 (216) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter O with stroke (= Latin capital letter O slash) |
Ugrave | Ù | U+00D9 (217) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter U with grave accent |
Uacute | Ú | U+00DA (218) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter U with acute accent |
Ucirc | Û | U+00DB (219) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter U with circumflex |
Uuml | Ü | U+00DC (220) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter U with diaeresis |
Yacute | Ý | U+00DD (221) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter Y with acute accent |
THORN | Þ | U+00DE (222) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin capital letter THORN |
szlig | ß | U+00DF (223) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter sharp s (= ess-zed); see German Eszett |
agrave | à | U+00E0 (224) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter a with grave accent |
aacute | á | U+00E1 (225) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter a with acute accent |
acirc | â | U+00E2 (226) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter a with circumflex |
atilde | ã | U+00E3 (227) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter a with tilde |
auml | ä | U+00E4 (228) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter a with diaeresis |
aring | å | U+00E5 (229) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter a with ring above |
aelig | æ | U+00E6 (230) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter ae (= Latin small ligature ae) |
ccedil | ç | U+00E7 (231) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter c with cedilla Cedilla A cedilla , also known as cedilha or cédille, is a hook added under certain letters as a diacritical mark to modify their pronunciation.-Origin:... |
egrave | è | U+00E8 (232) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter e with grave accent Grave accent The grave accent is a diacritical mark used in written Breton, Catalan, Corsican, Dutch, French, Greek , Italian, Mohawk, Norwegian, Occitan, Portuguese, Scottish Gaelic, Vietnamese, Welsh, Romansh, and other languages.-Greek:The grave accent was first used in the polytonic orthography of Ancient... |
eacute | é | U+00E9 (233) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter e with acute accent Acute accent The acute accent is a diacritic used in many modern written languages with alphabets based on the Latin, Cyrillic, and Greek scripts.-Apex:An early precursor of the acute accent was the apex, used in Latin inscriptions to mark long vowels.-Greek:... |
ecirc | ê | U+00EA (234) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter e with circumflex Circumflex The circumflex is a diacritic used in the written forms of many languages, and is also commonly used in various romanization and transcription schemes. It received its English name from Latin circumflexus —a translation of the Greek περισπωμένη... |
euml | ë | U+00EB (235) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter e with diaeresis |
igrave | ì | U+00EC (236) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter i with grave accent |
iacute | í | U+00ED (237) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter i with acute accent |
icirc | î | U+00EE (238) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter i with circumflex |
iuml | ï | U+00EF (239) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter i with diaeresis |
eth | ð | U+00F0 (240) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter eth |
ntilde | ñ | U+00F1 (241) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter n with tilde |
ograve | ò | U+00F2 (242) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter o with grave accent |
oacute | ó | U+00F3 (243) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter o with acute accent |
ocirc | ô | U+00F4 (244) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter o with circumflex |
otilde | õ | U+00F5 (245) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter o with tilde |
ouml | ö | U+00F6 (246) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter o with diaeresis |
divide | ÷ | U+00F7 (247) | HTML 3.2 | HTMLlat1 | ISOnum | division sign (= obelus) |
oslash | ø | U+00F8 (248) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter o with stroke (= Latin small letter o slash) |
ugrave | ù | U+00F9 (249) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter u with grave accent |
uacute | ú | U+00FA (250) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter u with acute accent |
ucirc | û | U+00FB (251) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter u with circumflex |
uuml | ü | U+00FC (252) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter u with diaeresis |
yacute | ý | U+00FD (253) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter y with acute accent |
thorn | þ | U+00FE (254) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter thorn |
yuml | ÿ | U+00FF (255) | HTML 2.0 | HTMLlat1 | ISOlat1 | Latin small letter y with diaeresis |
OElig | Œ | U+0152 (338) | HTML 4.0 | HTMLspecial | ISOlat2 | Latin capital ligature oe |
oelig | œ | U+0153 (339) | HTML 4.0 | HTMLspecial | ISOlat2 | Latin small ligature oe |
Scaron | Š | U+0160 (352) | HTML 4.0 | HTMLspecial | ISOlat2 | Latin capital letter s with caron |
scaron | š | U+0161 (353) | HTML 4.0 | HTMLspecial | ISOlat2 | Latin small letter s with caron |
Yuml | Ÿ | U+0178 (376) | HTML 4.0 | HTMLspecial | ISOlat2 | Latin capital letter y with diaeresis |
fnof | ƒ | U+0192 (402) | HTML 4.0 | HTMLsymbol | ISOtech | Latin small letter f with hook (= function = florin) |
circ | ˆ | U+02C6 (710) | HTML 4.0 | HTMLspecial | ISOpub | modifier letter circumflex accent |
tilde | ˜ | U+02DC (732) | HTML 4.0 | HTMLspecial | ISOdia | small tilde |
Alpha | Α | U+0391 (913) | HTML 4.0 | HTMLsymbol | | Greek capital letter Alpha |
Beta | Β | U+0392 (914) | HTML 4.0 | HTMLsymbol | | Greek capital letter Beta |
Gamma | Γ | U+0393 (915) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek capital letter Gamma |
Delta | Δ | U+0394 (916) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek capital letter Delta |
Epsilon | Ε | U+0395 (917) | HTML 4.0 | HTMLsymbol | | Greek capital letter Epsilon |
Zeta | Ζ | U+0396 (918) | HTML 4.0 | HTMLsymbol | | Greek capital letter Zeta |
Eta | Η | U+0397 (919) | HTML 4.0 | HTMLsymbol | | Greek capital letter Eta |
Theta | Θ | U+0398 (920) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek capital letter Theta |
Iota | Ι | U+0399 (921) | HTML 4.0 | HTMLsymbol | | Greek capital letter Iota |
Kappa | Κ | U+039A (922) | HTML 4.0 | HTMLsymbol | | Greek capital letter Kappa |
Lambda | Λ | U+039B (923) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek capital letter Lambda |
Mu | Μ | U+039C (924) | HTML 4.0 | HTMLsymbol | | Greek capital letter Mu |
Nu | Ν | U+039D (925) | HTML 4.0 | HTMLsymbol | | Greek capital letter Nu |
Xi | Ξ | U+039E (926) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek capital letter Xi |
Omicron | Ο | U+039F (927) | HTML 4.0 | HTMLsymbol | | Greek capital letter Omicron |
Pi | Π | U+03A0 (928) | HTML 4.0 | HTMLsymbol | | Greek capital letter Pi |
Rho | Ρ | U+03A1 (929) | HTML 4.0 | HTMLsymbol | | Greek capital letter Rho |
Sigma | Σ | U+03A3 (931) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek capital letter Sigma |
Tau | Τ | U+03A4 (932) | HTML 4.0 | HTMLsymbol | | Greek capital letter Tau |
Upsilon | Υ | U+03A5 (933) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek capital letter Upsilon |
Phi | Φ | U+03A6 (934) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek capital letter Phi |
Chi | Χ | U+03A7 (935) | HTML 4.0 | HTMLsymbol | | Greek capital letter Chi |
Psi | Ψ | U+03A8 (936) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek capital letter Psi |
Omega | Ω | U+03A9 (937) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek capital letter Omega |
alpha | α | U+03B1 (945) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter alpha |
beta | β | U+03B2 (946) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter beta |
gamma | γ | U+03B3 (947) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter gamma |
delta | δ | U+03B4 (948) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter delta |
epsilon | ε | U+03B5 (949) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter epsilon |
zeta | ζ | U+03B6 (950) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter zeta |
eta | η | U+03B7 (951) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter eta |
theta | θ | U+03B8 (952) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter theta |
iota | ι | U+03B9 (953) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter iota |
kappa | κ | U+03BA (954) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter kappa |
lambda | λ | U+03BB (955) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter lambda |
mu | μ | U+03BC (956) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter mu |
nu | ν | U+03BD (957) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter nu |
xi | ξ | U+03BE (958) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter xi |
omicron | ο | U+03BF (959) | HTML 4.0 | HTMLsymbol | NEW | Greek small letter omicron |
pi | π | U+03C0 (960) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter pi |
rho | ρ | U+03C1 (961) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter rho |
sigmaf | ς | U+03C2 (962) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter final sigma |
sigma | σ | U+03C3 (963) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter sigma |
tau | τ | U+03C4 (964) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter tau |
upsilon | υ | U+03C5 (965) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter upsilon |
phi | φ | U+03C6 (966) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter phi |
chi | χ | U+03C7 (967) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter chi |
psi | ψ | U+03C8 (968) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter psi |
omega | ω | U+03C9 (969) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek small letter omega |
thetasym | ϑ | U+03D1 (977) | HTML 4.0 | HTMLsymbol | NEW | Greek theta symbol |
upsih | ϒ | U+03D2 (978) | HTML 4.0 | HTMLsymbol | NEW | Greek Upsilon with hook symbol |
piv | ϖ | U+03D6 (982) | HTML 4.0 | HTMLsymbol | ISOgrk3 | Greek pi symbol |
ensp | | U+2002 (8194) | HTML 4.0 | HTMLspecial | ISOpub | en space |
emsp | | U+2003 (8195) | HTML 4.0 | HTMLspecial | ISOpub | em space |
thinsp | | U+2009 (8201) | HTML 4.0 | HTMLspecial | ISOpub | thin space |
zwnj | | U+200C (8204) | HTML 4.0 | HTMLspecial | NEW RFC 2070 | zero-width non-joiner |
zwj | | U+200D (8205) | HTML 4.0 | HTMLspecial | NEW RFC 2070 | zero-width joiner |
lrm | | U+200E (8206) | HTML 4.0 | HTMLspecial | NEW RFC 2070 | left-to-right mark |
rlm | | U+200F (8207) | HTML 4.0 | HTMLspecial | NEW RFC 2070 | right-to-left mark |
ndash | – | U+2013 (8211) | HTML 4.0 | HTMLspecial | ISOpub | en dash |
mdash | — | U+2014 (8212) | HTML 4.0 | HTMLspecial | ISOpub | em dash |
lsquo | ‘ | U+2018 (8216) | HTML 4.0 | HTMLspecial | ISOnum | left single quotation mark |
rsquo | ’ | U+2019 (8217) | HTML 4.0 | HTMLspecial | ISOnum | right single quotation mark |
sbquo | ‚ | U+201A (8218) | HTML 4.0 | HTMLspecial | NEW | single low-9 quotation mark |
ldquo | “ | U+201C (8220) | HTML 4.0 | HTMLspecial | ISOnum | left double quotation mark |
rdquo | ” | U+201D (8221) | HTML 4.0 | HTMLspecial | ISOnum | right double quotation mark |
bdquo | „ | U+201E (8222) | HTML 4.0 | HTMLspecial | NEW | double low-9 quotation mark |
dagger | † | U+2020 (8224) | HTML 4.0 | HTMLspecial | ISOpub | dagger, obelisk |
Dagger | ‡ | U+2021 (8225) | HTML 4.0 | HTMLspecial | ISOpub | double dagger, double obelisk |
bull | • | U+2022 (8226) | HTML 4.0 | HTMLsymbol | ISOpub | bullet (= black small circle) |
hellip | … | U+2026 (8230) | HTML 4.0 | HTMLsymbol | ISOpub | horizontal ellipsis (= three dot leader) |
permil | ‰ | U+2030 (8240) | HTML 4.0 | HTMLspecial | ISOtech | per mille sign |
prime | ′ | U+2032 (8242) | HTML 4.0 | HTMLsymbol | ISOtech | prime (= minutes = feet) |
Prime | ″ | U+2033 (8243) | HTML 4.0 | HTMLsymbol | ISOtech | double prime (= seconds = inches) |
lsaquo | ‹ | U+2039 (8249) | HTML 4.0 | HTMLspecial | ISO proposed | single left-pointing angle quotation mark |
rsaquo | › | U+203A (8250) | HTML 4.0 | HTMLspecial | ISO proposed | single right-pointing angle quotation mark |
oline | ‾ | U+203E (8254) | HTML 4.0 | HTMLsymbol | NEW | overline (= spacing overscore) |
frasl | ⁄ | U+2044 (8260) | HTML 4.0 | HTMLsymbol | NEW | fraction slash (= solidus) |
euro | € | U+20AC (8364) | HTML 4.0 | HTMLspecial | NEW | euro sign |
image | ℑ | U+2111 (8465) | HTML 4.0 | HTMLsymbol | ISOamso | black-letter capital I (= imaginary part) |
weierp | ℘ | U+2118 (8472) | HTML 4.0 | HTMLsymbol | ISOamso | script capital P (= power set = Weierstrass p) |
real | ℜ | U+211C (8476) | HTML 4.0 | HTMLsymbol | ISOamso | black-letter capital R (= real part symbol) |
trade | ™ | U+2122 (8482) | HTML 4.0 | HTMLsymbol | ISOnum | trademark symbol |
alefsym | ℵ | U+2135 (8501) | HTML 4.0 | HTMLsymbol | NEW | alef symbol (= first transfinite cardinal) |
larr | ← | U+2190 (8592) | HTML 4.0 | HTMLsymbol | ISOnum | leftwards arrow |
uarr | ↑ | U+2191 (8593) | HTML 4.0 | HTMLsymbol | ISOnum | upwards arrow |
rarr | → | U+2192 (8594) | HTML 4.0 | HTMLsymbol | ISOnum | rightwards arrow |
darr | ↓ | U+2193 (8595) | HTML 4.0 | HTMLsymbol | ISOnum | downwards arrow |
harr | ↔ | U+2194 (8596) | HTML 4.0 | HTMLsymbol | ISOamsa | left right arrow |
crarr | ↵ | U+21B5 (8629) | HTML 4.0 | HTMLsymbol | NEW | downwards arrow with corner leftwards (= carriage return) |
lArr | ⇐ | U+21D0 (8656) | HTML 4.0 | HTMLsymbol | ISOtech | leftwards double arrow |
uArr | ⇑ | U+21D1 (8657) | HTML 4.0 | HTMLsymbol | ISOamsa | upwards double arrow |
rArr | ⇒ | U+21D2 (8658) | HTML 4.0 | HTMLsymbol | ISOnum | rightwards double arrow |
dArr | ⇓ | U+21D3 (8659) | HTML 4.0 | HTMLsymbol | ISOamsa | downwards double arrow |
hArr | ⇔ | U+21D4 (8660) | HTML 4.0 | HTMLsymbol | ISOamsa | left right double arrow |
forall | ∀ | U+2200 (8704) | HTML 4.0 | HTMLsymbol | ISOtech | for all |
part | ∂ | U+2202 (8706) | HTML 4.0 | HTMLsymbol | ISOtech | partial differential |
exist | ∃ | U+2203 (8707) | HTML 4.0 | HTMLsymbol | ISOtech | there exists |
empty | ∅ | U+2205 (8709) | HTML 4.0 | HTMLsymbol | ISOamso | empty set (= null set = diameter) |
nabla | ∇ | U+2207 (8711) | HTML 4.0 | HTMLsymbol | ISOtech | nabla (= backward difference) |
isin | ∈ | U+2208 (8712) | HTML 4.0 | HTMLsymbol | ISOtech | element of |
notin | ∉ | U+2209 (8713) | HTML 4.0 | HTMLsymbol | ISOtech | not an element of |
ni | ∋ | U+220B (8715) | HTML 4.0 | HTMLsymbol | ISOtech | contains as member |
prod | ∏ | U+220F (8719) | HTML 4.0 | HTMLsymbol | ISOamsb | n-ary product (= product sign) |
sum | ∑ | U+2211 (8721) | HTML 4.0 | HTMLsymbol | ISOamsb | n-ary summation |
minus | − | U+2212 (8722) | HTML 4.0 | HTMLsymbol | ISOtech | minus sign |
lowast | ∗ | U+2217 (8727) | HTML 4.0 | HTMLsymbol | ISOtech | asterisk operator |
radic | √ | U+221A (8730) | HTML 4.0 | HTMLsymbol | ISOtech | square root (= radical sign) |
prop | ∝ | U+221D (8733) | HTML 4.0 | HTMLsymbol | ISOtech | proportional to |
infin | ∞ | U+221E (8734) | HTML 4.0 | HTMLsymbol | ISOtech | infinity |
ang | ∠ | U+2220 (8736) | HTML 4.0 | HTMLsymbol | ISOamso | angle |
and | ∧ | U+2227 (8743) | HTML 4.0 | HTMLsymbol | ISOtech | logical and (= wedge) |
or | ∨ | U+2228 (8744) | HTML 4.0 | HTMLsymbol | ISOtech | logical or (= vee) |
cap | ∩ | U+2229 (8745) | HTML 4.0 | HTMLsymbol | ISOtech | intersection (= cap) |
cup | ∪ | U+222A (8746) | HTML 4.0 | HTMLsymbol | ISOtech | union (= cup) |
int | ∫ | U+222B (8747) | HTML 4.0 | HTMLsymbol | ISOtech | integral |
there4 | ∴ | U+2234 (8756) | HTML 4.0 | HTMLsymbol | ISOtech | therefore sign |
sim | ∼ | U+223C (8764) | HTML 4.0 | HTMLsymbol | ISOtech | tilde operator (= varies with = similar to) |
cong | ≅ | U+2245 (8773) | HTML 4.0 | HTMLsymbol | ISOtech | congruent to |
asymp | ≈ | U+2248 (8776) | HTML 4.0 | HTMLsymbol | ISOamsr | almost equal to (= asymptotic to) |
ne | ≠ | U+2260 (8800) | HTML 4.0 | HTMLsymbol | ISOtech | not equal to |
equiv | ≡ | U+2261 (8801) | HTML 4.0 | HTMLsymbol | ISOtech | identical to; sometimes used for 'equivalent to' |
le | ≤ | U+2264 (8804) | HTML 4.0 | HTMLsymbol | ISOtech | less-than or equal to |
ge | ≥ | U+2265 (8805) | HTML 4.0 | HTMLsymbol | ISOtech | greater-than or equal to |
sub | ⊂ | U+2282 (8834) | HTML 4.0 | HTMLsymbol | ISOtech | subset of |
sup | ⊃ | U+2283 (8835) | HTML 4.0 | HTMLsymbol | ISOtech | superset of |
nsub | ⊄ | U+2284 (8836) | HTML 4.0 | HTMLsymbol | ISOamsn | not a subset of |
sube | ⊆ | U+2286 (8838) | HTML 4.0 | HTMLsymbol | ISOtech | subset of or equal to |
supe | ⊇ | U+2287 (8839) | HTML 4.0 | HTMLsymbol | ISOtech | superset of or equal to |
oplus | ⊕ | U+2295 (8853) | HTML 4.0 | HTMLsymbol | ISOamsb | circled plus (= direct sum) |
otimes | ⊗ | U+2297 (8855) | HTML 4.0 | HTMLsymbol | ISOamsb | circled times (= vector product) |
perp | ⊥ | U+22A5 (8869) | HTML 4.0 | HTMLsymbol | ISOtech | up tack (= orthogonal to = perpendicular) |
sdot | ⋅ | U+22C5 (8901) | HTML 4.0 | HTMLsymbol | ISOamsb | dot operator |
lceil | ⌈ | U+2308 (8968) | HTML 4.0 | HTMLsymbol | ISOamsc | left ceiling (= APL upstile) |
rceil | ⌉ | U+2309 (8969) | HTML 4.0 | HTMLsymbol | ISOamsc | right ceiling |
lfloor | ⌊ | U+230A (8970) | HTML 4.0 | HTMLsymbol | ISOamsc | left floor (= APL downstile) |
rfloor | ⌋ | U+230B (8971) | HTML 4.0 | HTMLsymbol | ISOamsc | right floor |
lang | 〈 | U+2329 (9001) | HTML 4.0 | HTMLsymbol | ISOtech | left-pointing angle bracket (= bra) |
rang | 〉 | U+232A (9002) | HTML 4.0 | HTMLsymbol | ISOtech | right-pointing angle bracket (= ket) |
loz | ◊ | U+25CA (9674) | HTML 4.0 | HTMLsymbol | ISOpub | lozenge |
spades | ♠ | U+2660 (9824) | HTML 4.0 | HTMLsymbol | ISOpub | black spade suit |
clubs | ♣ | U+2663 (9827) | HTML 4.0 | HTMLsymbol | ISOpub | black club suit (= shamrock) |
hearts | ♥ | U+2665 (9829) | HTML 4.0 | HTMLsymbol | ISOpub | black heart suit (= valentine) |
diams | ♦ | U+2666 (9830) | HTML 4.0 | HTMLsymbol | ISOpub | black diamond suit |
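The Name and code point columns of the table can be cross-checked programmatically. The following is a minimal sketch, assuming Python's standard `html.entities` module, whose `name2codepoint` mapping covers the HTML 4 entity names; the helper `describe` is a hypothetical name introduced only for this illustration.

```python
import html
from html.entities import name2codepoint


def describe(name: str) -> str:
    """Format one entity the way the table lists it: name, code point, character."""
    cp = name2codepoint[name]  # raises KeyError for names the table does not define
    return f"&{name}; | {chr(cp)} | U+{cp:04X} ({cp})"


# A few names taken from the table above.
rows = [describe(name) for name in ("euml", "Omega", "euro", "hearts")]
for row in rows:
    print(row)

# html.unescape resolves a character entity reference inside running text.
decoded = html.unescape("caf&eacute; &rarr; 5&euro;")
print(decoded)
```

Looking up a name that is not in the mapping raises `KeyError`, which is a simple way to spot a misspelled entity name before it reaches a document.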
Notes:
Entities representing special characters in XHTML
The XHTML DTDs explicitly declare 253 entities (including the 5 predefined entities of XML 1.0) whose expansion is a single character, which can therefore be informally referred to as "character entities". These (with the exception of the &apos; entity) have the same names and represent the same characters as the 252 character entities in HTML. Also, by virtue of being XML, XHTML documents may reference the predefined &apos; entity, which is not one of the 252 character entities in HTML. Additional entities of any size may be defined on a per-document basis. However, the usability of entity references in XHTML is affected by how the document is being processed:
- If the document is read by a conforming HTML processor, then only the 252 HTML character entities can safely be used. The use of &apos; or custom entity references may not be supported and may produce unpredictable results.
- If the document is read by an XML parser that does not or cannot read external entities, then only the five built-in XML character entities (see above) can safely be used, although other entities may be used if they are declared in the internal DTD subset.
- If the document is read by an XML parser that does read external entities, then the five built-in XML character entities can safely be used. The other 248 HTML character entities can be used as long as the XHTML DTD is accessible to the parser at the time the document is read. Other entities may also be used if they are declared in the internal DTD subset.
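The second case in the list above can be illustrated with a short sketch. It assumes Python's standard `xml.etree.ElementTree` parser as a stand-in for an XML parser that does not read external entities; the helper `try_parse` and the sample documents are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET


def try_parse(doc: str) -> str:
    """Return the root element's text, or a short error string if parsing fails."""
    try:
        return ET.fromstring(doc).text
    except ET.ParseError as exc:
        return f"ParseError: {exc}"


# The five predefined XML entities need no declaration.
predefined = try_parse("<p>&quot;&amp;&lt;&gt;&apos;</p>")

# An HTML entity name with no declaration in sight is rejected.
undeclared = try_parse("<p>caf&eacute;</p>")

# The same name works once it is declared in the internal DTD subset.
declared = try_parse(
    '<!DOCTYPE p [<!ENTITY eacute "&#233;">]><p>caf&eacute;</p>'
)

for label, result in (("predefined", predefined),
                      ("undeclared", undeclared),
                      ("declared", declared)):
    print(f"{label}: {result!r}")
```

Other parsers may behave differently, in particular those configured to fetch the XHTML DTD, which corresponds to the third case in the list.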
Because of the special &apos; case mentioned above, only &quot;, &amp;, &lt;, and &gt; will work in all processing situations.
See also
- Character encodings in HTML
- HTML decimal character rendering
- SGML entity
External links
- Character entity references in HTML 4 at the W3C
- Multilanguage special character entity list - List of special characters, entities and their names.
- HTML Entities Encoder/Decoder
- HTML entities quick reference table