ARMSCII
ARMSCII or ArmSCII is a set of obsolete single-byte character encodings for the Armenian alphabet defined by Armenian national standard 166-9. ArmSCII is an acronym for Armenian Standard Code for Information Interchange, similar to ASCII for the American standard. It has been superseded by the Unicode standard.
However, these encodings are not widely used. The standard was published one year after the international standard ISO 10585, which defined another 7-bit encoding; the encoding and mapping in the UCS (Universal Coded Character Set, ISO/IEC 10646) and Unicode standards were derived from ISO 10585 a few years later, and the computer industry showed little support for adding ArmSCII.
In the ArmSCII-8 table, code value 20 is reserved for the regular SPACE character, code value A0 is reserved for the non-breaking space, and code value A1 is assigned to the eternity sign, which had no designated code point in Unicode when the standard was published (Unicode 7.0 later added U+058D, the right-facing Armenian eternity sign). Some mappings claim that it has a code point of U+0530. This is incorrect, as that code point has not been allocated.
Code values 00–1F and 7F–9F are not assigned to characters by AST 34.002, though they may be the same as the ISO-8859-1 control characters that are located in those positions.
Code value A2 is used to encode the Armenian ligature ew (և). In some variants it encodes the section sign (§) instead. Some Armenian fonts display this ligature at the position of the ASCII ampersand symbol, but it is strongly suggested to encode the ligature using the two standard Armenian small letters that compose it.
Code value FF may be filled with the Armenian small letter modifier apostrophe, but this character has no mapping in Unicode and is shown here using the ASCII apostrophe instead. For correct rendering with Unicode fonts, it is suggested that the small letter modifier be represented using code value FE with ligature control to change its position, because it occurs only after a small Armenian letter, whereas the Armenian apostrophe encoded at FE occurs only after a capital Armenian letter. Most implementations therefore do not encode anything at code value FF.
ArmSCII-8 is the only one of these standards that makes an apparent distinction for the "mirrored" Armenian parentheses, because it was created by simply remapping the ArmSCII-7 standard. However, many documents do not treat this as a productive distinction, and the usual ASCII parentheses are most commonly used instead of the mirrored parentheses inherited from ArmSCII-7, because Armenian keyboards and editors using ArmSCII-8 generated the lower ASCII codes (whose usage is just swapped in classical Armenian). The duplication of the ASCII comma at code value AB is likewise a result of the simple remapping of ArmSCII-7, so it does not differ from the ASCII comma that most ArmSCII-8 documents use.
Note that the characters encoded at code values AD and FE (Armenian hyphen and apostrophe) may not be visible with all fonts supporting Armenian.
In the ArmSCII-8A table, code value 20 is the regular SPACE character, and code value DC is the eternity sign, which had no designated code point in Unicode when the standard was published (Unicode 7.0 later added U+058D, the right-facing Armenian eternity sign). Some mappings claim that it has a code point of U+0530. This is incorrect, as that code point has not been allocated.
Code values 00–1F, 7F, and B0–DB are not assigned to characters by AST 34.002, though they may be the same as those used in a legacy DOS/OEM code page 437 (box drawing characters) or Macintosh Roman.
Note that the characters encoded at code values DD and FE (Armenian hyphen and apostrophe) may not be visible with all fonts supporting Armenian.
| | x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | xA | xB | xC | xD | xE | xF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0x | not used | | | | | | | | | | | | | | | |
| 1x | not used | | | | | | | | | | | | | | | |
| 2x | SP | Ա | Բ | Գ | Դ | Ե | Զ | Է | Ը | Թ | Ժ | Ի | Լ | Խ | Ծ | Կ |
| 3x | Հ | Ձ | Ղ | Ճ | Մ | Յ | Ն | Շ | Ո | Չ | Պ | Ջ | Ռ | Ս | Վ | Տ |
| 4x | Ր | Ց | Ւ | Փ | Ք | Օ | Ֆ | | ՝ | ՚ | ֊ | | ։ | , | ՞ | ՟ |
| 5x | | ա | բ | գ | դ | ե | զ | է | ը | թ | ժ | ի | լ | խ | ծ | կ |
| 6x | հ | ձ | ղ | ճ | մ | յ | ն | շ | ո | չ | պ | ջ | ռ | ս | վ | տ |
| 7x | ր | ց | ւ | փ | ք | օ | ֆ | | ― | ‐ | ″ | | · | ՛ | ՜ | |
The encodings defined in the ArmSCII standard
Very few systems support these encodings; Windows, for example, does not. It is usually better to use Unicode for proper interchange of Armenian text in web browsers and email, since most modern computers do not support ArmSCII by default.
The following three main variants are defined:
- ArmSCII-7, defined in AST 34.005, is a 7-bit encoding that does not contain Latin characters.
- ArmSCII-8, defined in AST 34.002, is an 8-bit encoding and a superset of ASCII.
- ArmSCII-8A, defined in AST 34.002, is an alternate 8-bit encoding and also a superset of ASCII.
Note that each ArmSCII encoding also has several minor variants, depending on the revision of the related Armenian standard, which may change the exact mapping and usage of a few punctuation characters and symbols. The standard was not made official before 1997 and was defined only informally before that, which has caused confusion; the mappings described below are best practices according to the latest (1997) revision of the Armenian standard.
None of the ArmSCII encodings has reached international approval (unlike the ISO 10585 standard, and despite the criticism sent by the official Armenian standards body to ISO/DIS JTC 1/SC 2/WG 2, working on single-byte coded character sets), because all international efforts since then have gone into the UCS (Unicode and ISO 10646).
ArmSCII-8 is intended for use on Unix and Windows systems, and for information interchange on the WWW and by email. However, Microsoft wanted users to use Unicode rather than introduce a plethora of new code pages, so ArmSCII-8 is not supported natively on Windows. It consists simply of ArmSCII-7 remapped into the higher range, above the standard US-ASCII range.
ArmSCII-8A is intended for use on DOS and Mac systems. It is a rearrangement of ArmSCII-8 that works with existing DOS and Mac code, which reserves a range of code values for characters intended not for text but for presentation layout, using modified fonts. It is nevertheless considered a "hack" of the code pages over which it is applied, as neither DOS (nor Windows, in the "OEM" compatibility code page used by the text-only console) nor Mac OS has ever supported this encoding natively, notably in their filesystems (though this is also true of the now-deprecated ISO 10585 standard). Moreover, this encoding cannot map all the punctuation characters normally needed for Armenian, so the missing characters must be approximated using fallbacks to ASCII punctuation; some Armenian fonts may display this ASCII punctuation using the rendering intended for the Armenian characters mapped to it by these fallbacks.
ArmSCII-7
AST 34.005:1997 (ArmSCII-7) 7-bit coded character set for Armenian.

| | x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | xA | xB | xC | xD | xE | xF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0x | unused | | | | | | | | | | | | | | | |
| 1x | unused | | | | | | | | | | | | | | | |
| 2x | SP | (eternity sign) | և / § | ։ | ) | ( | » | « | ― | · | ՝ | , | ‐ | ֊ | ... | ՜ |
| 3x | ՛ | ՞ | Ա | ա | Բ | բ | Գ | գ | Դ | դ | Ե | ե | Զ | զ | Է | է |
| 4x | Ը | ը | Թ | թ | Ժ | ժ | Ի | ի | Լ | լ | Խ | խ | Ծ | ծ | Կ | կ |
| 5x | Հ | հ | Ձ | ձ | Ղ | ղ | Ճ | ճ | Մ | մ | Յ | յ | Ն | ն | Շ | շ |
| 6x | Ո | ո | Չ | չ | Պ | պ | Ջ | ջ | Ռ | ռ | Ս | ս | Վ | վ | Տ | տ |
| 7x | Ր | ր | Ց | ց | Ւ | ւ | Փ | փ | Ք | ք | Օ | օ | Ֆ | ֆ | ՚ | |
In this table, code value 21 is the eternity sign, which had no designated code point in Unicode when the standard was published (Unicode 7.0 later added U+058D, the right-facing Armenian eternity sign). Some mappings claim that it has a code point of U+0530. This is incorrect, as that code point has not been allocated.
Code value 20 is the regular SPACE character. Code values 00–1F and 7F are not assigned to characters by AST 34.005, though they may be the same as the ASCII control characters that are located in those positions.
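The letter block of the table above is regular: capital and small forms alternate from code value 32 (Ա) to 7D (ֆ), in alphabetical order. The following is a minimal sketch of how that block could be converted to Unicode, assuming the Unicode Armenian block keeps the same alphabetical order starting at U+0531 for capitals and U+0561 for small letters. The function name is hypothetical, and punctuation is deliberately left out because its mapping varies between revisions of the standard.

```python
FIRST_LETTER = 0x32  # capital Ա in the ArmSCII-7 table
LAST_LETTER = 0x7D   # small ֆ in the ArmSCII-7 table


def armscii7_letter_to_unicode(code: int) -> str:
    """Convert one ArmSCII-7 letter code (0x32-0x7D) to a Unicode character.

    Hypothetical helper: it covers only the letter block, not punctuation.
    """
    if not FIRST_LETTER <= code <= LAST_LETTER:
        raise ValueError(f"0x{code:02X} is outside the ArmSCII-7 letter block")
    # Even offsets are capitals, odd offsets are the matching small letters.
    index, is_small = divmod(code - FIRST_LETTER, 2)
    base = 0x0561 if is_small else 0x0531  # assumed Unicode start points
    return chr(base + index)
```

Codes outside the letter block raise an error rather than guessing, since their meaning depends on the variant in use.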
Code value 22 is used to encode the Armenian ligature ew (և). In some variants it encodes the section sign (§) instead. Because software and fonts render this code value differently depending on the version of ArmSCII-7 they assume, it is strongly suggested to encode the ligature as the pair of ordinary Armenian small letters ech (yech) and yiwn (vyun), and to let the renderer generate the ligature.
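When text has already been converted to Unicode, the same advice can be applied by expanding the ligature into its two letters. This is a small sketch only: the function name is hypothetical, and it assumes the ligature is stored as U+0587 and that ech and yiwn are U+0565 and U+0582.

```python
EW_LIGATURE = "\u0587"     # և, Armenian small ligature ech yiwn
ECH_YIWN = "\u0565\u0582"  # ե followed by ւ


def expand_ew(text: str) -> str:
    """Replace every ew ligature with the ech + yiwn letter pair."""
    return text.replace(EW_LIGATURE, ECH_YIWN)
```

Text that contains no ligature is returned unchanged, so the function can be applied to any string without first checking it.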
Code value 7F is sometimes used as a substitute for the non-breaking space.
Note that the characters encoded at code values 2D and 7E (Armenian hyphen and apostrophe) may not be visible with all fonts supporting Armenian.
ArmSCII-8 (below) simply remaps this table to higher codes by a fixed offset.
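The offset remapping can be sketched as follows. The offset of 0x80 is an inference from the code values quoted in this article (21 and A1 for the eternity sign, 2D and AD for the hyphen, 7E and FE for the apostrophe), not a value quoted from the standard, and the function name is hypothetical. The sketch leaves SPACE (20), the control range, and 7F untouched, which is likewise an assumption.

```python
OFFSET = 0x80  # assumed distance between ArmSCII-7 and ArmSCII-8 codes


def armscii7_to_armscii8(data: bytes) -> bytes:
    """Shift ArmSCII-7 graphic codes (0x21-0x7E) into the upper half."""
    out = bytearray()
    for byte in data:
        if 0x21 <= byte <= 0x7E:
            out.append(byte + OFFSET)  # e.g. 0x32 becomes 0xB2
        else:
            out.append(byte)  # SPACE, controls and 0x7F stay as they are
    return bytes(out)
```

Because ArmSCII-8 keeps ASCII in the lower half, the converted bytes can sit alongside Latin text, which ArmSCII-7 cannot represent.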
ArmSCII-8
AST 34.002:1997 (ArmSCII-8) 8-bit coded character set for Armenian.

| | x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | xA | xB | xC | xD | xE | xF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0x | unused | | | | | | | | | | | | | | | |
| 1x | unused | | | | | | | | | | | | | | | |
| 2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
| 6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 7x | p | q | r | s | t | u | v | w | x | y | z |
|
Vertical bar The vertical bar is a character with various uses in mathematics, where it can be used to represent absolute value, among others; in computing and programming and in general typography, as a divider not unlike the interpunct... |
|
~ Tilde The tilde is a grapheme with several uses. The name of the character comes from Portuguese and Spanish, from the Latin titulus meaning "title" or "superscription", though the term "tilde" has evolved and now has a different meaning in linguistics.... |
|
8x | unused | |||||||||||||||
9x | ||||||||||||||||
Ax | NBSP | (eternity sign) | և / § | ։ | ) | ( | » | « | ― | · | ՝ | , | ‐ | ֊ | ... | ՜ |
Bx | ՛ | ՞ | Ա | ա | Բ | բ | Գ | գ | Դ | դ | Ե | ե | Զ | զ | Է | է |
Cx | Ը | ը | Թ | թ | Ժ | ժ | Ի | ի | Լ | լ | Խ | խ | Ծ | ծ | Կ | կ |
Dx | Հ | հ | Ձ | ձ | Ղ | ղ | Ճ | ճ | Մ | մ | Յ | յ | Ն | ն | Շ | շ |
Ex | Ո | ո | Չ | չ | Պ | պ | Ջ | ջ | Ռ | ռ | Ս | ս | Վ | վ | Տ | տ |
Fx | Ր | ր | Ց | ց | Ւ | ւ | Փ | փ | Ք | ք | Օ | օ | Ֆ | ֆ | ՚ |
In this table, code value 20 is reserved for the regular SPACE character, code value A0 is reserved for the non-breaking space, and code value A1 is assigned to the eternity sign. The eternity sign had no code point in Unicode until version 7.0 added U+058D (right-facing) and U+058E (left-facing). Some older mappings place it at U+0530, which is incorrect because that code point has never been allocated.
Code values 00–1F and 7F–9F are not assigned to characters by AST 34.002, though they may be used for the same control characters that ISO-8859-1 places in those positions.
The code value A2 is used to encode the Armenian ligature ew (և). In some variants it encodes the section sign (§) instead. Some Armenian fonts display this ligature at the position of the ASCII ampersand symbol, but it is strongly suggested to encode the ligature using the two standard Armenian small letters that compose it.
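The recommendation to spell the ligature with its two component letters can be checked against Unicode's compatibility decomposition. The following is a minimal Python sketch that uses only the standard `unicodedata` module; the helper name `expand_ew` is hypothetical and not part of any standard.

```python
import unicodedata

EW_LIGATURE = "\u0587"  # ARMENIAN SMALL LIGATURE ECH YIWN (և)

def expand_ew(text: str) -> str:
    """Replace the ew ligature with the two small letters that compose it."""
    # Only the ligature itself is decomposed; all other characters are kept.
    letters = unicodedata.normalize("NFKD", EW_LIGATURE)
    return text.replace(EW_LIGATURE, letters)

# Show the code points that replace the ligature.
print([f"U+{ord(c):04X}" for c in expand_ew(EW_LIGATURE)])
```

Restricting the replacement to the ligature avoids the side effects that a blanket compatibility normalization could have on other characters in the text.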
The code value FF may hold the Armenian small-letter modifier apostrophe, which has no mapping in Unicode and is shown here with the ASCII apostrophe so that it renders correctly with Unicode fonts. The Armenian apostrophe encoded at FE occurs only after a capital Armenian letter, and the small-letter form occurs only after a small Armenian letter, so it is suggested that the small-letter modifier also be represented by code value FE, with ligature control changing its position. As a result, most implementations do not encode anything at code value FF.
This standard is the only one that makes an apparent distinction for the "mirrored" Armenian parentheses, because it was created by simply remapping the ArmSCII-7 standard. In practice, many documents do not treat this distinction as meaningful and use the usual ASCII parentheses instead of the ArmSCII-7-based mirrored ones, because Armenian keyboards and editors using ArmSCII-8 generated the lower ASCII codes (whose usage is simply swapped in classical Armenian). Likewise, the duplicate of the ASCII comma at code value AB results from the simple remapping of ArmSCII-7, so it does not differ from the ASCII comma that most ArmSCII-8 documents use.
Note that the characters encoded at code values AD and FE (Armenian hyphen and apostrophe) may not be visible with all fonts supporting Armenian.
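The layout of the chart above lends itself to a very small decoder: the letters in B2–FD alternate between capital and small forms in alphabetical order, so their Unicode code points can be computed instead of listed. The sketch below is an illustration under assumptions, not a reference implementation: the function name `decode_armscii8` is hypothetical, code values A1 (eternity sign) and A2 (which differs between revisions) are deliberately left unmapped, and A9 follows the value given in the code-mapping table of this article.

```python
# Punctuation of the upper half, as shown in the chart (assumption: A1 and A2
# are omitted because their mapping depends on the revision or Unicode version).
ARMSCII8_PUNCT = {
    0xA0: "\u00A0", 0xA3: "\u0589", 0xA4: ")", 0xA5: "(",
    0xA6: "\u00BB", 0xA7: "\u00AB", 0xA8: "\u2015", 0xA9: "\u2024",
    0xAA: "\u055D", 0xAB: ",", 0xAC: "\u2010", 0xAD: "\u058A",
    0xAE: "\u2026", 0xAF: "\u055C", 0xB0: "\u055B", 0xB1: "\u055E",
    0xFE: "\u055A",
}

def decode_armscii8(data: bytes) -> str:
    """Decode ArmSCII-8 bytes; unmapped bytes become U+FFFD."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))  # lower half is plain ASCII
        elif 0xB2 <= b <= 0xFD:
            # Even offsets are capitals (from U+0531), odd offsets are
            # small letters (from U+0561).
            index, is_small = divmod(b - 0xB2, 2)
            out.append(chr((0x0561 if is_small else 0x0531) + index))
        else:
            out.append(ARMSCII8_PUNCT.get(b, "\uFFFD"))
    return "".join(out)

print(decode_armscii8(bytes([0xB2, 0xB3, 0xA3])))
```

Returning U+FFFD for unmapped bytes keeps the ambiguous positions visible rather than silently guessing a revision.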
ArmSCII-8A
AST 34.001:1997 (ArmSCII-8A) 8-bit coded character set for Armenian. |
||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | xA | xB | xC | xD | xE | xF | |
0x | unused | |||||||||||||||
1x | ||||||||||||||||
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | |
8x | Ա | ա | Բ | բ | Գ | գ | Դ | դ | Ե | ե | Զ | զ | Է | է | Ը | ը |
9x | Թ | թ | Ժ | ժ | Ի | ի | Լ | լ | Խ | խ | Ծ | ծ | Կ | կ | Հ | հ |
Ax | Ձ | ձ | Ղ | ղ | Ճ | ճ | Մ | մ | Յ | յ | Ն | ն | Շ | շ | « | » |
Bx | unused | |||||||||||||||
Cx | ||||||||||||||||
Dx | unused | | | | | | | | | | | | (eternity sign) | ֊ | ... | ՞ |
Ex | Ո | ո | Չ | չ | Պ | պ | Ջ | ջ | Ռ | ռ | Ս | ս | Վ | վ | Տ | տ |
Fx | Ր | ր | Ց | ց | Ւ | ւ | Փ | փ | Ք | ք | Օ | օ | Ֆ | ֆ | ՚ | NBSP |
In this table, code value 20 is the regular SPACE character, and code value DC is the eternity sign. The eternity sign had no code point in Unicode until version 7.0 added U+058D (right-facing) and U+058E (left-facing). Some older mappings place it at U+0530, which is incorrect because that code point has never been allocated.
Code values 00–1F, 7F, and B0–DB are not assigned to characters by AST 34.002, though they may be the same as those used in a legacy DOS/OEM codepage 437 (box drawing characters) or Macintosh Roman.
Note that the characters encoded at code values DD and FE (Armenian hyphen and apostrophe) may not be visible with all fonts supporting Armenian.
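Comparing the two 8-bit charts, the letters keep the same alternating capital/small order in both; ArmSCII-8A only splits them over two ranges, 80–AD and E0–FD, where ArmSCII-8 uses the single range B2–FD. A minimal Python sketch of that relationship follows. The function name `armscii8_letter_to_8a` is hypothetical, and the sketch covers letters only, not punctuation.

```python
def armscii8_letter_to_8a(byte: int) -> int:
    """Map an ArmSCII-8 letter byte (B2–FD) to its ArmSCII-8A position."""
    if not 0xB2 <= byte <= 0xFD:
        raise ValueError(f"not an ArmSCII-8 letter byte: {byte:#04x}")
    offset = byte - 0xB2
    if offset < 46:
        # First 23 letter pairs (Ա through շ) sit at 80–AD in ArmSCII-8A.
        return 0x80 + offset
    # Remaining 15 letter pairs (Ո through ֆ) sit at E0–FD.
    return 0xE0 + (offset - 46)

# The two letters of the ew ligature: BB,F5 in ArmSCII-8 become 89,F5.
print([f"{armscii8_letter_to_8a(b):02X}" for b in (0xBB, 0xF5)])
```

One consequence visible in the code is that the letters from Ո onward occupy the same code values in both encodings.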
ISO 10585:1996
7-bit coded character set for Armenian. |
||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | xA | xB | xC | xD | xE | xF | |
0x | unused | |||||||||||||||
1x | ||||||||||||||||
2x | SP | Ա | Բ | Գ | Դ | Ե | Զ | Է | Ը | Թ | Ժ | Ի | Լ | Խ | Ծ | Կ |
3x | Հ | Ձ | Ղ | Ճ | Մ | Յ | Ն | Շ | Ո | Չ | Պ | Ջ | Ռ | Ս | Վ | Տ |
4x | Ր | Ց | Ւ | Փ | Ք | Օ | Ֆ | unused | ՝ | ՚ | ֊ | unused | ։ | , | ՞ | ՟ |
5x | unused | ա | բ | գ | դ | ե | զ | է | ը | թ | ժ | ի | լ | խ | ծ | կ |
6x | հ | ձ | ղ | ճ | մ | յ | ն | շ | ո | չ | պ | ջ | ռ | ս | վ | տ |
7x | ր | ց | ւ | փ | ք | օ | ֆ | unused | ― | ‐ | ″ | unused | · | ՛ | ՜ | unused |
For comparison, this is the 7-bit encoding defined by the international standard ISO/IEC 10585, which was in use before the revision of the Armenian standard AST 34.002:1997 (ArmSCII-8).
In this standard (as well as in ISO/IEC 10646 and Unicode), only one Armenian apostrophe modifier letter is encoded, at code value 49, whereas Armenian uses two modifier letter apostrophes that are cased. U+055A represents the capital apostrophe but is not treated as dual-cased in Unicode or in ISO 10585; the small-letter apostrophe is absent and is generally represented by the ASCII apostrophe U+0027 in Unicode documents.
The left half-ring punctuation (a modifier letter) and the eternity symbol are also missing, and only one double quotation mark (U+2033) is encoded, at code value 7A, instead of the pair of guillemets found in the three ArmSCII variants.
However, this standard maps the Armenian full stop (whose glyph looks very close to the ASCII colon) to code value 4C and the Armenian abbreviation mark (which looks very similar to an angular grave accent) to code value 4F. The abbreviation mark is missing from all ArmSCII code charts, and the full stop is missing from ArmSCII-8A.
Note that the characters encoded at code values 49 and 4A (Armenian apostrophe and hyphen) may not be visible with all fonts supporting Armenian.
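Because the capital letters occupy 21–46 and the small letters 51–76 in the chart above, every letter of this 7-bit encoding sits at a constant distance of 0x0510 from its Unicode code point (0x21 maps to U+0531, 0x51 to U+0561). The Python sketch below illustrates this; the name `decode_iso10585` is hypothetical, the punctuation values are taken from the chart and the code-mapping table of this article, and unused positions are returned as U+FFFD by assumption.

```python
# Punctuation positions as shown in the chart for ISO 10585.
ISO10585_PUNCT = {
    0x20: " ", 0x48: "\u055D", 0x49: "\u055A", 0x4A: "\u058A",
    0x4C: "\u0589", 0x4D: ",", 0x4E: "\u055E", 0x4F: "\u055F",
    0x78: "\u2015", 0x79: "\u2010", 0x7A: "\u2033", 0x7C: "\u2024",
    0x7D: "\u055B", 0x7E: "\u055C",
}

LETTER_OFFSET = 0x0510  # U+0531 - 0x21, and equally U+0561 - 0x51

def decode_iso10585(data: bytes) -> str:
    """Decode ISO 10585 bytes; unused positions become U+FFFD."""
    out = []
    for b in data:
        if 0x21 <= b <= 0x46 or 0x51 <= b <= 0x76:
            out.append(chr(b + LETTER_OFFSET))  # capital or small letter
        else:
            out.append(ISO10585_PUNCT.get(b, "\uFFFD"))
    return "".join(out)

print(decode_iso10585(bytes([0x21, 0x51, 0x4C])))
```

The single offset reflects how closely the letter portion of the Unicode Armenian block follows this 7-bit layout.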
ISO/IEC 10646-1 and Unicode
For comparison, these are the Unicode code charts for Armenian. The encoding, in place since Unicode 1.1 (except for the Armenian hyphen U+058A, which was added in Unicode 3.0), was based on the earlier ISO 10585 7-bit international standard rather than on ArmSCII, which lacked a dozen characters present in ISO 10585. However, non-letters were reorganized by type, and some extensions were added for rare Armenian characters that were missing from all earlier 7-bit and 8-bit standards.
Capital letters are encoded in the first half of the block (terminated by modifier letters).
Lowercase letters are encoded in the second half of the block (terminated by Armenian punctuation signs).
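The split into a capital half and a lowercase half means that each capital letter (U+0531 to U+0556) and its small counterpart are exactly 0x30 code points apart. A short Python sketch of that arithmetic, with a hypothetical helper name `to_small`:

```python
CASE_OFFSET = 0x30  # U+0561 (small ayb) minus U+0531 (capital ayb)

def to_small(capital: str) -> str:
    """Return the small form of an Armenian capital letter by offset."""
    cp = ord(capital)
    if not 0x0531 <= cp <= 0x0556:
        raise ValueError("expected an Armenian capital letter")
    return chr(cp + CASE_OFFSET)

# The offset rule agrees with Python's built-in case mapping.
print(to_small("\u0531") == "\u0531".lower())
```

In practice `str.lower()` is the better tool; the explicit offset is shown only to make the block layout concrete.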
Unlike the ArmSCII encodings, this encoding is stable and portable across systems, and contains all characters needed for Armenian. The one long-standing exception, the Armenian eternity sign, was added in Unicode 7.0 as U+058D and U+058E. Some Unicode-encoded fonts for Armenian map the eternity sign to code point U+0530; this is incorrect, as that code point has not been allocated.
However, no distinction is kept for the Armenian (mirrored) parentheses, so the standard ASCII/Unicode punctuation must be used according to its usual rendering. The left half-ring mark (modifier letter) is encoded here, and some other marks are unified with other scripts (notably the quotation marks, middle dot and dashes).
Note that the characters encoded at code points U+055A and U+058A (Armenian apostrophe and hyphen, as in the charts for ArmSCII and ISO 10585), as well as U+0559 (the modifier mark for numerics, added specifically in ISO 10646-1 and Unicode), may not be visible with all fonts supporting Armenian.
The Armenian eternity symbol (present only in ArmSCII among the legacy encodings) was not encoded in ISO 10646-1 and Unicode until Unicode 7.0 added U+058D and U+058E. Some older Unicode-encoded fonts map the symbol to U+0530, but this is non-conforming because that code point is not assigned.
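The statement that U+0530 is not an allocated code point can be checked against the Unicode Character Database bundled with Python. This is a minimal sketch: the helper name `is_assigned` is hypothetical, and the result reflects whichever Unicode version the running Python interpreter ships with.

```python
import unicodedata

def is_assigned(code_point: int) -> bool:
    """True if the code point is assigned in this interpreter's Unicode data."""
    # General category "Cn" means "other, not assigned".
    return unicodedata.category(chr(code_point)) != "Cn"

# U+0530 opens the Armenian block but is unassigned; U+0531 is capital ayb.
print(is_assigned(0x0530), is_assigned(0x0531))
```

A font that places a glyph at an unassigned code point produces text that other systems have no way to interpret consistently.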
Code mappings and classification
Note that some transcodings are shown below between parentheses. They are only approximation fallbacks but do not map exactly the intended character.Subset | Character | Armenian description or usage | Short name | Encodings | Notes | ||||
---|---|---|---|---|---|---|---|---|---|
ArmSCII-7 | ArmSCII-8 | ArmSCII-8A | ISO 10585 | Unicode ISO/IEC 10646 | |||||
General purpose | space | space | 20 | 20 | 20 | 20 | 0020 | same as ASCII and Unicode | |
non-breaking space | nbsp | (20) | A0 | FF | (20) | 00A0 | missing in ArmSCII-7 and ISO 10585 | ||
Armenian symbols | eternity sign | armeternity | 21 | A1 | DC | — | — | missing in Unicode | |
և | ligature ech yiwn (ew) | armew | (3B,75) | (26) (or BB,F5) | (26) (or 89,F5) | (55,72) | 0587 (or 0565,0582) | specific to Armenian : compatibility ligature of Armenian ech (yech) and yiwn (vyun) small letters, used as a symbol (similar to ampersand symbol in ASCII) | |
§ | section sign | armsection | 22 | A2 | — | — | 00A7 | from ISO 8859; missing in all ArmSCII variants | |
Armenian punctuation | ։ | full stop (vertsaket) | armfullstop | 23 | A3 | (3A) | 4C | 0589 | specific to Armenian: looks much like the ASCII colon, but has a distinct usage; missing in ArmSCII-8A (approximated by ASCII colon) |
Armenian punctuation | ) | right parenthesis | armparenright | 24 | A4 | 29 | (79) | 0029 | from ASCII and Unicode, but with a different name and usage; missing in ISO 10585 (suggested substitution uses dashes) |
Armenian punctuation | ( | left parenthesis | armparenleft | 25 | A5 | 28 | (79) | 0028 | from ASCII and Unicode, but with a different name and usage; missing in ISO 10585 (suggested substitution uses dashes) |
Armenian punctuation | » | right quotation mark | armquotright | 26 | A6 | AF | (7A) | 00BB | from ISO 8859 and Unicode, but with a different name and usage |
Armenian punctuation | « | left quotation mark | armquotleft | 27 | A7 | AE | (7A) | 00AB | from ISO 8859 and Unicode, but with a different name and usage |
Armenian punctuation | ″ | quotation mark | — | — | (22) | (22) | 7A | 2033 | used for either left or right quotation mark in ISO 10585; missing in ArmSCII-8/8A (approximated by ASCII double quotation mark) |
Armenian punctuation | ― | em-dash | armemdash | 28 | A8 | (5F) | 78 | 2015 | from ISO 8859; missing in ArmSCII-8A (approximated by ASCII underscore) |
Armenian punctuation | . | middle dot (mijaket) | armdot | 29 | A9 | (2E) | 7C | 2024 | sometimes similar to the ASCII full stop, but with a different usage in Armenian, where the middle dot is preferred; missing in ArmSCII-8A (approximated by ASCII full stop) |
Armenian punctuation | ՝ | separation mark (but) | armsep | 2A | AA | (60) | 48 | 055D | usage specific to Armenian: used as a comma; = bowt; missing in ArmSCII-8A (approximated by ASCII backquote) |
Armenian punctuation | , | comma | armcomma | 2B | AB | 2C | 4D | 002C | same as ASCII and Unicode comma |
Armenian punctuation | ‐ | dash | armendash | 2C | AC | (2D) | 79 | 2010 | similar to the short variant of the ASCII and Unicode hyphen-minus (shorter than the general-purpose minus sign used in ASCII); missing in ArmSCII-8A (approximated by ASCII hyphen-minus) |
Armenian modifier letters | ֊ | hyphen (yentamna) | armyentamna | 2D | AD | DD | 4A | 058A | specific to Armenian: a modifier letter that modifies another normal Armenian letter (possibly with combining punctuation between them) |
Armenian modifier letters | ... | ellipsis | armellipsis | 2E | AE | DE | (7C,7C,7C) | 2026 | from ISO 8859, but not punctuation: a modifier letter that follows and modifies another normal Armenian letter (possibly with combining punctuation between them) |
Armenian modifier letters | ՙ | numeric mark (left half-ring) | armnum | — | — | — | — | 0559 | specific to Armenian: a modifier letter that modifies another normal Armenian letter (possibly with combining punctuation between them); missing in all ArmSCII variants |
Armenian modifier letters | ՚ | apostrophe (right half-ring) | armapostrophe | 7E | FE | FE | 49 | 055A | specific to Armenian: a modifier letter that modifies another normal Armenian letter (possibly with combining punctuation between them) |
Armenian combining punctuation | ՜ | exclamation mark (amanak) | armexclam | 2F | AF | (7E) | 7E | 055C | specific to Armenian: these diacritics encode punctuation but may appear on top of a letter in the middle of any word (they may be ignored in searches); Unicode handles them as modifier letters, although they are normally non-spacing; = batsaganchakan nshan; missing in ArmSCII-8A (approximated by ASCII tilde) |
Armenian combining punctuation | ՛ | emphasis mark (shesht) | armaccent | 30 | B0 | (27) | 7D | 055B | specific to Armenian: these diacritics encode punctuation but may appear on top of a letter in the middle of any word (they may be ignored in searches); Unicode handles them as modifier letters, although they are normally non-spacing; missing in ArmSCII-8A (approximated by ASCII single quote) |
Armenian combining punctuation | ՞ | question mark (paruyk) | armquestion | 31 | B1 | DF | 4E | 055E | specific to Armenian: these diacritics encode punctuation but may appear on top of a letter in the middle of any word (they may be ignored in searches); Unicode handles them as modifier letters, although they are normally non-spacing; = hartsakan nshan |
Armenian combining punctuation | ՟ | abbreviation mark (patiw) | armabbrev | — | — | — | 4F | 055F | specific to Armenian: these diacritics encode punctuation but may appear on top of a letter in the middle of any word (they may be ignored in searches); Unicode handles them as modifier letters, although they are normally non-spacing |
Armenian capital letters | Ա | Ayb | Armayb | 32 | B2 | 80 | 21 | 0531 | |
Armenian capital letters | Բ | Ben | Armben | 34 | B4 | 82 | 22 | 0532 | |
Armenian capital letters | Գ | Gim | Armgim | 36 | B6 | 84 | 23 | 0533 | |
Armenian capital letters | Դ | Da | Armda | 38 | B8 | 86 | 24 | 0534 | |
Armenian capital letters | Ե | Ech (Yech) | Armyech | 3A | BA | 88 | 25 | 0535 | |
Armenian capital letters | Զ | Za | Armza | 3C | BC | 8A | 26 | 0536 | |
Armenian capital letters | Է | Eh (E) | Arme | 3E | BE | 8C | 27 | 0537 | |
Armenian capital letters | Ը | Et (At) | Armat | 40 | C0 | 8E | 28 | 0538 | |
Armenian capital letters | Թ | To | Armto | 42 | C2 | 90 | 29 | 0539 | |
Armenian capital letters | Ժ | Zhe | Armzhe | 44 | C4 | 92 | 2A | 053A | |
Armenian capital letters | Ի | Ini | Armini | 46 | C6 | 94 | 2B | 053B | |
Armenian capital letters | Լ | Liwn (Lyun) | Armlyun | 48 | C8 | 96 | 2C | 053C | |
Armenian capital letters | Խ | Xeh (Khe) | Armkhe | 4A | CA | 98 | 2D | 053D | |
Armenian capital letters | Ծ | Ca (Tsa) | Armtsa | 4C | CC | 9A | 2E | 053E | |
Armenian capital letters | Կ | Ken | Armken | 4E | CE | 9C | 2F | 053F | |
Armenian capital letters | Հ | Ho | Armho | 50 | D0 | 9E | 30 | 0540 | |
Armenian capital letters | Ձ | Ja (Dza) | Armdza | 52 | D2 | A0 | 31 | 0541 | |
Armenian capital letters | Ղ | Ghad (Ghat) | Armghat | 54 | D4 | A2 | 32 | 0542 | |
Armenian capital letters | Ճ | Cheh (Tche) | Armtche | 56 | D6 | A4 | 33 | 0543 | |
Armenian capital letters | Մ | Men | Armmen | 58 | D8 | A6 | 34 | 0544 | |
Armenian capital letters | Յ | Yi (Hi) | Armhi | 5A | DA | A8 | 35 | 0545 | |
Armenian capital letters | Ն | Now (Nu) | Armnu | 5C | DC | AA | 36 | 0546 | |
Armenian capital letters | Շ | Sha | Armsha | 5E | DE | AC | 37 | 0547 | |
Armenian capital letters | Ո | Vo | Armvo | 60 | E0 | E0 | 38 | 0548 | |
Armenian capital letters | Չ | Cha | Armcha | 62 | E2 | E2 | 39 | 0549 | |
Armenian capital letters | Պ | Peh (Pe) | Armpe | 64 | E4 | E4 | 3A | 054A | |
Armenian capital letters | Ջ | Jheh (Je) | Armje | 66 | E6 | E6 | 3B | 054B | |
Armenian capital letters | Ռ | Ra | Armra | 68 | E8 | E8 | 3C | 054C | |
Armenian capital letters | Ս | Seh (Se) | Armse | 6A | EA | EA | 3D | 054D | |
Armenian capital letters | Վ | Vew (Vev) | Armvev | 6C | EC | EC | 3E | 054E | |
Armenian capital letters | Տ | Tiwn (Tyun) | Armtyun | 6E | EE | EE | 3F | 054F | |
Armenian capital letters | Ր | Reh (Re) | Armre | 70 | F0 | F0 | 40 | 0550 | |
Armenian capital letters | Ց | Co (Tso) | Armtso | 72 | F2 | F2 | 41 | 0551 | |
Armenian capital letters | Ւ | Yiwn (Vyun) | Armvyun | 74 | F4 | F4 | 42 | 0552 | |
Armenian capital letters | Փ | Piwr (Pyur) | Armpyur | 76 | F6 | F6 | 43 | 0553 | |
Armenian capital letters | Ք | Keh (Ke) | Armke | 78 | F8 | F8 | 44 | 0554 | |
Armenian capital letters | Օ | Oh (O) | Armo | 7A | FA | FA | 45 | 0555 | |
Armenian capital letters | Ֆ | Feh (Fe) | Armfe | 7C | FC | FC | 46 | 0556 | |
Armenian small letters | ա | ayb | armayb | 33 | B3 | 81 | 51 | 0561 | |
Armenian small letters | բ | ben | armben | 35 | B5 | 83 | 52 | 0562 | |
Armenian small letters | գ | gim | armgim | 37 | B7 | 85 | 53 | 0563 | |
Armenian small letters | դ | da | armda | 39 | B9 | 87 | 54 | 0564 | |
Armenian small letters | ե | ech (yech) | armyech | 3B | BB | 89 | 55 | 0565 | |
Armenian small letters | զ | za | armza | 3D | BD | 8B | 56 | 0566 | |
Armenian small letters | է | eh (e) | arme | 3F | BF | 8D | 57 | 0567 | |
Armenian small letters | ը | et (at) | armat | 41 | C1 | 8F | 58 | 0568 | |
Armenian small letters | թ | to | armto | 43 | C3 | 91 | 59 | 0569 | |
Armenian small letters | ժ | zhe | armzhe | 45 | C5 | 93 | 5A | 056A | |
Armenian small letters | ի | ini | armini | 47 | C7 | 95 | 5B | 056B | |
Armenian small letters | լ | liwn (lyun) | armlyun | 49 | C9 | 97 | 5C | 056C | |
Armenian small letters | խ | xeh (khe) | armkhe | 4B | CB | 99 | 5D | 056D | |
Armenian small letters | ծ | ca (tsa) | armtsa | 4D | CD | 9B | 5E | 056E | |
Armenian small letters | կ | ken | armken | 4F | CF | 9D | 5F | 056F | |
Armenian small letters | հ | ho | armho | 51 | D1 | 9F | 60 | 0570 | |
Armenian small letters | ձ | ja (dza) | armdza | 53 | D3 | A1 | 61 | 0571 | |
Armenian small letters | ղ | ghad (ghat) | armghat | 55 | D5 | A3 | 62 | 0572 | |
Armenian small letters | ճ | cheh (tche) | armtche | 57 | D7 | A5 | 63 | 0573 | |
Armenian small letters | մ | men | armmen | 59 | D9 | A7 | 64 | 0574 | |
Armenian small letters | յ | yi (hi) | armhi | 5B | DB | A9 | 65 | 0575 | |
Armenian small letters | ն | now (nu) | armnu | 5D | DD | AB | 66 | 0576 | |
Armenian small letters | շ | sha | armsha | 5F | DF | AD | 67 | 0577 | |
Armenian small letters | ո | vo | armvo | 61 | E1 | E1 | 68 | 0578 | |
Armenian small letters | չ | cha | armcha | 63 | E3 | E3 | 69 | 0579 | |
Armenian small letters | պ | peh (pe) | armpe | 65 | E5 | E5 | 6A | 057A | |
Armenian small letters | ջ | jheh (je) | armje | 67 | E7 | E7 | 6B | 057B | |
Armenian small letters | ռ | ra | armra | 69 | E9 | E9 | 6C | 057C | |
Armenian small letters | ս | seh (se) | armse | 6B | EB | EB | 6D | 057D | |
Armenian small letters | վ | vew (vev) | armvev | 6D | ED | ED | 6E | 057E | |
Armenian small letters | տ | tiwn (tyun) | armtyun | 6F | EF | EF | 6F | 057F | |
Armenian small letters | ր | reh (re) | armre | 71 | F1 | F1 | 70 | 0580 | |
Armenian small letters | ց | co (tso) | armtso | 73 | F3 | F3 | 71 | 0581 | |
Armenian small letters | ւ | yiwn (vyun) | armvyun | 75 | F5 | F5 | 72 | 0582 | |
Armenian small letters | փ | piwr (pyur) | armpyur | 77 | F7 | F7 | 73 | 0583 | |
Armenian small letters | ք | keh (ke) | armke | 79 | F9 | F9 | 74 | 0584 | |
Armenian small letters | օ | oh (o) | armo | 7B | FB | FB | 75 | 0585 | |
Armenian small letters | ֆ | feh (fe) | armfe | 7D | FD | FD | 76 | 0586 | |
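
The letter rows of the table above follow a regular pattern: in the ArmSCII variants capital and small letters alternate in steps of two, while ISO 10585 and Unicode place each case in its own contiguous run. The following Python sketch illustrates that arithmetic. The function names and the encoding labels passed as strings are hypothetical and chosen only for this example; the sketch covers the 38 letters only, and the punctuation and symbol rows would still need an explicit lookup table.

```python
# Illustrative sketch only: the names below are not defined by any standard.
NUM_LETTERS = 38  # ayb (index 0) .. feh (index 37), in alphabetical order


def letter_code(index: int, small: bool, encoding: str) -> int:
    """Return the code of the Armenian letter at `index` in `encoding`."""
    if not 0 <= index < NUM_LETTERS:
        raise ValueError("index must be in 0..37")
    offset = 1 if small else 0  # small letter directly follows its capital
    if encoding == "armscii-7":
        return 0x32 + 2 * index + offset
    if encoding == "armscii-8":
        # ArmSCII-7 shifted into the upper half (+0x80)
        return 0xB2 + 2 * index + offset
    if encoding == "armscii-8a":
        # ayb..sha occupy 0x80..0xAD; from vo onward the codes
        # coincide with ArmSCII-8 (0xE0..0xFD)
        base = 0x80 if index < 23 else 0xB2
        return base + 2 * index + offset
    if encoding == "iso-10585":
        return (0x51 if small else 0x21) + index
    if encoding == "unicode":
        return (0x0561 if small else 0x0531) + index
    raise ValueError(f"unknown encoding label: {encoding}")


def decode_armscii8_letters(data: bytes) -> str:
    """Decode ArmSCII-8 bytes, handling only ASCII and the letter range."""
    out = []
    for b in data:
        if 0xB2 <= b <= 0xFD:
            index, small = divmod(b - 0xB2, 2)
            out.append(chr(letter_code(index, bool(small), "unicode")))
        elif b < 0x80:
            out.append(chr(b))  # ASCII range is unchanged
        else:
            out.append("\ufffd")  # punctuation/symbols are not handled here
    return "".join(out)
```

For example, the bytes D0 B3 DB (capital ho, small ayb, small yi in the ArmSCII-8 column) decode to U+0540 U+0561 U+0575 under this sketch.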
External references
- [ArmSCII] Armenian Standard Code for Information Interchange—Center of Humane Technologies "Armenian Computer", June 1991.
- [AST 34.001-97] Information Technologies—Character Set And Information Encoding: Character Set—State Standardization Committee of the Republic of Armenia, July 1997.
- [ArmSCII Version 2] Armenian Standard Code for Information Interchange, Version 2—ArmSCII Working Group, May 1999.
Related articles
- Armenian alphabet
- Armenian language
- Romanization of Armenian (including ISO 9985 standard)
- Traditional Armenian orthography
- Reformed Armenian orthography
- Armenian calendar
- Category: ISO standards (646, 9985, 10585 and 10646-1)