Arial Unicode MS
In digital typography, the TrueType font Arial Unicode MS is an extended version of the font Arial. Compared to Arial, it has a greater line height, omits kerning pairs and adds enough glyphs to cover a large subset of Unicode 2.1, thus supporting most Microsoft code pages but also requiring much more storage space (22 megabytes). It also adds Ideographic layout tables, but unlike Arial, it mandates no smoothing in the 14–18 point range, and contains Roman (upright) glyphs only; there is no oblique (italic) version. Arial Unicode MS is normally distributed with Microsoft Office, but it is also bundled with Mac OS X v10.5 and later. It may also be purchased separately (as Arial Unicode) from Ascender Corporation, which licenses the font from Microsoft.
When rendered with the same engine and without making adjustments for the different font metrics, the glyphs that appear in both Arial and Arial Unicode MS appear to be slightly wider, and thus rounder, in Arial Unicode MS. Horizontal text may also appear to have more inter-line spacing in Arial Unicode MS. This is due to larger bounding boxes (Arial Unicode MS needs more room for some of its extended glyphs) and the limitations of renderers, not changes in the glyph shapes. The lack of kerning pairs in Arial Unicode MS may also affect inter-glyph spacing in some renderers (for example the Adobe Flash Player).
Arial Unicode MS also includes Hebrew glyphs different from the Hebrew glyphs found in Arial. They are based on the shapes of the Hebrew glyphs in Tahoma, but are adjusted to the weight, proportions and style of Arial.
History and availability
Arial was designed by Robin Nicholas and Patricia Saunders in 1982 and was released as a TrueType font in 1990. From 1993 to 1999, it was extended as Arial Unicode MS (with its first release as a TrueType font in 1998) by the following members of Monotype Typography's Monotype Type Drawing Office, under contract to Microsoft: Brian Allen, Evert Bloemsma, Jelle Bosma, Joshua Hadley, Wallace Ho, Kamal Mansour, Steve Matteson, and Thomas Rickner.
From mid-2001 through mid-2002, Arial Unicode MS was also available as a separate download for licensed users of the standalone version of Microsoft Publisher 2000 SR-1, which did not ship with the font. The freely downloadable version was withdrawn after Microsoft Publisher 2002, which included the font, began shipping. The withdrawal coincided with the withdrawal of the free downloads of Microsoft's "Core fonts for the Web". Numerous companies, organizations, educational establishments and even governments had been directing users to the download without mentioning the need for a valid Publisher or Office license or any Microsoft operating system.
Monotype Imaging still owns the Arial and Arial Unicode MS trademarks, but Microsoft once retained exclusive licensing rights to the fonts.
On April 11, 2005, Ascender Corporation announced it had entered an agreement with Microsoft which enables Ascender to distribute Microsoft fonts, including the Windows Core Fonts, the Microsoft Web Fonts and the many multilingual fonts currently supplied by Microsoft. Ascender's release of this font, called Arial Unicode, is sold for approximately $99 per 5 users.
The font is also apparently licensed to Apple, which announced on October 16, 2007 that its flagship operating system, Mac OS X v10.5 ("Leopard"), would be bundled with Arial Unicode. Leopard also ships with several other previously Microsoft-only fonts, including Microsoft Sans Serif, Tahoma and Wingdings.
Monotype Imaging currently also licenses Arial Unicode on its own. Monotype also bundled it as part of the iPhone Compatibility Font Set.
Versions
Version 0.84 was supplied with Microsoft Office 2000 and the standalone versions of that suite's applications, except Publisher 2000 SR-1. It includes 51,180 glyphs (38,911 characters), supports 32 code pages, and contains Latin and Han Ideographic OpenType layout tables. The code pages supported are 1250 (Latin 2: East Europe), 1251 (Cyrillic), 1252 (Latin 1), 1253 (Greek), 1254 (Turkish), 1255 (Hebrew), 1256 (Arabic), 1257 (Windows Baltic), 1258 (Vietnamese), 437 (US), 708 (Arabic; ASMO 708), 737 (Greek), 775 (MS-DOS Baltic), 850 (WE/Latin 1), 852 (Latin 2), 855 (IBM Cyrillic; primarily Russian), 857 (MS-DOS IBM Turkish), 860 (MS-DOS Portuguese), 861 (MS-DOS Icelandic), 862 (Hebrew), 863 (MS-DOS Canadian French), 864 (Arabic), 865 (MS-DOS Nordic), 866 (MS-DOS Russian), 869 (IBM Greek), 874 (Thai), 932 (JIS/Japan), 936 (Chinese: Simplified), 949 (Korean Wansung), 950 (Chinese: Traditional), "Macintosh Character Set" (US Roman), and "Windows OEM Character Set". It covers all code points containing non-control characters in Unicode 2.0.
Version 0.86 has the same coverage and support as 0.84.
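As a rough illustration of why supporting many code pages requires broad Unicode glyph coverage, the following Python sketch decodes the same byte under four of the code pages listed for version 0.84. It uses only the standard-library codecs; the helper name decode_in_code_pages is hypothetical, and the example says nothing about how the font itself maps characters to glyphs.

```python
# The same single byte, interpreted under four of the code pages listed above.
# "cp1251", "cp1252", "cp437" and "cp866" are Python's standard codec names.
RAW = bytes([0xE0])
CODE_PAGES = ["cp1251", "cp1252", "cp437", "cp866"]


def decode_in_code_pages(raw, code_pages):
    """Return {codec name: list of 'U+XXXX' code points} for the given bytes.

    Hypothetical helper for illustration only.
    """
    result = {}
    for name in code_pages:
        text = raw.decode(name)
        result[name] = [f"U+{ord(ch):04X}" for ch in text]
    return result


decoded = decode_in_code_pages(RAW, CODE_PAGES)
for name, points in decoded.items():
    # Each code page maps byte 0xE0 to a different Unicode character,
    # so a font serving all of them needs a glyph for each one.
    print(name, points)
```

Each code page assigns byte 0xE0 to a different character (a Cyrillic, a Latin and a Greek letter among them), so a single font that serves all of these code pages needs glyphs for every one of those characters.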
Versions 1.00 and 1.01 were supplied with Microsoft Office 2002 (Microsoft Office XP), Microsoft Office 2003 and the standalone versions of those suites' applications. They include 50,377 glyphs (38,917 characters): Combining Diacritical Marks are reduced to 72, Miscellaneous Technical characters increased to 123, Private Use Area characters increased to 43, and Spacing Modifier Letters reduced to 57. Code page 1361 (Korean Johab) was added, as were layout tables for Devanagari, Gujarati, Gurmukhi, Kana (Hiragana & Katakana), Kannada, and Tamil. The Han Ideographic tables were updated to support vertical writing. These versions cover all code points containing non-control characters in Unicode 2.1.
Bugs
All versions of Arial Unicode MS handle double-width diacritic characters incorrectly, drawing them too far to the left by one character width. According to the Unicode Standard 4.0.0, section 7.7, combining double diacritics go between the two characters to be marked. However, to make text look correct in Arial Unicode MS, the double-width diacritic must be placed after both characters to be marked. This means that it is not possible to make text that renders these characters correctly in both Arial Unicode MS and in other (correctly designed) Unicode fonts. This bug affects the rendering of text written in the International Phonetic Alphabet and in ALA-LC Romanization for non-Latin-script languages. If the displayed font in your browser draws the diacritics correctly, they should appear over the characters: k͠p, k͡p.
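The two orderings described above can be written out as code point sequences. The following Python sketch builds both using U+0361 (COMBINING DOUBLE INVERTED BREVE); the function names are hypothetical and only illustrate the ordering, not any rendering behaviour.

```python
import unicodedata

DOUBLE_TILDE = "\u0360"           # COMBINING DOUBLE TILDE
DOUBLE_INVERTED_BREVE = "\u0361"  # COMBINING DOUBLE INVERTED BREVE


def standard_order(first, mark, second):
    """Mark placed between the two letters, as the Unicode Standard describes."""
    return first + mark + second


def workaround_order(first, mark, second):
    """Mark placed after both letters, the order said above to look right
    in Arial Unicode MS."""
    return first + second + mark


def describe(text):
    """Return the Unicode character names of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


standard = standard_order("k", DOUBLE_INVERTED_BREVE, "p")
workaround = workaround_order("k", DOUBLE_INVERTED_BREVE, "p")

print(describe(standard))
print(describe(workaround))
```

The two strings contain the same three code points in a different order, and normalization does not convert one into the other, which is why a single encoding cannot satisfy both this font and correctly designed ones.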
The letters in the Latin small ligatures fi, fl, ffi, ffl, long s-t and st are not connected, except for the two f's in the ffi and ffl ligatures. To connect them, the cut contours action must be used. However, nothing mandates that they be connected: although the ligature glyphs are then indistinguishable from the individual letters placed next to each other, there is no semantic difference between a ligature and its individual characters.
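One way to see that these ligatures carry no meaning beyond their component letters is Unicode compatibility normalization, which maps each ligature code point to plain letters. The sketch below is only an illustration using Python's standard unicodedata module; the helper name expand_ligatures is hypothetical, and nothing here reflects how the font draws the glyphs.

```python
import unicodedata

# Code points of the Latin small ligatures discussed above.
LIGATURES = {
    "\ufb01": "LATIN SMALL LIGATURE FI",
    "\ufb02": "LATIN SMALL LIGATURE FL",
    "\ufb03": "LATIN SMALL LIGATURE FFI",
    "\ufb04": "LATIN SMALL LIGATURE FFL",
    "\ufb05": "LATIN SMALL LIGATURE LONG S T",
    "\ufb06": "LATIN SMALL LIGATURE ST",
}


def expand_ligatures(text):
    """Replace compatibility characters (including ligatures) with their
    plain-letter equivalents using NFKC normalization."""
    return unicodedata.normalize("NFKC", text)


for char, name in LIGATURES.items():
    # Each ligature expands to ordinary letters.
    print(name, "->", expand_ligatures(char))
```

Note that NFKC also folds the long s into an ordinary s, so the long s-t ligature and the st ligature both expand to the same two letters.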
The of Arial Unicode MS has 6 points instead of 5.
See also
Other well-known fonts with Unicode coverage include Bitstream Cyberbit, TITUS Cyberbit Basic, Code2000, Doulos SIL and Lucida Sans Unicode. See also free software Unicode typefaces and Unicode fonts.