Code page 932
Code page 932 (abbreviated as CP932, also known by the IANA name Windows-31J) is Microsoft's extension of Shift JIS to include NEC special characters (Row 13), NEC selection of IBM extensions (Rows 89 to 92), and IBM extensions (Rows 115 to 119). The coded character sets are JIS X 0201:1997, JIS X 0208:1997, and these extensions. Windows-31J is often mistaken for Shift JIS. The two are similar, but the distinction matters to programmers who want to avoid mojibake (incorrect, unreadable characters shown when software decodes text with the wrong character encoding), and it is a good reason to use the unambiguous UTF-8 instead. The Windows-31J name, however, is IANA's and is not recognized by Microsoft, which has historically used shift_jis instead.
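
The difference between the two encodings can be observed directly. The following is a minimal sketch that assumes Python 3 and its standard "cp932" and "shift_jis" codecs, which are Python's own implementations and may differ in detail from other platforms' tables. The helper name try_encode is hypothetical and exists only for this illustration.

```python
# Compare Python's "cp932" and "shift_jis" codecs on inputs where they differ.

def try_encode(text: str, codec: str):
    """Return the encoded bytes, or None if the codec cannot encode the text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None

# U+2460 CIRCLED DIGIT ONE is one of the NEC special characters (Row 13),
# so it exists in code page 932 but not in plain Shift JIS.
circled_one = "\u2460"
cp932_bytes = try_encode(circled_one, "cp932")
sjis_bytes = try_encode(circled_one, "shift_jis")

# The same two bytes decode to different Unicode characters under each codec.
raw = b"\x81\x60"
as_cp932 = raw.decode("cp932")
as_sjis = raw.decode("shift_jis")

print(cp932_bytes, sjis_bytes)
print(hex(ord(as_cp932)), hex(ord(as_sjis)))
```

Text written with one codec and read back with the other can therefore fail outright or silently change characters, which is one way mojibake arises.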

In Japanese editions of Windows, this code page is referred to as "ANSI", since it is the operating system's default 8-bit encoding, even though ANSI was not involved in its definition.

Code page 932 contains the standard 7-bit ASCII codes, and Japanese characters begin with a byte whose high bit is set to 1. Some of these bytes are lead bytes that require a second byte, so each character is encoded in either 8 or 16 bits.
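
As an illustration of the mixed one-byte and two-byte layout, the sketch below splits a CP932 byte string into characters. It assumes the lead-byte ranges commonly documented for Shift JIS style encodings (0x81 to 0x9F and 0xE0 to 0xFC), which the text above does not state, and the function names is_lead_byte and split_characters are hypothetical. It does not validate trail bytes.

```python
def is_lead_byte(b: int) -> bool:
    # Assumption: lead bytes fall in 0x81-0x9F or 0xE0-0xFC.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC

def split_characters(data: bytes) -> list:
    """Split CP932 bytes into one-byte and two-byte character units."""
    chars = []
    i = 0
    while i < len(data):
        width = 2 if is_lead_byte(data[i]) else 1
        chars.append(data[i:i + width])
        i += width
    return chars

# "A" (ASCII), U+3042 HIRAGANA LETTER A, U+FF71 HALFWIDTH KATAKANA LETTER A
encoded = "A\u3042\uff71".encode("cp932")
pieces = split_characters(encoded)
print(pieces)
```

Here the ASCII letter and the half-width katakana each occupy one byte, while the hiragana occupies two, so the byte length of a string is not its character count.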

Note that in the CP932.TXT mapping table, code 0x5C is mapped to U+005C REVERSE SOLIDUS (\). This is often a source of confusion because many Japanese fonts display this code as a Yen symbol, which would normally be represented in Unicode as U+00A5 YEN SIGN (¥). On Windows systems, however, code 0x5C in code page 932 behaves as a reverse solidus (backslash) in all respects other than how some fonts display it.
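
The mapping of 0x5C can be checked with a short sketch, again assuming Python 3 and its built-in "cp932" codec rather than the Windows conversion routines themselves. The variable names are illustrative only.

```python
# Byte 0x5C decodes to U+005C REVERSE SOLIDUS, not to U+00A5 YEN SIGN.
backslash = b"\x5c".decode("cp932")
yen_sign = "\u00a5"

# A Windows-style path survives decoding with its separator intact,
# whatever glyph a Japanese font later draws for it.
path = b"C:\x5cWindows".decode("cp932")

# Encoding the backslash gives the single byte 0x5C back.
round_trip = "\\".encode("cp932")

print(hex(ord(backslash)), backslash == yen_sign, path)
```

The Yen glyph seen on screen is a property of the font, so comparisons and path handling in code should treat the character as a backslash.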

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 