JIS encoding
In computing, JIS encoding refers to several Japanese Industrial Standards for encoding the Japanese language. Strictly speaking, the term means either:
  • A set of standard character sets for Japanese, notably:
    • JIS X 0201, the Japanese version of ISO 646 (ASCII), containing the base 7-bit ASCII characters (with some modifications) and 63 half-width katakana characters.
    • JIS X 0208, the most common kanji character set, containing 6,879 graphic characters, of which 6,355 are kanji
    • JIS X 0212, a character set containing 6,067 characters
    • JIS X 0213, which extends JIS X 0208
  • JIS X 0202 (the Japanese counterpart of ISO/IEC 2022, on which the ISO-2022-JP encoding is based), a set of encoding mechanisms for sending JIS data over transmission media that only support 7-bit data.


In practice, "JIS encoding" usually refers to JIS X 0208 data encoded with JIS X 0202.
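
As a rough illustration of this usage, the sketch below encodes a short string with Python's built-in iso2022_jp codec, which is assumed here to stand in for "JIS X 0208 data encoded with JIS X 0202". The variable names are illustrative only. It checks that the result is pure 7-bit data bracketed by escape sequences that switch into JIS X 0208 and back to ASCII.

```python
# Minimal sketch: 7-bit "JIS encoding" via Python's iso2022_jp codec.
text = "日本語"
encoded = text.encode("iso2022_jp")

# Escape sequences used by ISO-2022-JP to switch character sets.
ESC_TO_JISX0208 = b"\x1b$B"   # designate JIS X 0208
ESC_TO_ASCII = b"\x1b(B"      # return to ASCII

# Every byte fits in 7 bits, so the data survives 7-bit-only channels.
seven_bit = all(b < 0x80 for b in encoded)

starts_in_kanji = encoded.startswith(ESC_TO_JISX0208)
ends_in_ascii = encoded.endswith(ESC_TO_ASCII)

# Decoding restores the original text.
round_trip = encoded.decode("iso2022_jp")

print(encoded, seven_bit, starts_in_kanji, ends_in_ascii)
```

Each kanji occupies two 7-bit bytes between the escape sequences, so the three-character string takes 3 + 6 + 3 = 12 bytes.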

There is also the Shift JIS encoding, which adds the kanji, full-width hiragana and full-width katakana from JIS X 0208 to JIS X 0201 in a backward-compatible way. Shift JIS is perhaps the most widely used encoding in Japan: its compatibility with the single-byte JIS X 0201 character set allowed electronic equipment manufacturers (such as cash register manufacturers) to offer an upgrade from older, cheaper equipment that could not display kanji to newer equipment while retaining character-set compatibility.
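
The compatibility described above can be sketched with Python's built-in shift_jis codec. The helper name sjis_widths and the sample string are illustrative assumptions. ASCII and half-width katakana keep their single-byte JIS X 0201 codes, while a kanji becomes two bytes whose first byte falls in a range that 8-bit JIS X 0201 leaves unassigned (0x81-0x9F or 0xE0-0xEF in standard Shift JIS).

```python
# Minimal sketch: byte widths of different character classes in Shift JIS.
def sjis_widths(text):
    """Return (character, encoded length in bytes) for each character."""
    return [(ch, len(ch.encode("shift_jis"))) for ch in text]

sample = "Aｱ日"  # ASCII letter, half-width katakana, kanji
widths = sjis_widths(sample)

# Half-width katakana keeps its single-byte JIS X 0201 code.
halfwidth_byte = "ｱ".encode("shift_jis")

# A kanji is two bytes; the lead byte sits outside the JIS X 0201 codes.
kanji_bytes = "日".encode("shift_jis")
lead = kanji_bytes[0]
lead_in_unused_range = 0x81 <= lead <= 0x9F or 0xE0 <= lead <= 0xEF

print(widths, halfwidth_byte, kanji_bytes, lead_in_unused_range)
```

Because the lead byte never collides with a JIS X 0201 code, a decoder can tell from the first byte alone whether one or two bytes follow.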

The main alternatives to JIS encoding are EUC (used on UNIX systems where the JIS encodings are incompatible with POSIX standards) and more recently Unicode, particularly in the form of UTF-8.
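
To compare these alternatives, the following sketch encodes the same string with Python's built-in euc_jp and utf-8 codecs. The variable names are illustrative. It also checks an assumption that holds for characters in JIS X 0208: the EUC-JP bytes equal the 7-bit JIS X 0208 bytes with the high bit set, so no escape sequences are needed.

```python
# Minimal sketch: the same text in EUC-JP and UTF-8.
text = "日本語"

euc = text.encode("euc_jp")
utf8 = text.encode("utf-8")

# Take the 7-bit JIS form and strip its 3-byte leading and trailing escapes.
jis_payload = text.encode("iso2022_jp")[3:-3]

# Setting the high bit of each JIS X 0208 byte yields the EUC-JP bytes.
euc_from_jis = bytes(b | 0x80 for b in jis_payload)

# ASCII stays a single unchanged byte in both encodings.
ascii_same = "A".encode("euc_jp") == "A".encode("utf-8") == b"A"

print(euc, len(euc))    # two bytes per kanji
print(utf8, len(utf8))  # three bytes per kanji
```

For this sample, EUC-JP uses 6 bytes and UTF-8 uses 9, while both leave ASCII text byte-for-byte unchanged.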

See also

  • Japanese language and computers

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 