Combining grapheme joiner
The combining grapheme joiner (CGJ), U+034F, is a Unicode character that has no visible glyph and is "default ignorable" by applications. Its name is a misnomer: despite what it suggests, the character does not join graphemes (http://unicode.org/notes/tn27/). Its purpose is to separate characters that should not be considered digraphs.

For example, in a Hungarian language context, the adjoining characters c and s would normally be treated as the digraph cs. If they are separated by a CGJ, they are treated as two separate graphemes.
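
The effect can be sketched with a toy splitter. The function name `split_graphemes` and the one-entry digraph set are hypothetical, and real Hungarian tailoring (more digraphs, the trigraph dzs, collation rules) is considerably richer; the sketch only shows how an intervening CGJ keeps c and s apart.

```python
CGJ = "\u034f"  # COMBINING GRAPHEME JOINER

# Assumption: a single illustrative digraph; Hungarian has several more.
DIGRAPHS = {"cs"}


def split_graphemes(text):
    """Toy splitter: group known digraphs unless a CGJ separates the letters."""
    units = []
    i = 0
    while i < len(text):
        if text[i] == CGJ:
            # The CGJ is invisible; it only blocks the pairing, so drop it.
            i += 1
            continue
        pair = text[i:i + 2]
        if pair.lower() in DIGRAPHS:
            units.append(pair)
            i += 2
        else:
            units.append(text[i])
            i += 1
    return units


print(split_graphemes("kocsi"))          # ['k', 'o', 'cs', 'i']
print(split_graphemes("koc" + CGJ + "si"))  # ['k', 'o', 'c', 's', 'i']
```

The second string is the same letter sequence with a CGJ inserted purely to show the mechanism; it is not claimed to be a Hungarian word.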

It is also needed for complex scripts. For example, in most cases the Hebrew cantillation accent Metheg is supposed to appear to the left of the vowel point, and by default most display systems will render it there even if it is typed before the vowel. But in some words in Biblical Hebrew the Metheg appears to the right of the vowel, and to tell the display engine to render it on the right, a CGJ must be typed between the Metheg and the vowel. Compare:
he + pathah + metheg (U+05D4 U+05B7 U+05BD)
he + metheg + pathah (U+05D4 U+05BD U+05B7)
he + metheg + CGJ + pathah (U+05D4 U+05BD U+034F U+05B7)

(The examples above may not display correctly without a font that properly supports Hebrew cantillation. Ezra SIL SR is recommended.)

In the case of several consecutive combining diacritics, an intervening CGJ indicates that they should not be subject to canonical reordering.
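
A minimal sketch of this behaviour, using Python's standard `unicodedata` module and the Hebrew sequence he, metheg, pathah. The helper `codepoints` is a hypothetical name introduced for display only. The CGJ has canonical combining class 0, so normalization cannot move marks across it.

```python
import unicodedata

HE, PATHAH, METHEG, CGJ = "\u05d4", "\u05b7", "\u05bd", "\u034f"


def codepoints(s):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)


without_cgj = HE + METHEG + PATHAH
with_cgj = HE + METHEG + CGJ + PATHAH

# Pathah has combining class 17 and metheg 22, so canonical
# reordering moves the pathah in front of the metheg.
nfd_without = unicodedata.normalize("NFD", without_cgj)
print(codepoints(nfd_without))  # U+05D4 U+05B7 U+05BD

# The CGJ (combining class 0) sits between the marks and blocks the swap.
nfd_with = unicodedata.normalize("NFD", with_cgj)
print(codepoints(nfd_with))     # U+05D4 U+05BD U+034F U+05B7
```

Without the CGJ, the two typed orders normalize to the same sequence, so the typed order alone cannot be relied on to survive normalization.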

Compare this with the "zero-width non-joiner" (in effect a space of width zero) at U+200C in the General Punctuation range.

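The two characters can be compared by looking up their properties in the Unicode Character Database. This is a small sketch using Python's standard `unicodedata` module; `describe` is a hypothetical helper name.

```python
import unicodedata


def describe(ch):
    """Return (code point, name, general category, combining class)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )


print(describe("\u034f"))  # ('U+034F', 'COMBINING GRAPHEME JOINER', 'Mn', 0)
print(describe("\u200c"))  # ('U+200C', 'ZERO WIDTH NON-JOINER', 'Cf', 0)
```

The CGJ is classified as a nonspacing mark (Mn), while the zero-width non-joiner is a format character (Cf).
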

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 