Cedilla
Encyclopedia
A cedilla also known as cedilha or cédille, is a hook ( ) added under certain letters as a diacritical mark
Diacritic
A diacritic is a glyph added to a letter, or basic glyph. The term derives from the Greek διακριτικός . Diacritic is both an adjective and a noun, whereas diacritical is only an adjective. Some diacritical marks, such as the acute and grave are often called accents...

 to modify their pronunciation.

Origin

The tail originated in Spain as the bottom half of a miniature cursive
Cursive
Cursive, also known as joined-up writing, joint writing, or running writing, is any style of handwriting in which the symbols of the language are written in a simplified and/or flowing manner, generally for the purpose of making writing easier or faster...

 z
Z
Z is the twenty-sixth and final letter of the basic modern Latin alphabet.-Name and pronunciation:In most dialects of English, the letter's name is zed , reflecting its derivation from the Greek zeta but in American English, its name is zee , deriving from a late 17th century English dialectal...

. The word "cedilla" is the diminutive
Diminutive
In language structure, a diminutive, or diminutive form , is a formation of a word used to convey a slight degree of the root meaning, smallness of the object or quality named, encapsulation, intimacy, or endearment...

 of the Old Spanish
History of the Spanish language
The language known today as Spanish is derived from a dialect of spoken Latin that developed in the north-central part of the Iberian Peninsula in what is now northern Spain. Over the past 1,000 years, the language expanded south to the Mediterranean Sea, and was later transferred to the Spanish...

 name for this letter, ceda
Zeta (letter)
Zeta is the sixth letter of the Greek alphabet. In the system of Greek numerals, it has a value of 7. It was derived from the Phoenician letter Zayin...

(zeta). Modern Spanish, however, no longer uses this diacritic, although it is still current in Portuguese
Portuguese language
Portuguese is a Romance language that arose in the medieval Kingdom of Galicia, nowadays Galicia and Northern Portugal. The southern part of the Kingdom of Galicia became independent as the County of Portugal in 1095...

, Catalan
Catalan language
Catalan is a Romance language, the national and only official language of Andorra and a co-official language in the Spanish autonomous communities of Catalonia, the Balearic Islands and Valencian Community, where it is known as Valencian , as well as in the city of Alghero, on the Italian island...

, Occitan, and French
French language
French is a Romance language spoken as a first language in France, the Romandy region in Switzerland, Wallonia and Brussels in Belgium, Monaco, the regions of Quebec and Acadia in Canada, and by various communities elsewhere. Second-language speakers of French are distributed throughout many parts...

, which gives English
English language
English is a West Germanic language that arose in the Anglo-Saxon kingdoms of England and spread into what was to become south-east Scotland under the influence of the Anglian medieval kingdom of Northumbria...

 the alternative spellings of cedille, from French
French language
French is a Romance language spoken as a first language in France, the Romandy region in Switzerland, Wallonia and Brussels in Belgium, Monaco, the regions of Quebec and Acadia in Canada, and by various communities elsewhere. Second-language speakers of French are distributed throughout many parts...

 "cédille", and the Portuguese
Portuguese language
Portuguese is a Romance language that arose in the medieval Kingdom of Galicia, nowadays Galicia and Northern Portugal. The southern part of the Kingdom of Galicia became independent as the County of Portugal in 1095...

 form cedilha. An obsolete spelling of cedilla is cerilla. The earliest use in English cited by the Oxford English Dictionary
Oxford English Dictionary
The Oxford English Dictionary , published by the Oxford University Press, is the self-styled premier dictionary of the English language. Two fully bound print editions of the OED have been published under its current name, in 1928 and 1989. The first edition was published in twelve volumes , and...

is a 1599 Spanish-English dictionary and grammar. Chambers' Cyclopædia is cited for the printer-trade variant ceceril in use in 1738. The main use in English is not universal and applies to loan words from French
French language
French is a Romance language spoken as a first language in France, the Romandy region in Switzerland, Wallonia and Brussels in Belgium, Monaco, the regions of Quebec and Acadia in Canada, and by various communities elsewhere. Second-language speakers of French are distributed throughout many parts...

 and Portuguese
Portuguese language
Portuguese is a Romance language that arose in the medieval Kingdom of Galicia, nowadays Galicia and Northern Portugal. The southern part of the Kingdom of Galicia became independent as the County of Portugal in 1095...

 such as "cachaça
Cachaça
Cachaça is a liquor made from fermented sugarcane.It is the most popular distilled alcoholic beverage in Brazil. It is also known as aguardente, pinga, caninha and many other names...

", "limaçon
Limaçon
In geometry, a limaçon or limacon , also known as a limaçon of Pascal, is defined as a roulette formed when a circle rolls around the outside of a circle of equal radius. It can also be defined as the roulette formed when a circle rolls around a circle with half its radius so that the smaller...

" and "façade
Facade
A facade or façade is generally one exterior side of a building, usually, but not always, the front. The word comes from the French language, literally meaning "frontage" or "face"....

" (often typed "cachaca", "limacon" and "facade" due to lack of Ç keys on the keyboards of most Anglophone countries).

Encodings

In Unicode, the symbol is .

Many precomposed characters are defined too, like:

Use of the cedilla with the letter C

The most frequent character with cedilla is "ç
Ç
is a Latin script letter, used in the Albanian, Azerbaijani, Ligurian, Tatar, Turkish, Turkmen, Kurdish and Zazaki alphabets. This letter also appears in Catalan, French, Friulian, Occitan and Portuguese as a variant of the letter “c”...

" ("c" with cedilla, as in façade). It was first used for the sound of the voiceless alveolar affricate
Voiceless alveolar affricate
The voiceless alveolar affricate is a type of consonantal sound, used in some spoken languages. The sound is transcribed in the International Phonetic Alphabet with ⟨⟩ or ⟨⟩ . The voiceless alveolar affricate occurs in such languages as German, Cantonese, Italian, Russian, Japanese and Mandarin...

 /ts/ in old Spanish and stems from the Visigothic
Visigothic script
Visigothic script was a type of medieval script that originated in the Visigothic kingdom in Hispania...

 form of the letter "z" , whose upper loop was lengthened and reinterpreted as a "c", whereas its lower loop became the diminished appendage, the cedilla.

It represents the "soft" sound /s/ where a "c" would normally represent the "hard" sound /k/ (before "a", "o", "u", or at the end of a word) in English and in certain Romance languages such as Catalan
Catalan language
Catalan is a Romance language, the national and only official language of Andorra and a co-official language in the Spanish autonomous communities of Catalonia, the Balearic Islands and Valencian Community, where it is known as Valencian , as well as in the city of Alghero, on the Italian island...

, French
French language
French is a Romance language spoken as a first language in France, the Romandy region in Switzerland, Wallonia and Brussels in Belgium, Monaco, the regions of Quebec and Acadia in Canada, and by various communities elsewhere. Second-language speakers of French are distributed throughout many parts...

, Occitan, and Portuguese
Portuguese language
Portuguese is a Romance language that arose in the medieval Kingdom of Galicia, nowadays Galicia and Northern Portugal. The southern part of the Kingdom of Galicia became independent as the County of Portugal in 1095...

. In Occitan, Friulian and Catalan ç can also be found at the beginning of a word (Çubran, ço) or at the end (braç).

It represents the voiceless postalveolar affricate
Voiceless postalveolar affricate
The voiceless palato-alveolar affricate or domed postalveolar affricate is a type of consonantal sound used in some spoken languages. The sound is transcribed in the International Phonetic Alphabet with ⟨⟩ or ⟨⟩...

 /tʃ/ (as in English "church") in Albanian
Albanian language
Albanian is an Indo-European language spoken by approximately 7.6 million people, primarily in Albania and Kosovo but also in other areas of the Balkans in which there is an Albanian population, including western Macedonia, southern Montenegro, southern Serbia and northwestern Greece...

, Azerbaijani
Azerbaijani language
Azerbaijani or Azeri or Torki is a language belonging to the Turkic language family, spoken in southwestern Asia by the Azerbaijani people, primarily in Azerbaijan and northwestern Iran...

, Friulian
Friulian language
Friulan , is a Romance language belonging to the Rhaeto-Romance family, spoken in the Friuli region of northeastern Italy. Friulan has around 800,000 speakers, the vast majority of whom also speak Italian...

, Kurdish
Kurdish language
Kurdish is a dialect continuum spoken by the Kurds in western Asia. It is part of the Iranian branch of the Indo-Iranian group of Indo-European languages....

, Tatar
Tatar language
The Tatar language , or more specifically Kazan Tatar, is a Turkic language spoken by the Tatars of historical Kazan Khanate, including modern Tatarstan and Bashkiria...

, Turkish
Turkish language
Turkish is a language spoken as a native language by over 83 million people worldwide, making it the most commonly spoken of the Turkic languages. Its speakers are located predominantly in Turkey and Northern Cyprus with smaller groups in Iraq, Greece, Bulgaria, the Republic of Macedonia, Kosovo,...

 (as in çiçek, çam, çekirdek, Çorum
Çorum
Çorum is a landlocked northern Anatolian city that is the capital of the Çorum Province of Turkey. Çorum is located inland in the central Black Sea Region of Turkey, and is approximately from Ankara and from Istanbul...

), and Turkmen
Turkmen language
Turkmen is the national language of Turkmenistan...

. It is also sometimes used this way in Manx
Manx language
Manx , also known as Manx Gaelic, and as the Manks language, is a Goidelic language of the Indo-European language family, historically spoken by the Manx people. Only a small minority of the Island's population is fluent in the language, but a larger minority has some knowledge of it...

, to distinguish it from the velar fricative
Velar fricative
Velar fricative can refer to*voiced velar fricative: in the International Phonetic Alphabet*voiceless velar fricative: in the International Phonetic Alphabet...

.

In the International Phonetic Alphabet
International Phonetic Alphabet
The International Phonetic Alphabet "The acronym 'IPA' strictly refers [...] to the 'International Phonetic Association'. But it is now such a common practice to use the acronym also to refer to the alphabet itself that resistance seems pedantic...

, /ç/ represents the voiceless palatal fricative
Voiceless palatal fricative
The voiceless palatal fricative is a type of consonantal sound used in some spoken languages. The symbol in the International Phonetic Alphabet that represents this sound is . The symbol ç is the letter c with a cedilla, as used to spell French words such as façade...

.

Use of the cedilla with the letter S

The character "ş" represents the voiceless postalveolar fricative
Voiceless postalveolar fricative
The voiceless palato-alveolar fricative or voiceless domed postalveolar fricative is a type of consonantal sound, used in many spoken languages, including English...

 /ʃ/ (as in "show") in several languages, including many belonging to the Turkic languages
Turkic languages
The Turkic languages constitute a language family of at least thirty five languages, spoken by Turkic peoples across a vast area from Eastern Europe and the Mediterranean to Siberia and Western China, and are considered to be part of the proposed Altaic language family.Turkic languages are spoken...

:
  • Turkish
    Turkish language
    Turkish is a language spoken as a native language by over 83 million people worldwide, making it the most commonly spoken of the Turkic languages. Its speakers are located predominantly in Turkey and Northern Cyprus with smaller groups in Iraq, Greece, Bulgaria, the Republic of Macedonia, Kosovo,...

     (It is included as a separate letter (Ş) in the Turkish alphabet
    Turkish alphabet
    The Turkish alphabet is a Latin alphabet used for writing the Turkish language, consisting of 29 letters, seven of which have been modified from their Latin originals for the phonetic requirements of the language. This alphabet represents modern Turkish pronunciation with a high degree of accuracy...

    .)
    • For example, it is used in Turkish words and names like Eskişehir
      Eskisehir
      Eskişehir is a city in northwestern Turkey and the capital of the Eskişehir Province. According to the 2009 census, the population of the city is 631,905. The city is located on the banks of the Porsuk River, 792 m above sea level, where it overlooks the fertile Phrygian Valley. In the nearby...

      , Şımarık
      Simarik
      Şımarık is a 1997 song by Turkish singer Tarkan. Its lyrics were written by Sezen Aksu, with music credited as composed by Tarkan, Aksu and Ozan Çolakoğlu at the time. However, Tarkan later admitted in a 2006 interview that this had been done without Aksu's consent, who was the true copyright owner...

      , Hasan Şaş
      Hasan Sas
      Hasan Gökhan Şaş is a former Turkish international football player, most commonly known for his time at Galatasaray.-Galatasaray:...

      , Rüştü Reçber
      Rüstü Reçber
      Rüştü Reçber is a Turkish international footballer who currently plays as a goalkeeper for Beşiktaş J.K. in the Turkish Süper Lig.He has played a key role in the success of the Turkish national squad, and it was at the 2002 FIFA World Cup, where Turkey finished third, that he earned a selection to...

       etc.

  • Azerbaijani
    Azerbaijani language
    Azerbaijani or Azeri or Torki is a language belonging to the Turkic language family, spoken in southwestern Asia by the Azerbaijani people, primarily in Azerbaijan and northwestern Iran...

  • Crimean Tatar
    Crimean Tatar language
    The Crimean Tatar language is the language of the Crimean Tatars. It is a Turkic language spoken in Crimea, Central Asia , and the Crimean Tatar diasporas in Turkey, Romania, Bulgaria...

  • Gagauz
    Gagauz language
    The Gagauz language is a Turkic language, spoken by the Gagauz people, and the official language of Gagauzia, Moldova. There are two dialects, Bulgar Gagauzi and Maritime Gagauzi. This is a different language from Balkan Gagauz Turkish....

  • Tatar
    Tatar language
    The Tatar language , or more specifically Kazan Tatar, is a Turkic language spoken by the Tatars of historical Kazan Khanate, including modern Tatarstan and Bashkiria...

  • Turkmen
    Turkmen language
    Turkmen is the national language of Turkmenistan...

  • Romanian
    Romanian language
    Romanian Romanian Romanian (or Daco-Romanian; obsolete spellings Rumanian, Roumanian; self-designation: română, limba română ("the Romanian language") or românește (lit. "in Romanian") is a Romance language spoken by around 24 to 28 million people, primarily in Romania and Moldova...

     (substitution use when S-comma
    S-comma
    S-comma is a letter which is part of the Romanian alphabet, used to represent the Romanian language sound , the voiceless postalveolar fricative ....

     [Ș] was missing from pre-3.0 Unicode
    Unicode
    Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems...

     standards, and older standards, still frequent)
  • Kurdish
    Kurdish language
    Kurdish is a dialect continuum spoken by the Kurds in western Asia. It is part of the Iranian branch of the Indo-Iranian group of Indo-European languages....



It is also used in some Romanization
Romanization
In linguistics, romanization or latinization is the representation of a written word or spoken speech with the Roman script, or a system for doing so, where the original word or language uses a different writing system . Methods of romanization include transliteration, for representing written...

s of Arabic
Arabic language
Arabic is a name applied to the descendants of the Classical Arabic language of the 6th century AD, used most prominently in the Quran, the Islamic Holy Book...

, Persian
Persian language
Persian is an Iranian language within the Indo-Iranian branch of the Indo-European languages. It is primarily spoken in Iran, Afghanistan, Tajikistan and countries which historically came under Persian influence...

 and Tiberian Hebrew
Tiberian Hebrew
Tiberian Hebrew is the extinct canonical pronunciation of the Hebrew Bible or Tanakh and related documents in the Roman Empire. This traditional medieval pronunciation was committed to writing by Masoretic scholars based in the Jewish community of Tiberias , in the form of the Tiberian vocalization...

 to represent a variant of "s", (ص in the Arabic alphabet, צ in Hebrew) although the letter "" is more frequently used for this.
  • In HTML character entity references Ş and ş can be used.

Use of the cedilla in Latvian

Comparatively, some consider the diacritics on the Latvian
Latvian language
Latvian is the official state language of Latvia. It is also sometimes referred to as Lettish. There are about 1.4 million native Latvian speakers in Latvia and about 150,000 abroad. The Latvian language has a relatively large number of non-native speakers, atypical for a small language...

 consonants "ģ", "ķ", "ļ", "ņ", and formerly "ŗ" to be cedillas. Although their Adobe
Adobe Systems
Adobe Systems Incorporated is an American computer software company founded in 1982 and headquartered in San Jose, California, United States...

 glyph
Glyph
A glyph is an element of writing: an individual mark on a written medium that contributes to the meaning of what is written. A glyph is made up of one or more graphemes....

 names are commas, their names in the Unicode
Unicode
Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems...

 Standard are "g", "k", "l", "n", and "r" with a cedilla. They were introduced to the Unicode
Unicode
Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems...

 standard before 1992, and their name cannot be altered. The uppercase equivalent "Ģ" sometimes has a regular cedilla.

Use of the cedilla in Marshallese

Four letters
Marshallese orthography
Marshallese underwent a change of orthography in recent times. However, most people still use the old orthography. It is written in a form of the Latin script with unusual diacritic combinations. There are different alphabetic systems in use by Marshallese speakers depending on religious...

 in Marshallese
Marshallese language
The Marshallese language is a Malayo-Polynesian language of the Marshall Islands, and the principal language of the country...

 have cedillas: "", "", "" and "". In standard printed text they are always cedillas, and their omission or the substitution of comma below and dot below diacritics are nonstandard.

, many font rendering engines do not display any of these properly, for two reasons:
  • "" and "" usually do not display properly at all, because of the use of the cedilla in Latvian. Unicode has precombined glyphs for these letters, but most quality fonts display them with comma below diacritics to accommodate the expectations of Latvian orthography
    Latvian orthography
    Latvian orthography, historically, has used a system based upon German phonetic principles and the Latgalian dialect was written using Polish orthographic principles. The present-day orthography has been in use since 1908. Its basis is the Latin alphabet. For the most part it is phonetic in that it...

    . This is considered nonstandard in Marshallese. The use of a zero-width non-joiner
    Zero-width non-joiner
    The zero-width non-joiner is a non-printing character used in the computerization of writing systems that make use of ligatures. When placed between two characters that would otherwise be connected into a ligature, a ZWNJ causes them to be printed in their final and initial forms, respectively...

     between the letter and the diacritic can alleviate this problem: "" and "" may display properly, but may not; see below.
  • "" and "" do not currently exist in Unicode as precombined glyphs, and must be encoded as the plain Latin letters "" and "" with the combining cedilla diacritic. Most Unicode fonts issued with Windows do not display combining diacritics properly, showing them too far to the right of the letter, as with Tahoma
    Tahoma
    Tahoma is the original form of the word "Tacoma", as in the city of Tacoma, Washington. It can refer to:Places:* Mount Tahoma, an alternative spelling of Mount Tacoma, the original name of Mount Rainier in the Cascade Range...

     ("" and "") and Times New Roman ("" and ""). This mostly affects "", and may or may not affect "". But some common Unicode fonts like Arial Unicode MS
    Arial Unicode MS
    In digital typography, the TrueType font Arial Unicode MS is an extended version of the font Arial. Compared to Arial, it includes higher line height, omits kerning pairs and adds enough glyphs to cover a large subset of Unicode 2.1—thus supporting most Microsoft code pages, but also requiring much...

     ("" and ""), Cambria
    Cambria (typeface)
    Cambria is part of the suite of fonts that comes with Microsoft Windows Vista, Windows 7, Microsoft Office 2007, Microsoft Office 2008 for Mac and Microsoft Office 2011 for Mac, specifically designed for on-screen reading and to be aesthetically pleasing when printed at small sizes. It is a...

     ("" and "") and Lucida Sans Unicode
    Lucida Sans Unicode
    In digital typography, Lucida Sans Unicode OpenType font from the design studio of Bigelow & Holmes is designed to support the most commonly used characters defined in version 2.0 of the Unicode standard...

     ("" and "") don't have this problem. When "" is properly displayed, the cedilla is either underneath the center of the letter, or is underneath the right-most leg of the letter, but is always directly underneath the letter wherever it's positioned.


Because of these font display issues, it is not uncommon to find nonstandard ad hoc substitutes for these letters. The Marshallese-English Dictionary (the only complete Marshallese dictionary in existence) displays the letters with dot below diacritics, all of which do exist as precombined glyphs in Unicode: "", "", "" and "". The first three exist in the International Alphabet of Sanskrit Transliteration, and "" exists in the Vietnamese alphabet
Vietnamese alphabet
The Vietnamese alphabet, called Chữ Quốc Ngữ , usually shortened to Quốc Ngữ , is the modern writing system for the Vietnamese language...

, and both of these systems are supported by the most recent versions of common fonts like Arial
Arial
Arial, sometimes marketed or displayed in software as Arial MT, is a sans-serif typeface and set of computer fonts. Fonts from the Arial family are packaged with Microsoft Windows, some other Microsoft software applications, Apple Mac OS X and many PostScript 3 computer printers...

, Courier New, Tahoma
Tahoma
Tahoma is the original form of the word "Tacoma", as in the city of Tacoma, Washington. It can refer to:Places:* Mount Tahoma, an alternative spelling of Mount Tacoma, the original name of Mount Rainier in the Cascade Range...

 and Times New Roman. This sidesteps most of the Marshallese text display issues associated with the cedilla, but is still inappropriate for polished standard text.

Other diacritics confused with the cedilla

Several languages add a comma (virgula) to some letters, such as , which looks like a cedilla, but is more precisely a diacritical comma. This is particularly confusing with letters which can take either diacritic: for example, the consonant /ʃ/ is written as "ş" in Turkish
Turkish language
Turkish is a language spoken as a native language by over 83 million people worldwide, making it the most commonly spoken of the Turkic languages. Its speakers are located predominantly in Turkey and Northern Cyprus with smaller groups in Iraq, Greece, Bulgaria, the Republic of Macedonia, Kosovo,...

 but in Romanian
Romanian language
Romanian Romanian Romanian (or Daco-Romanian; obsolete spellings Rumanian, Roumanian; self-designation: română, limba română ("the Romanian language") or românește (lit. "in Romanian") is a Romance language spoken by around 24 to 28 million people, primarily in Romania and Moldova...

, and Romanian writers will sometimes use the former instead of the latter because of insufficient font or character-set support.

The Polish
Polish language
Polish is a language of the Lechitic subgroup of West Slavic languages, used throughout Poland and by Polish minorities in other countries...

 letters "ą" and "ę" and Lithuanian
Lithuanian language
Lithuanian is the official state language of Lithuania and is recognized as one of the official languages of the European Union. There are about 2.96 million native Lithuanian speakers in Lithuania and about 170,000 abroad. Lithuanian is a Baltic language, closely related to Latvian, although they...

 letters "ą", "ę", "į", and "ų" are not made with the cedilla either, but with the unrelated ogonek
Ogonek
The ogonek is a diacritic hook placed under the lower right corner of a vowel in the Latin alphabet used in several European and Native American languages.-Use:...

 diacritic, which merely resembles a reversed cedilla (opening to the right instead of the left).

A proposal for the use of the cedilla with the letter T in French

In 1868, Ambroise Firmin-Didot suggested in his book Observations sur l'orthographe, ou ortografie, française (Observations on French Spelling) that French phonetics could be better regularized by adding a cedilla beneath the letter "t" in some words. For example, it is well-known that in the suffix -tion this letter is usually not pronounced as (or close to) /t/ in either French or English. It has to be distinctly learned that in words such as French diplomatie (but not diplomatique) and English action it is pronounced /s/ and /ʃ/, respectively (but not in active in either language). A similar effect occurs with other prefixes or within words also in French and English, such as partial where t is pronounced /s/ and /ʃ/ respectively. Firmin-Didot surmised that a new character could be added to French orthography. A similar letter, the t-comma
T-comma
T-comma is a letter which is part of the Romanian alphabet, used to represent the Romanian language sound , the voiceless alveolar affricate . It is written as the letter T with a small comma below and it has both the lower-case and the upper-case variants...

, does exist in Romanian, but it has a comma accent, not a cedilla one.

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK