Schwa deletion in Indo-Aryan languages
The schwa deletion or schwa syncope phenomenon plays a crucial role in Hindi, Marathi, Urdu, Kashmiri, Punjabi, Gujarati, Maithili and several other Indo-Aryan languages, where schwas implicit in the written scripts of those languages are obligatorily deleted for correct pronunciation. Schwa syncope is extremely important in these languages for intelligibility and unaccented speech. It also presents a challenge to non-native speakers and speech synthesis software because the scripts, including Devanagari, do not provide indicators of where schwas should be dropped.
Schwa deletion in Hindi
Although the Devanagari script is used as a standard to write modern Hindi, the schwa ('ə') implicit in each consonant of the script is "obligatorily deleted" at the end of words and in certain other contexts, unlike in Sanskrit. This phenomenon has been termed the "schwa syncope rule" or the "schwa deletion rule" of Hindi. One formalization of this rule has been summarized as ə -> ø | VC_CV. In other words, when a vowel-preceded consonant is followed by a vowel-succeeded consonant, the schwa inherent in the first consonant is deleted. However, this formalization is inexact and incomplete: it sometimes deletes a schwa that should be retained and sometimes fails to delete one that should be dropped. The rule is reported to predict schwa deletion correctly 89% of the time. Schwa deletion is computationally important because it is essential to building text-to-speech software for Hindi.
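The ə -> ø | VC_CV formalization can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the input is assumed to be a word already split into phonemes with every inherent schwa written out, the vowel inventory is a small hypothetical subset, the word-final schwa is dropped first, and medial schwas are scanned from right to left so that each decision sees the deletions already made to its right. The function name `delete_schwas` is invented for this example.

```python
# Phoneme inventory for this sketch only (an assumption, not a full Hindi inventory).
VOWELS = {"ə", "a", "aː", "i", "iː", "u", "uː", "e", "eː", "o", "oː"}
SCHWA = "ə"


def is_vowel(phoneme):
    return phoneme in VOWELS


def delete_schwas(phonemes):
    """Drop a word-final schwa, then apply ə -> ø | VC_CV to medial schwas."""
    out = list(phonemes)

    # Word-final schwa: dropped only if another vowel remains in the word.
    if out and out[-1] == SCHWA and any(is_vowel(p) for p in out[:-1]):
        out.pop()

    # Medial schwas: scan right to left (an assumed order of application).
    # A schwa at index i is deleted when the context is V C _ C V.
    i = len(out) - 3
    while i >= 2:
        if (
            out[i] == SCHWA
            and is_vowel(out[i - 2]) and not is_vowel(out[i - 1])
            and not is_vowel(out[i + 1]) and is_vowel(out[i + 2])
        ):
            del out[i]
        i -= 1
    return out


def show(phonemes):
    return "".join(delete_schwas(phonemes))


print(show(["r", "aː", "m", "ə"]))                                  # raːm
print(show(["r", "ə", "tʃ", "ə", "n", "aː"]))                       # rətʃnaː
print(show(["d", "eː", "v", "ə", "n", "aː", "g", "ə", "r", "iː"]))  # deːvnaːgriː
print(show(["h", "ə", "r", "ə", "k", "ə", "t", "ə"]))               # hərkət
print(show(["s", "ə", "r", "ə", "k", "ə", "n", "aː"]))              # sərəknaː
```

Because the rule looks only at neighbouring consonants and vowels, it has no access to morphology or word context, which is one reason a purely rule-based approach of this kind cannot be fully accurate.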
As a result of schwa syncope, the Hindi pronunciation of many words differs from that expected from a literal Sanskrit-style rendering of Devanagari. For instance, राम is Rām (not Rāma), रचना is Rachnā (not Rachanā), वेद is Véd (not Véda) and नमकीन is Namkeen (not Namakeen). The name of the script itself is pronounced Devnāgrī (not Devanāgarī).
Correct schwa deletion is also critical because, in some cases, the same Devanagari letter-sequence is pronounced two different ways in Hindi depending on context, and failure to delete the appropriate schwas can change the sense of the word. For instance, the letter sequence 'रक' is pronounced differently in हरकत (har.kat, meaning movement or activity) and सरकना (sarak.na, meaning to slide). Similarly, the sequence धड़कने in दिल धड़कने लगा (the heart started beating) and in दिल की धड़कनें (beats of the heart) is identical prior to the nasalization in the second usage. Yet, it is pronounced dhadak.ne in the first and dhad.kane in the second. While native speakers correctly pronounce the sequences differently in different contexts, non-native speakers and voice-synthesis software can make them "sound very unnatural", making it "extremely difficult for the listener" to grasp the intended meaning.
Schwa deletion in other Indo-Aryan languages
Gujarati has a strong schwa deletion phenomenon, affecting both medial and final schwas. From an evolutionary perspective, the final schwas appear to have been lost prior to the medial ones. Maithili's schwa deletion is similar to that of Hindi-Urdu, and the ə -> ø | VC_CV rule also applies selectively to the language. For instance, हमरो (meaning "even ours"), which with schwas is həməro, is correctly pronounced həmro. This is akin to the neighboring Bhojpuri, in which हमरा (meaning mine) is pronounced həmrā rather than həmərā due to the deletion of a medial schwa. In the Dardic subbranch of Indo-Aryan, Kashmiri similarly demonstrates schwa deletion. For instance, drāksha (द्राक्ष, drākshə) is the Sanskrit word for grape, but the final schwa is dropped in the Kashmiri version, which is dāch. Punjabi, too, has broad schwa deletion rules: several base word forms (e.g. ਕਾਗ਼ਜ਼, کاغز, kāghəz/paper) drop schwas in the plural form (ਕਾਗ਼ਜ਼ਾੰ, کاغزاں, kāghzāṅ/papers) as well as with instrumental (ਕਾਗ਼ਜ਼ੋੰ, کاغزوں, kāghzōṅ/from the paper) and locative (ਕਾਗ਼ਜ਼ੇ, کاغزے, kāghzé/on the paper) suffixes.
Indo-Aryan languages differ in how they apply schwa deletion. For instance, medial schwas from Sanskrit-origin words are often retained in Bengali where they are deleted in Hindi. An example of this is रचना/রচনা, which is pronounced rachana (/rətʃənaː/) in Sanskrit, rachna (/rətʃnaː/) in Hindi and rochona (/rɔtʃonaː/) in Bengali: while the medial schwa is deleted in Hindi (due to the ə -> ø | VC_CV rule), it is retained in Bengali. On the other hand, the final schwa in वेद/বেদ is deleted in both Hindi and Bengali (Sanskrit: /veːd̪ə/, Hindi: /veːd̪/, Bengali: /beːd̪/).
Common transcription and diction errors
Since Devanagari does not provide indications of where schwas should be deleted, it is common for non-native learners/speakers of Hindi, who are otherwise familiar with Devanagari and Sanskrit, to mispronounce words in Hindi-Urdu and other modern Indo-Aryan languages. Similarly, systems that automate transliteration from Devanagari to Latin and other scripts by hardcoding implicit schwas in every consonant often make systematic errors. This becomes evident when English words are transliterated into Devanagari by Hindi speakers and then transliterated back into the Latin script by manual or automated processes that don't account for Hindi's schwa deletion rules. For instance, English is transcoded by Hindi speakers into इंगलिश (rather than इंग्लिश्), which may be transcoded back to Ingalisha by automated systems, whereas schwa deletion would result in इंगलिश being correctly pronounced as Inglish by native Hindi speakers.
Some common examples of errors are shown below:
Word in Devanagari and meaning | Correct pronunciation with schwa syncope | Incorrect pronunciation(s) | Comments |
---|---|---|---|
लपट (flame) | ləpəṭ | ləpəṭə | The final schwa must be deleted |
लपटें (flames) | ləpṭeṅ | ləpəṭeṅ | The medial schwa after प, which was retained in लपट (ləpəṭ), must be deleted in लपटें |
समझ (understanding) | səməjh | səməjhə | The final schwa must be deleted |
समझा (understood, verb masc.) | səmjhā | səməjhā | The medial schwa after म, which was retained in समझ, must be deleted here |
भारत (India) | bhārət | bhārətə | Final schwa must be deleted |
भारतीय (Indian) | bhārtīy | bhārətīyə | Both the medial and final schwa should be deleted, although the final schwa is sometimes faintly pronounced due to the 'y' glide; when pronounced without this, the word sounds close to 'bhārtī' |
देवनागरी (Devanagari, the script) | devnāgrī | devənāgərī | Two medial schwas (after व and after ग) should be deleted |
इंगलिश (English, the language) | inglish | ingəlishə | Medial and final schwas (after ग and after श) should be deleted |
विमला (Vimla, a proper name) | vimlā | viməlā | Medial schwa should be deleted |
सुलोचना (Sulochna, a proper name) | sulochnā | sulochənā | Medial schwa should be deleted |
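The "Incorrect pronunciation(s)" column is what a transliterator produces when it hardcodes a schwa after every consonant that carries no vowel sign. The following sketch illustrates that failure mode; it is a minimal example, not a real transliteration library. The function name `naive_transliterate` and the character tables are hypothetical and cover only the letters needed for a few rows of the table (no anusvara, nukta or independent vowels).

```python
# Hypothetical, deliberately tiny character tables for this illustration.
CONSONANTS = {
    "ल": "l", "प": "p", "ट": "ṭ", "स": "s", "म": "m", "झ": "jh",
    "भ": "bh", "र": "r", "त": "t", "द": "d", "व": "v", "न": "n",
    "ग": "g", "च": "ch",
}
VOWEL_SIGNS = {"ा": "ā", "ि": "i", "ी": "ī", "ु": "u", "े": "e", "ो": "o"}
VIRAMA = "\u094d"  # explicit "no vowel" mark


def naive_transliterate(word):
    """Romanize Devanagari, inserting a schwa after every bare consonant."""
    out = []
    pending = False  # True while the last consonant has no vowel yet
    for ch in word:
        if ch in CONSONANTS:
            if pending:
                out.append("ə")  # previous consonant gets its inherent schwa
            out.append(CONSONANTS[ch])
            pending = True
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
            pending = False
        elif ch == VIRAMA:
            pending = False  # schwa explicitly suppressed by the script
        else:
            raise ValueError(f"character not covered by this sketch: {ch!r}")
    if pending:
        out.append("ə")  # word-final inherent schwa, never deleted here
    return "".join(out)


for word in ["लपट", "समझा", "भारत", "देवनागरी", "विमला", "सुलोचना"]:
    print(word, naive_transliterate(word))
# लपट ləpəṭə
# समझा səməjhā
# भारत bhārətə
# देवनागरी devənāgərī
# विमला viməlā
# सुलोचना sulochənā
```

Each output matches the incorrect form listed in the table, which shows that the errors are systematic: they follow directly from treating every inherent schwa as pronounced, so a schwa deletion step is needed to reach the correct forms.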
Vowel nasalization
With some words that contain /n/ or /m/ consonants separated from succeeding consonants by schwas, the schwa deletion process has the effect of nasalizing any preceding vowels. Some examples in Hindi-Urdu include:
- sən.kī (सनकी, سنکی, whimsical), in which a deleted schwa that is pronounced in the root word sənək (सनक, سنک, whimsy) converts the first medial schwa into a nasalized vowel
- chəm.kīlā (चमकीला, چمکیلا, shiny), in which a deleted schwa that is pronounced in the root word chəmək (चमक, چمک, shine) converts the first medial schwa into a nasalized vowel