Balkan linguistic union
The Balkan sprachbund or linguistic area is the ensemble of areal features (similarities in grammar, syntax, vocabulary and phonology) among the languages of the Balkans. Several features are found across these languages, though not every feature applies to every language. The languages in question belong to various branches of Indo-European (such as Slavic, Greek, Romance, Albanian and Indo-Aryan) and also to families outside Indo-European (such as Turkish), so in their modern forms they may be only distantly related or not related at all. Some of the languages, namely those whose homeland lies almost entirely within the region, use these features in their standard language. Other populations, for whom the region is not a cultural centre because they have wider communities outside it, may still adopt the features in their local register, which speakers of the same language away from the region in turn regard as non-standard.

While they share little vocabulary, their grammars have very extensive similarities; for example, they have similar case systems and verb conjugation systems, and all have become more analytic, although to differing degrees.

History

The earliest scholar to notice the similarities between Balkan languages belonging to different families was the Slovene scholar Jernej Kopitar, in 1829. August Schleicher (1850) more explicitly developed the concept of areal relationships as opposed to genetic ones, and Franc Miklošič (1861) studied the relationships of Balkan Slavic and Romance more extensively.

Nikolai Trubetzkoy (1923), Kristian Sandfeld-Jensen (1930), and Gustav Weigand (1925) developed the theory in the 1920s and 1930s.

In the 1930s, the Romanian linguist Alexandru Graur criticized the notion of “Balkan linguistics,” saying that one can talk about “relationships of borrowings, of influences, but not about Balkan linguistics”.

The term "Balkan linguistic union" was coined by the Romanian linguist Alexandru Rosetti in 1958, when he claimed that the shared features gave the Balkan languages a special similarity. Theodor Capidan went further, claiming that the structure of the Balkan languages could be reduced to a standard language. Many of the earliest reports on this theory were in German, hence the term "Balkansprachbund" is often used as well.

Languages

The languages that share these similarities belong to five distinct branches of the Indo-European languages:
  • Albanian
  • Hellenic (Greek)
  • Romance languages (Romanian, Aromanian, Megleno-Romanian and Istro-Romanian)
  • Slavic languages (Bulgarian, Macedonian and Serbian, especially the Torlakian dialect, which is transitional between Macedonian, Bulgarian and Serbian)
  • Indo-Aryan (Romani/Gypsy and Balkan "Egyptian")


However, these languages do not all share the same number of features, so they are divided into three groups:
  1. Albanian, Romanian, Macedonian, Aromanian and Bulgarian have the most properties in common
  2. Serbian (especially the transitional Torlak dialect) and Greek share fewer properties with the others
  3. Turkish shares mainly vocabulary and the replacement of the infinitive with the subjunctive.


In 2000, the Finnish linguist Jouko Lindstedt computed a "Balkanization factor", which gives each Balkan language a score proportional to the number of features it shares in the Balkan linguistic union. The results were:
Language | Score
Macedonian | 12
Balkan Slavic | 11.5
Albanian | 10.5
Greek, Balkan Romance | 9.5
Romani (Gypsy) | 7.5
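
A score that grows with the number of shared features can be illustrated with a small sketch. The half-point values in the table suggest that some features earn partial credit, so the sketch assumes 1 point for a fully present feature, 0.5 for a partly present one and 0 for an absent one. This scheme is an assumption made for illustration and is not taken from Lindstedt's work; the languages "LangA" and "LangB" and their feature values are invented placeholders.

```python
# Hypothetical scoring scheme: full presence = 1, partial = 0.5, absent = 0.
# "LangA", "LangB" and their feature values are invented placeholders,
# not data from Lindstedt (2000).
WEIGHTS = {"full": 1.0, "partial": 0.5, "absent": 0.0}


def balkanization_score(features):
    """Sum the weights of all feature values recorded for one language."""
    return sum(WEIGHTS[value] for value in features.values())


toy_data = {
    "LangA": {
        "postposed article": "full",
        "de-volitive future": "full",
        "infinitive loss": "partial",
    },
    "LangB": {
        "postposed article": "absent",
        "de-volitive future": "full",
        "infinitive loss": "partial",
    },
}

# Score every language, then rank from most to least "Balkanized".
scores = {lang: balkanization_score(feats) for lang, feats in toy_data.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

print(scores)
print(ranking)
```

With real data, each language would carry one entry per feature in the inventory, and the ranking would be read off the same way.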


Another language that may have been influenced by the Balkan language union is the Judeo-Spanish variant that used to be spoken by Sephardi Jews living in the Balkans. The grammatical features shared (especially regarding the tense system) were most likely borrowed from Greek.

Origins

The source of these features and the direction of their spread have long been debated, and various theories have been suggested.

Thracian, Illyrian or Dacian

Since most of these features cannot be found in languages related to those that belong to the linguistic union (such as other Slavic or Romance languages), early researchers, including Kopitar, believed they must have been inherited from the Paleo-Balkan languages (Illyrian, Thracian and Dacian), which formed the substrate for the modern Balkan languages. But since very little is known about the Paleo-Balkan languages, it cannot be determined whether the features were present in them. The strongest candidate for a shared Paleo-Balkan feature is the postposed article.

Greek

Another theory, advanced by Kristian Sandfeld in 1930, was that these features were an entirely Greek influence, under the presumption that since Greece "always had a superior civilization compared to its neighbours", Greek could not have borrowed its linguistic features from them. However, no ancient dialects of Greek possessed Balkanisms, so that the features shared with other regional languages appear to be post-classical innovations. Also, Greek appears to be only peripheral to the Balkan linguistic union, lacking some important features, such as the postposed article. Nevertheless, several of the features that Greek does share with the other languages (loss of dative, replacement of infinitive by subjunctive constructions, object clitics, formation of future with auxiliary verb "to want") probably originated in Medieval Greek and spread to the other languages through Byzantine influence.

Latin and Romance

The Roman Empire ruled all of the Balkans, and a local variety of Latin may have left its mark on all the languages there, which later formed the substrate for the Slavic newcomers. This was proposed by Georg Solta. The weak point of this theory is that other Romance languages have few of the features, and there is no proof that the Balkan Romans were isolated for long enough to develop them. An argument in favour of the theory is the structural borrowings or "linguistic calques" into Macedonian from Aromanian, which could be explained by Aromanian being a substrate of Macedonian, but this still does not explain the origin of these innovations in Aromanian. The analytic perfect with the auxiliary verb "to have" (which Balkan languages share with Western European languages) is the only feature whose origin can fairly safely be traced to Latin.

Multiple sources

The most commonly accepted theory, advanced by Polish scholar Zbigniew Gołąb, is that the innovations came from different sources and the languages influenced each other: some features can be traced from Latin, Slavic or Greek languages, while others, particularly features that are shared only by Romanian, Albanian, Macedonian and Bulgarian, could be explained by the substratum kept after Romanization (in the case of Romanian) or Slavicization (in the case of Bulgarian). Albanian was influenced by both Latin and Slavic, but it kept many of its original characteristics.

Several arguments favour this theory. First, throughout the turbulent history of the Balkans, many groups of people moved to places inhabited by people of another ethnicity. These small groups were usually assimilated quickly and sometimes left marks in the new language they acquired. Second, the use of more than one language was common in the Balkans before the modern age, and a drift in one language would quickly spread to other languages. Third, the dialects that have the most "balkanisms" are those in regions where people had contact with speakers of many other languages.

(Old) Albanian

According to the central hypothesis of a project undertaken by the Austrian Science Fund FWF, Old Albanian had a significant influence on the development of many Balkan languages. Intensive research now aims to confirm this theory. This little-known language is being researched using all available texts before a comparison with other Balkan languages is carried out. The outcome of this work will include the compilation of a lexicon providing an overview of all Old Albanian verbs.
As project leader Dr. Schumacher explains, the research is already bearing fruit: "So far, our work has shown that Old Albanian contained numerous modal levels that allowed the speaker to express a particular stance to what was being said. Compared to the existing knowledge and literature, these modal levels are actually more extensive and more nuanced than previously thought. We have also discovered a great many verbal forms that are now obsolete or have been lost through restructuring - until now, these forms have barely even been recognized or, at best, have been classified incorrectly." These verbal forms are crucial to explaining the linguistic history of Albanian and its internal usage.
However, they can also shed light on the reciprocal relationship between Albanian and its neighbouring languages. The researchers are following various leads which suggest that Albanian played a key role in the Balkan Sprachbund. For example, it is likely that Albanian is the source of the suffixed definite article in Romanian, Bulgarian and Macedonian, as this has been a feature of Albanian since ancient times.

Timeline of contacts

The earliest contact was most likely between the Proto-Romanians and Proto-Albanians (1st to 5th century AD). This theory is supported by the Albanian vocabulary borrowed from Balkan Latin, as well as by the Romanian substrate, which has words cognate to Albanian words.

The exact area where the contact occurred is debated, with proposals ranging from northern Albania to Transylvania. For more, see Origin of Romanians and Origin of Albanians. All Romanian varieties (from the Republic of Moldova to the Vlachs of Serbia) are part of the sprachbund, which shows that the contact happened before they diverged.

The invasion of the Slavs led to a period of migrations throughout the Balkans, which created multi-ethnic communities. This gave rise to the sprachbund, beginning around the 8th century; most features were present by the 12th century, but in some areas the process continued until the 17th century.

Case system

The number of cases is reduced, with several cases replaced by prepositions; the only exception is Serbian. In Bulgarian and Macedonian, on the other hand, this development has led to the loss of all cases except the vocative.

A common case system of a Balkan language is:
  • Nominative
  • Accusative
  • Dative / Genitive (merged)
  • Vocative


Syncretism of genitive and dative

In the Balkan languages, the genitive and dative cases (or corresponding prepositional constructions) undergo syncretism.

Example:
Language | Dative | Genitive
English | I gave the book to Maria. | It is Maria's book.
Albanian | Librin i'a (ja) dhashë Marisë. | Libri është i Marisë.
Aromanian | U-ded vivliapi Maria. | Easte vivlia ali Marie.
Bulgarian | Дадох книгата на Мария [dadoh knigata na Marija] | Книгата е на Мария [knigata e na Marija]
Romanian | I-am dat cartea Mariei. | Este cartea Mariei.
Romanian (colloquial for feminine, obligatory for masculine) | I-am dat cartea lui Marian. | Este cartea lui Marian.
Macedonian | Ѝ ја дадов книгата на Марија. [ì ja dadov knigata na Marija] | Книгата е на Марија. [knigata e na Marija]
Greek | Έδωσα το βιβλίο στην Μαρία. [édhosa to vivlío stin María] or Έδωσα το βιβλίο της Μαρίας. [édhosa to vivlío tis Marías] | Είναι το βιβλίο της Μαρίας. [íne to vivlío tis Marías]
Greek (with pronoun) | Της το έδωσα [tis to édhosa] 'I gave it to her.' | Είναι το βιβλίο της. [íne to vivlío tis] 'It is her book.'

Syncretism of locative and directional expressions
Language | "in Greece" | "into Greece"
Albanian | në Greqi | në Greqi
Aromanian | tu Gârția | tu Gârția
Bulgarian | в Гърция (v Gărcija) | в Гърция (v Gărcija)
Greek | στην Ελλάδα (stin Elládha) | στην Ελλάδα (stin Elládha)
Macedonian | Во Грција (vo Grcija) | Во Грција (vo Grcija)
Romanian | în Grecia | în Grecia

Future tense

The future tense is formed analytically, using an auxiliary verb or particle with the meaning "will, want". This type of future is called de-volitive and resembles the way the future is formed in English. The feature is present to varying degrees in each language. Decategorialization is less advanced in Romanian voi and in Serbian ću, ćeš, će, where the future marker is still an inflected auxiliary. In Modern Greek, Bulgarian, Macedonian, and Albanian, decategorialization and erosion have given rise to an uninflected tense form, in which the frozen 3rd person singular of the verb has turned into an invariable particle followed by the main verb inflected for person.
Language | Variant | Formation | Example: "I'll see"
Albanian | Tosk | "do" (invariant) + subjunctive | Do të shoh
Albanian | Gheg | "kam" (conjugated) + infinitive | Kam me pa
Aromanian | - | "va" (invariant) + subjunctive | Va s-ved
Greek | - | "θα" (invariant) + subjunctive | Θα δω / βλέπω (tha dho / vlépo); "I'll see / be seeing"
Bulgarian | - | "ще" (invariant) + present tense | Ще видя (shte vidya)
Macedonian | - | "ќе" (invariant) + present tense | Ќе видам (kje vidam)
Serbian | literary standard | "хтети/hteti" (conjugated) + infinitive | Ја ћу видети (видећу) (ja ću videti [videću])
Serbian | colloquial | "хтети/hteti" (conjugated) + subjunctive | Ја ћу да видим (ja ću da vidim)
Romanian | literary standard | "a voi" (conjugated) + infinitive | Voi vedea/vedere (Note: Compare to Spanish "voy a ver")
Romanian | colloquial | "o" (invariant) + subjunctive | O să văd
Romanian | colloquial alternate | "a avea" (conjugated) + subjunctive | Am să văd
Romanian | archaic | "va" (invariant) + subjunctive | Va să văd
Romani (Erli) | - | "ka" (invariant) + subjunctive | Ka dikhav
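
The pattern in the table, a future marker followed by a form of the main verb, can be sketched in a few lines of code. The markers and verb forms below are copied from the table; the dictionary layout, the selection of four languages and the function names are illustrative assumptions, not an established linguistic tool.

```python
# Future markers and first-person forms of "see", copied from the table above.
# "invariant" records whether the table lists the marker as an invariant
# particle (True) or as a conjugated auxiliary (False).
# The data structure and function names are illustrative only.
FUTURE = {
    "Albanian (Tosk)": {"marker": "do", "invariant": True, "verb": "të shoh"},
    "Bulgarian": {"marker": "ще", "invariant": True, "verb": "видя"},
    "Macedonian": {"marker": "ќе", "invariant": True, "verb": "видам"},
    "Romanian (literary)": {"marker": "voi", "invariant": False, "verb": "vedea"},
}


def future_first_singular(language):
    """Build "I'll see" as marker + verb form, capitalising the first letter."""
    entry = FUTURE[language]
    phrase = f"{entry['marker']} {entry['verb']}"
    return phrase[0].upper() + phrase[1:]


def languages_with_invariant_marker():
    """List the languages whose future marker no longer inflects."""
    return [lang for lang, entry in FUTURE.items() if entry["invariant"]]


for lang in FUTURE:
    print(lang, "->", future_first_singular(lang))
```

For a person other than the first singular, only the verb form would change in the languages with an invariant particle, whereas in literary Romanian the auxiliary itself would change.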

Analytic perfect tense

The analytic perfect tense is formed in the Balkan languages with the verb "to have" and, usually, a past passive participle, similarly to the construction found in Germanic and other Romance languages: e.g. Romanian am promis "I have promised", Albanian kam premtuar "I have promised". A somewhat less typical case is Greek, where the verb "to have" is followed by the so-called απαρέμφατο ('invariant form', historically the aorist infinitive): έχω υποσχεθεί. However, a completely different construction is used in Bulgarian and Serbian, which have inherited from Common Slavic an analytic perfect formed with the verb "to be" and the past active participle: обещал съм, obeštal sǎm (Bul.) / обећао сам, obećao sam (Ser.), "I have promised" (lit. "I am one who has promised"). On the other hand, Macedonian, the third Slavic language in the Sprachbund, is like Romanian and Albanian in that it uses the typical Balkan construction consisting of the verb "to have" and a past passive participle (имам ветено, imam veteno, "I have promised").

Avoidance or loss of infinitive

The infinitive (common in other languages related to some of the Balkan languages, such as Romance and Slavic) is generally replaced with subjunctive constructions, following an early Greek innovation.
  • in Bulgarian, Macedonian and Tosk Albanian, the loss of the infinitive is complete
  • in demotic Greek, the loss of the infinitive was complete, whereas in literary Greek it was not; the natural fusion of the demotic (vernacular) form with the literary (archaic) one resulted in the creation of contemporary common Greek (Koine Neohellenic), where the infinitive, when used, is principally used as a noun (e.g. λέγειν "speaking, fluency, eloquence", γράφειν "writing", είναι "being", etc.), deriving directly from the ancient Greek infinitive formation. But its substitution by the subjunctive form where the infinitive would be used as a verb is complete. Most of the time, the subjunctive form also substitutes for the infinitive where it would be used as a noun (e.g. το να πας/το να πάει κανείς "to go, the act of going", το να δεις/βλέπεις "to see/be seeing, the act of seeing" instead of the infinitive "βλέπειν", etc.)
  • in Aromanian and Southern Serbian dialects, it is almost complete
  • in Gheg Albanian and Megleno-Romanian, it is used only in a limited number of expressions
  • in standard Romanian and Serbian, the infinitive shares many of its functions with the subjunctive. In these two languages, the infinitive will always be found in dictionaries and language textbooks. In Romanian, the long infinitives, which are identical to the Italian ones (-are, -ere, and -ire) can also be used in both formal and informal conversation.
  • Turkish as spoken in Sliven and Šumen has also almost completely lost the infinitive, clearly due to the influence of the Balkan Sprachbund.


For example, "I want to write" in several Balkan languages:
Language Example Notes
Albanian "Dua të shkruaj" as opposed to Gheg me fjet "to sleep" or me hangër "to eat"
Aromanian "Voi să-ngrăpsescu"
Macedonian "Сакам да пишувам" [sakam da pišuvam]
Bulgarian "Искам да пиша" [iskam da piša]
Modern Greek "Θέλω να γράψω" as opposed to Ancient Greek "βούλομαι γράψαι"
Romanian "Vreau să scriu" (with subjunctive) / "Vreau a scrie/scriere" (with infinitive) The use of the infinitive is preferred in writing in some cases only. In speech it is more commonly used in the northern varieties (Transylvania, Banat, and Moldova) than in the southern varieties (Wallachia) of the language.
Serbian "Želim da pišem"/"Желим да пишем" as opposed to the more literary form "Želim pisati"/"Желим писати", where pisati/писати is the infinitive. Both phrases are correct and do not create misunderstandings, although the colloquial one is more commonly used in daily conversation.
Bulgarian Turkish "isterim yazayım" In Standard Turkish in Turkey this is "yazmak istiyorum" where "yazmak" is the infinitive.
Romani (Erli) "Mangav te pišinav" Many forms of Romani add the ending -a to express the indicative present, while reserving the short form for the subjunctive serving as an infinitive: e.g. "mangava te pišinav". Some varieties outside the Balkans have been influenced by non-Balkan languages and have developed new infinitives by generalizing one of the finite forms (e.g. Slovak Romani varieties may express "I want to write" as "kamav te irinel/pisinel" - generalized third person singular - or "kamav te irinen/pisinen" - generalized third person plural).

However, a relict form of the infinitive is preserved in Bulgarian:
Language Without infinitive With relict "infinitive" Translation
Bulgarian "Недей да пишеш." "Недей писа." Don't write.
Bulgarian "Недей да ядеш." "Недей я." Don't eat.
Bulgarian "Недей да знаеш." "Недей зна." Don't know.
Bulgarian "Можете ли да ми дадете?" "Можете ли ми да?" Can you give me?

Notes: The first part of the first three examples is the prohibitive element недей ("don't", composed of не, "not", and дей, "do" in the imperative). The second part of the examples, писа, я, зна and да, are relicts of what used to be an infinitive form (писати, ясти, знати and дати respectively). This second syntactic construction is colloquial and more common in the eastern dialects. The forms usually coincide with the past aorist tense of the verb in the third person singular, as in the case of писа; those that don't coincide (as in the last three examples) are highly unusual today, but do occur, above all in older literature.

Bare subjunctive constructions

Sentences which include only a subjunctive construction can be used to express a wish, a mild command, an intention or a suggestion.

The following example renders the phrase "You should go!" in the Balkan languages, using subjunctive constructions.
Language Example Notes
Macedonian Да (си) одиш! "Оди" [odi] in the imperative is more common, and has the identical meaning.
Bulgarian Да си ходиш!
Serbian Да идеш! "Иди!" in the imperative is grammatically correct, and has the identical meaning.
Albanian Të shkosh! "Shko!" in the imperative is grammatically correct. "Të shkosh" is used in a sentence only after a modal verb, e.g. Ti duhet të shkosh (You should go), Ti mund të shkosh (You can go), etc.
Modern Greek Να πας!
Romani Te dža!
Romanian Să te duci!
  • compare with similar Spanish "¡Que te largues!"
  • in Romanian, the verb "a se duce" (to go) requires a reflexive construction, literally "take yourself (to)"
Megleno-Romanian S-ti duț!
Aromanian S-ti duț!

Postposed article

With the exception of Greek, Serbian and Romani, all languages in the union have their definite article attached to the end of the noun, instead of before it. None of the related languages (like other Romance languages or Slavic languages) share this feature, and it is thought to be either an innovation or an Albanian borrowing that spread in the Balkans.

However, each language created its own internal articles, so the Romanian articles are related to the articles (and demonstrative pronouns) in Italian, French, etc., while the Bulgarian articles are related to demonstrative pronouns in other Slavic languages.
Language Feminine (without article) Feminine (with article) Masculine (without article) Masculine (with article)
English woman the woman man the man
Albanian grua gruaja burrë burri
Aromanian muľare muľarea bărbat bărbatlu
Bulgarian жена жената мъж мъжът
Macedonian жена жената маж мажот
Romanian femeie / muiere femeia / muierea bărbat bărbatul
Torlakian жена жената муж мужът

Number formation

The Slavic way of composing the numbers between 10 and 20, e.g. "one + on + ten" for eleven, called superessive, is widespread.
Greek does not follow this.
Language The word "Eleven" compounds
Albanian "njëmbëdhjetë" një + mbë + dhjetë
Aromanian "unăspră" ună + spră
Bulgarian "единадесет" един + (н)а(д) + десет
Macedonian "единаесет" еде(и)н + (н)а(д) + (д)есет
Romanian "unsprezece" or, more commonly, "unșpe" un + spre + zece < *unu + supre + dece; unu + spre; the latter is more commonly used, even in formal speech.
Bosnian/Croatian/Serbian "jedanaest/једанаест" jedan+ (n)a+ (d)es(e)t/један + (н)а + (д)ес(е)т. This is not the case only with South Slavic languages. This word is formed in the same way in most Slavic languages, e.g. Polish - "jedenaście", Czech - "jedenáct", Slovak - "jedenásť", Russian - "одиннадцать", Ukrainian - "одинадцять", etc.
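
The "one + on + ten" pattern in the table above can be illustrated with a minimal sketch. It assumes plain concatenation of the three components, which reproduces the listed word only for the Albanian and Romanian rows; the Slavic rows involve the sound reductions that the table marks in parentheses and are left out. The function name compose_teen and the dictionary COMPONENTS are hypothetical, introduced only for this illustration.

```python
def compose_teen(unit: str, linker: str, ten: str) -> str:
    """Join unit + linker + ten by plain concatenation (a simplification)."""
    return unit + linker + ten


# Components copied from the table above. Only the two rows where plain
# concatenation yields the listed word are included (an assumption of this
# sketch, not a general rule for the other languages).
COMPONENTS = {
    "Albanian": ("një", "mbë", "dhjetë"),
    "Romanian": ("un", "spre", "zece"),
}

# Build the word for "eleven" in each language from its three parts.
eleven = {lang: compose_teen(*parts) for lang, parts in COMPONENTS.items()}

for lang, word in eleven.items():
    print(f"{lang}: {word}")
```

The sketch deliberately keeps the components as data, so the shared structure (unit, "on"-type linker, ten) is visible independently of each language's own vocabulary.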

Clitic pronouns

Direct and indirect objects are cross-referenced, or doubled, in the verb phrase by a clitic (weak) pronoun, agreeing with the object in gender, number, and case or case function. This can be found in Romanian (although mostly optional), Greek, Bulgarian, Macedonian, and Albanian. In Albanian and Macedonian, this feature shows fully grammaticalized structures and is obligatory with indirect objects and to some extent with definite direct objects; in Bulgarian, however, it is optional and therefore based on discourse. In Greek, the construction contrasts with the clitic-less construction and marks the cross-referenced object as a topic. Southwest Macedonia appears to be the location of innovation.

For example, "I see George" in Balkan languages:
Language Example
Albanian "E shoh Gjergjin"
Aromanian "U- ved Yioryi"
Bulgarian "Виждам го Георги." (colloquial form; see note)
Macedonian "Го гледам Ѓорѓи."
Greek "Τον βλέπω τον Γιώργο"
Romanian "Îl văd pe Gheorghe." or simply "Văd pe Gheorghe."


Note: The neutral case in normal (SVO) word order is without a clitic: "Виждам Георги." However, the form with an additional clitic pronoun is also possible in colloquial speech: "Виждам го Георги." And the clitic is obligatory in the case of a topicalized object (with OVS-word order), which serves also as the common colloquial equivalent of a passive construction. "Георги го виждам."
Adjectives

The replacement of synthetic adjectival comparative forms with analytic ones by means of preposed markers is common. These markers are:
  • Bulgarian: по-
  • Macedonian: по (prepended)
  • Albanian: më
  • Romanian: mai
  • Modern Greek: πιο (pió)
  • Aromanian: (ca)ma


Macedonian and Modern Greek have retained some of the earlier synthetic forms. In Macedonian these have become proper adjectives in their own right without the possibility of [further] comparison (ex. виш, "higher, superior"; ниж, "lower, inferior").
Suffixes

Some common suffixes can also be found across the linguistic area, such as the diminutive suffixes of the Slavic languages (Serbian, Bulgarian, Macedonian) "-ovo" and "-ica", which can also be found in Albanian, Greek and Romanian.

Loan words

Several hundred words are common to the Balkan union languages; the origin of most of them is either Greek, Bulgarian or Turkish, as the Byzantine Empire, the First Bulgarian Empire, the Second Bulgarian Empire and later the Ottoman Empire directly controlled the territory throughout most of its history, strongly influencing its culture and economics.

Albanian, Aromanian, Bulgarian, Greek, Romanian, Serbian and Macedonian also share a large number of words of various origins:
Source Source word Meaning Albanian Aromanian Bulgarian Greek Romanian Macedonian Serbian Turkish
Latin mensa table mensa (tavolinë) masã маса (masa) - masă маса (masa) - masa
Thracian rompea spear - roféja руфия (rufiya) - dialectal, meaning "thunderbolt" ρομφαία (rhomphaía) - - -
Byzantine Greek λιβάδιον (libádion) meadow lëndinë livadã ливада (livada) λιβάδι livadă ливада (livada) livada / ливада (livada) -
Byzantine Greek διδάσκαλος (didáskalos) teacher - dascal даскал (daskal) (colloquial) δάσκαλος dascăl даскал (daskal) (colloquial) даскал (daskal) (colloquial) -
Byzantine Greek κουτίον (koutíon) box kuti cutii кутия (kutiya) κουτί cutie кутија (kutija) kutija / кутија (kutija) kutu
Turkish boya paint, color bojë (but also ngjyrë) boi боя (boya) μπογιά (boyá) boia боја (boja) boja / боја (boja) boya

Calques

Apart from the direct loans, there are also many calques that were passed from one Balkan language to another, most of them between Albanian, Macedonian, Bulgarian, Greek, Aromanian and Romanian.

For example, the word "ripen" (as in fruit) is derived from the word "to bake" in Albanian (piqem, from pjek), Romanian (a (se) coace, from a coace), (rarely) Greek (ψήνομαι, from ψήνω) and Turkish (pişmek).

Another example is the wish "(∅/to/for) many years":
Language Expression Transliteration
Greek (medieval) εις έτη πολλά is eti polla
Greek (modern) χρόνια πολλά khronia polla
Latin ad multos annos  
Aromanian ti mulț ań  
Romanian la mulţi ani  
Albanian për shumë vjet
Bulgarian за много години za mnogo godini
Macedonian за многу години za mnogu godini
Serbian за много година za mnogo godina


Idiomatic expressions for "whether one (verb) or not" are formed as "(verb)-not-(verb)".
Language expression meaning
Bulgarian ще - не ще "whether one wants or not"
Greek θέλει δε θέλει "whether one wants or not"
Romanian vrea nu vrea "whether one wants or not"
Turkish ister istemez "whether one wants or not"
Serbian hteo - ne hteo / хтео - не хтео "whether one wants or not"
Albanian deshti - nuk deshti "whether one wants or not"
Macedonian сакал - не сакал / нејќел "whether one wants or not"
Aromanian i vrei - i nu vrei "whether one wants or not"

Phonetics

The main phonological features consist of:
  • the presence of an unrounded central vowel, either a mid-central schwa /ə/ or a high central vowel phoneme
    • ë in Albanian; ъ in Bulgarian; ă in Romanian; ã in Aromanian
    • In Romanian and Albanian, the schwa is obtained via centralizing unstressed /a/
      • Example: Latin camisia "shirt" > Romanian cămașă /kə.ma.ʃə/, Albanian këmishë /kə.mi.ʃə/
    • The schwa phoneme occurs across some dialects of the Macedonian language, but is absent in the standard.
  • some kind of vowel harmony in stressed syllables with differing patterns depending on the language.
    • Romanian: a mid-back vowel ends in a low glide before a nonhigh vowel in the following syllable
    • Albanian: back vowels are fronted before i in the following syllable.


This feature also occurs in Greek, but it is lacking in some of the other Balkan languages; the central vowel is found in Romanian, Bulgarian, some dialects of Albanian, and Serbian, but not in Greek or Standard Macedonian.

Less widespread features are confined largely to either Romanian or Albanian, or both:
  • frequent loss of l before i in Romanian and some Romani dialects
  • the alternation between n and r in Albanian and Romanian.
  • change from l to r in Romanian, Greek and very rarely in Bulgarian and Albanian.
  • the raising of o to u in unstressed syllables in Bulgarian, Romanian and Northern Greek dialects.
  • change from ea to e before i in Bulgarian and Romanian.

See also

  • Serbo-Croatian grammar
  • Paleo-Balkan languages
  • Balkan languages
  • Albanian grammar
  • Greek grammar
  • Bulgarian grammar
  • Macedonian grammar

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 