Genetics and Archaeogenetics of South Asia
The study of the genetics and archaeogenetics of the ethnic groups of South Asia aims at uncovering these groups' genetic history. The geographic position of India makes Indian populations important for the study of the early dispersal of all human populations on the Eurasian continent.

The Indian Genome Variation Consortium observed high levels of genetic divergence between groups of populations that cluster largely on the basis of ethnicity and language. Studies based on mtDNA variation have also reported genetic similarities amongst the various Indian sub-populations. Recent research based on molecular studies and the archaeological record has also suggested an autochthonous differentiation of the genetic structure of the populations in South Asia.

It has been found that the ancestral node of the phylogenetic tree of all the mtDNA types typically found in Central Asia, the Middle East and Europe is also to be found in South Asia at relatively high frequencies. The inferred divergence of this common ancestral node is estimated to have occurred slightly less than 50,000 years ago. In India the major maternal lineages, or mitochondrial DNA haplogroups, are M, R and U, whose coalescence times have been approximated to 50,000 BP. The major paternal lineages, represented by Y chromosomes, are haplogroups R1a, R2, H, L and J2. Many researchers have argued that Y-DNA Haplogroup R1a1 (M17) is of autochthonous Indian origin. However, proposals for a Central Asian origin for R1a1 are also quite common.

mtDNA

The largest Indian mtDNA haplogroups are M, R and U.

Arguing for the longer-term "rival Y-Chromosome model", Stephen Oppenheimer believes that it is highly suggestive that India is the origin of the Eurasian mtDNA haplogroups, which he calls the "Eurasian Eves". According to Oppenheimer it is highly probable that nearly all human maternal lineages in Central Asia, the Middle East and Europe descended from only four mtDNA lines that originated in South Asia 50,000-100,000 years ago.

Macrohaplogroup M

The macrohaplogroup M, which is considered a cluster of the proto-Asian maternal lineages, represents more than 60% of Indian mtDNA.

The M macrohaplogroup in India includes many subgroups that differ profoundly from other sublineages in East Asia, especially those of Mongoloid populations. The deep roots of the M phylogeny indicate that the Indian lineages are older than other M sublineages (in East Asia and elsewhere), suggesting an 'in-situ' origin of these sub-haplogroups in South Asia, most likely in India. These deep-rooting lineages are not language specific and are spread over all the language groups in India.

Virtually all modern Central Asian mtDNA M lineages seem to belong to the Eastern Eurasian (Mongolian) rather than the Indian subtypes of haplogroup M, which indicates that no large-scale migration to India from the present Turkic-speaking populations of Central Asia occurred. The absence of haplogroup M in Europeans, compared to its equally high frequency among Indians, eastern Asians and in some Central Asian populations, contrasts with the Western Eurasian leanings of South Asian paternal lineages.

Most of the extant mtDNA boundaries in South and Southwest Asia were likely shaped during the initial settlement of Eurasia by anatomically modern humans.

Haplogroup | Important subclades | Populations
M2 | M2a, M2b | Throughout the continent except in the Northwest; peaking in Bangladesh, Andhra Pradesh, coastal Tamil Nadu and Sri Lanka
M3 | M3a | All the subcontinent except the Northeast; 20% in Rajasthan and Madhya Pradesh, also very dense in Maharashtra, Uttar Pradesh, Haryana, Gujarat, Karnataka
M4 | M4a | Peaks in Pakistan and Kashmir
M6 | M6a, M6b | Kashmir and near the coasts of the Bay of Bengal, Sri Lanka
M18 |  | Throughout the subcontinent; peaking in Rajasthan and Andhra Pradesh
M25 |  | Widespread in most of India (but rare outside it); western Maharashtra and Kerala, Punjab

Macrohaplogroup R

The macrohaplogroup R (a very large and old subdivision of macrohaplogroup N) is also widely represented and accounts for the other 40% of Indian mtDNA. A very old and most important subdivision of it is haplogroup U, which, while also present in West Eurasia, has several subclades specific to South Asia.

Most important South Asian haplogroups within R:

Haplogroup | Populations
R2 | Distributed widely across the subcontinent
R5 | Widely distributed across most of India; peaks in coastal SW India
R6 | Widespread at low rates across India; peaks among Tamils and Kashmiris
W | Found in Pakistan, Kashmir and Punjab; it is rare further east and not to be found in India

Haplogroup U

Haplogroup U is a subgroup of macrohaplogroup R. The distribution of haplogroup U is a mirror image of that for haplogroup M: the former has not been described so far among eastern Asians but is frequent in European populations as well as among Indians. Indian U lineages differ substantially from those in Europe, and their coalescence to a common ancestor also dates back about 50,000 years.

Haplogroup | Populations
U2* | A parahaplogroup, sparsely distributed, especially in the northern half of the subcontinent. It is also found in SW Arabia.
U2a | Shows relatively high density in Pakistan and NW India but also in Karnataka, where it reaches its highest density.
U2b | Has its highest concentration in Uttar Pradesh but is also found in many other places, especially in Kerala and Sri Lanka. It is also found in Oman.
U2c | Especially important in Bangladesh and West Bengal.
U2i | Perhaps the most important numerically among U subclades in South Asia, reaching especially high concentrations (over 10%) in Uttar Pradesh, Sri Lanka, Sindh and parts of Karnataka. It also has some importance in Oman. mtDNA haplogroup U2i is dubbed "Western Eurasian" in the Bamshad et al. study but "Eastern Eurasian (mostly India specific)" in the Kivisild et al. study.
U7 | Has a significant presence in Gujarat, Punjab and Pakistan. The possible homeland of this haplogroup spans Indian Gujarat (highest frequency, 12%) and Iran, because from there its frequency declines steeply both to the east and to the west.

Y chromosome


The major Y chromosome DNA haplogroups in the subcontinent are F's descendant haplogroups R (mostly R1a and R2), L, H and J (mostly J2).

Haplogroup L

India

Haplogroup L is currently present in the Indian population at an overall frequency of ca. 7-15%. Haplogroup L is quite rare among tribal groups (ca. 5.6-7%) (Cordaux et al. 2004, Sengupta et al. 2006, Thamseem et al. 2006).

Earlier reports (e.g. Wells et al. 2001) of a very high frequency (approaching 50%) of Haplogroup L in South India appear to have been due to extrapolation from data obtained from a sample of 84 Kallars, a Tamil-speaking caste of Tamil Nadu, among whom 40 (approx. 48%) displayed the M20 mutation that defines Haplogroup L.
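
The 48% figure is plain arithmetic on the sample counts quoted above (40 carriers out of 84 men). As an illustrative sketch only, not something taken from the cited studies, the snippet below recomputes that frequency and attaches a Wilson score interval. It shows how much statistical uncertainty one sample of this size carries, even before asking whether a single caste can stand in for a whole region. The function name `wilson_interval` and the 95% level (z = 1.96) are assumptions chosen for this example.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Return the (low, high) Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% level (an assumption of this sketch).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Counts quoted in the text: 40 of 84 sampled Kallars carried M20.
carriers, sample_size = 40, 84
freq = carriers / sample_size
low, high = wilson_interval(carriers, sample_size)

print(f"observed frequency: {freq:.1%}")
print(f"approx. 95% interval: {low:.1%} to {high:.1%}")
```

The interval spans roughly twenty percentage points, which is one reason a single-group sample is a weak basis for a regional estimate.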

Pakistan

Haplogroup L3 (M357) is found frequently among the Burusho (approx. 12%) and Pashtuns (approx. 7%), with a moderate distribution among the general Pakistani population (approx. 2%). Its highest frequency can be found in southwestern Balochistan province, along the Makran coast (28%) to the Indus River delta.

L3a (PK3) is found in approximately 23% of Nuristani in northwest Pakistan.

Haplogroup H

This haplogroup is found at a high frequency in South Asia. It is generally rare outside of South Asia but is common among the Romani people, particularly the H-M82 subgroup. Haplogroup H is frequently found among populations of India, Sri Lanka, Nepal, and Pakistan.

It is a branch of Haplogroup F, and is believed to have arisen in India between 20,000 and 30,000 years ago. Its probable site of introduction is India since it is concentrated there. It seems to represent the main Y-haplogroup of the indigenous paleolithic inhabitants of India, because it is the most frequent Y-haplogroup of tribal populations (25-35%). Its presence in upper castes is quite rare (ca. 10%).

Haplogroup R2

In South Asia, the frequency of the R2 lineage is around 10-15% in India and Sri Lanka and 7-8% in Pakistan.

India

In India, the R2 percentage is around 15% among Indo-European-speaking groups, while Dravidian speakers show it at 8%. Among social groups, very high percentages are shown by the Indo-European-speaking Karmali from West Bengal at 100%, Jaunpur Kshatriya from Uttar Pradesh at 87% and Kamma Chaudhary from Andhra Pradesh at 73%.

Other than these, significantly high percentages are shown by the people of West Bengal at 23%, Hindus from New Delhi at 20% and Baniya from Bihar at 36%. It is also significantly high in many Brahmin groups including Punjabi Brahmins (25%), Bengali Brahmins (22%), Konkanastha Brahmins (20%), Chaturvedis (32%), Bhargavas (32%), Kashmiri Pandits (14%) and Lingayat Brahmins (30%).

Among tribal groups, the Lodhas of West Bengal show it at 43%, while the Bhil of Gujarat show it at 18%. The Chenchu and Pallan of South India show it at 20% and 14% respectively. The Tharu of North India show it at 17%.

North Indian Muslims have a frequency of 11% (Sunni) and 9% (Shia), while Dawoodi Bohra Muslims in the western state of Gujarat have a frequency of 16% and Mappla Muslims of South India have a frequency of 5%. This lineage also accounts for 5% of Punjabi males.

Pakistan

The R2 haplogroup is found in 14% of the Burusho people. Among the Hunza it is found at 18%, while the Parsis show it at 20%.

Nepal

In Nepal, R2 percentages range from 2% to 26% within different groups under various studies. Newars show a significantly high frequency of 26%, while people of Kathmandu show it at 10%.

Haplogroup R1a1

In South Asia R1a1 has been observed often with high frequency in a number of demographic groups.
Its parent clade Haplogroup R1a is believed to have its origins in the Indus Valley or the Eurasian Steppe, whereas its successor clade R1a1 has the highest frequency and time depth in South Asia, making it a possible locus of origin.

However, the uneven distribution of this haplogroup among South Asian castes and tribal populations makes a Central Eurasian origin of this lineage a strong possibility as well.

India

In India, a high percentage of this haplogroup is observed in West Bengal Brahmins (72%) to the east, Konkanastha Brahmins (48%) to the west, Khatris (67%) in the north and Iyengar Brahmins (31%) in the south. It has also been found in several South Indian Dravidian-speaking tribals, including the Chenchu (26%) and Valmikis of Andhra Pradesh as well as the Kallar of Tamil Nadu, suggesting that M17 is widespread in these Southern Indian tribes.

Besides these, studies show high percentages in geographically distant groups in India such as Manipuris (50%) in the extreme North East and in Punjab (47%) to the extreme North West.

Pakistan

In Pakistan it is found at 71% among the Mohanna of Sindh Province to the south and 46% among the Baltis of Gilgit-Baltistan to the north.

Sri Lanka

In Sri Lanka, 13% of the Sinhalese people were found to be R1a1a (M17) positive.

Haplogroup J2

Haplogroup J2 is likely to reflect Neolithic expansion from the Middle East to the subcontinent. J2 is almost absent from tribals, but occurs among some Austro-Asiatic tribals (11%). The frequency of J2 is higher in South Indian castes (19%) than in North Indian castes (11%) or Pakistan (12%). J2 appears at 20% among the Yadavas of South India, while among the Lodhas of West Bengal it is 32%.

Autosomal markers

Basu et al. (2003) emphasize that the combined results from mtDNA, Y-chromosome and autosomal markers suggest that "(1) there is an underlying unity of female lineages in India, indicating that the initial number of female settlers may have been small; (2) the tribal and the caste populations are highly differentiated; (3) the Austro-Asiatic tribals are the earliest settlers in India, providing support to one anthropological hypothesis while refuting some others; (4) a major wave of humans entered India through the northeast; (5) the Tibeto-Burman tribals share considerable genetic commonalities with the Austro-Asiatic tribals, supporting the hypothesis that they may have shared a common habitat in southern China, but the two groups of tribals can be differentiated on the basis of Y-chromosomal haplotypes; (6) the Dravidian tribals were possibly widespread throughout India before the arrival of the Indo-European-speaking nomads, but retreated to southern India to avoid dominance; (7) formation of populations by fission that resulted in founder and drift effects have left their imprints on the genetic structures of contemporary populations; (8) the upper castes show closer genetic affinities with Central Asian populations, although those of southern India are more distant than those of northern India; (9) historical gene flow into India has contributed to a considerable obliteration of genetic histories of contemporary populations so that there is at present no clear congruence of genetic and geographical or sociocultural affinities."

The geographical origins of the major Y-chromosomal lineages of South Asia

The South Asian Y-chromosomal gene pool is characterized by five major lineages: R1a, R2, H, L and J2. The geographical origins proposed for each lineage by individual studies are listed below:
Major South Asian Y-chromosomal lineages: | H | J2 | L | R1a | R2
Basu et al. (2003) | no comment | no comment | no comment | Central Asia | no comment
Kivisild et al. (2003) | India | Western Asia | India | Southern and Western Asia | South-Central Asia
Cordaux et al. (2004) | India | West or Central Asia | Middle Eastern | Central Asia | South-Central Asia
Sengupta et al. (2006) | India | The Middle East and Central Asia | South India | North India | North India
Thanseem et al. (2006) | India | The Levant | The Middle East | Southern and Central Asia | Southern and Central Asia
Sahoo et al. (2006) | South Asia | The Near East | South Asia | South or West Asia | South Asia
Mirabal et al. (2009) | no comment | no comment | no comment | Northwestern India or Central Asia | no comment
Zhao et al. (2009) | India | The Middle East | The Middle East | Central Asia or West Eurasia | Central Asia or West Eurasia
Sharma et al. (2009) | no comment | no comment | no comment | South Asia | no comment
Thangaraj et al. (2010) | South Asia | The Near East | The Near East | South Asia | South Asia
Stepanov et al. (2011) | no comment | no comment | no comment | Eastern Europe | no comment

The ethno-racial composition of the modern Indian population

According to the Indian Genome Variation Consortium (2005), the population of the subcontinent can be divided into four morphological types: Caucasoids in the north, Mongoloids in the northeast, Australoids in the south and Negritos largely restricted to the Andaman Islands; however, these groups tend to overlap because of admixture. The majority of genetic differences among Indians appear to be distributed along caste lines rather than along ethnic lines, although genetic differences do exist between predominantly Indo-European-speaking northern and predominantly Dravidian-speaking southern Indian populations, as was also observed by Reich et al. (2009).

In 2008, the Indian Genome Variation Consortium produced another study, this time emphasizing the significant genetic differentiation that exists between Dravidian-speaking, Indo-European-speaking, Tibeto-Burman-speaking and Austro-Asiatic-speaking populations. The researchers write: "Thus, although there are no clear geographical grouping of populations, ethnicity (tribal/nontribal) and language seem to be the major determinants of genetic affinities between the populations of India. This is concordant with an earlier finding based on allele frequencies at blood group, serum protein and enzyme loci (Piazza et al. 1980)." The authors further observe that "it is contended that the Dravidian speakers, now geographically confined to southern India, were more widespread throughout India prior to the arrival of the Indo–European speakers (Thapar 1966). They, possibly after a period of social and genetic admixture with the Indo–Europeans, retreated to southern India, a hypothesis that has been supported by mitochondrial DNA analyses (Basu et al. 2003). Our results showing genetic heterogeneity among the Dravidian speakers further supports the above hypothesis. The Indo–European speakers also exhibit a similar or higher degree of genetic heterogeneity possibly because of different extents of admixture with the indigenous populations over different time periods after their entry into India. It is surprising that in spite of such a high levels of admixtures, the contemporary ethnic groups of India still exhibit high levels of genetic differentiation and substructuring."

"Reconstructing Indian Population History"

In a major study (2009) using over 500,000 biallelic autosomal markers, Reich et al. hypothesized that the modern Indian population was the result of admixture between two genetically divergent ancestral populations. These two "reconstructed" ancient populations were termed "Ancestral South Indians" (ASI) and "Ancestral North Indians" (ANI). Ancestral South Indians largely correspond with Dravidian-speaking populations, whereas Ancestral North Indians largely correspond with Indo-European-speaking populations. According to Reich et al.: "ANI ancestry is significantly higher in Indo-European than Dravidian speakers, suggesting that the ancestral ASI may have spoken a Dravidian language before mixing with the ANI."

Furthermore, Reich et al. observe: "It is tempting to assume that the population ancestral to ANI and CEU spoke 'Proto-Indo-European', which has been reconstructed as ancestral to both Sanskrit and European languages, although we cannot be certain without a date for ANI–ASI mixture."
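
The two-source admixture model behind the ANI/ASI reconstruction can be illustrated with a toy calculation. The sketch below is not the method used in the 2009 study; it assumes a simple linear mixture, in which a group's allele frequency at each marker is a weighted average of the two source frequencies, and recovers the weight by least squares. The function name `estimate_admixture` and every frequency shown are hypothetical, synthetic values chosen only for illustration.

```python
def estimate_admixture(p_mixed, p_source_a, p_source_b):
    """Least-squares estimate of alpha in the assumed linear model
    p_mixed ~= alpha * p_source_a + (1 - alpha) * p_source_b,
    pooled over all markers."""
    numerator = 0.0
    denominator = 0.0
    for pm, pa, pb in zip(p_mixed, p_source_a, p_source_b):
        diff = pa - pb                  # how far apart the sources are at this marker
        numerator += (pm - pb) * diff
        denominator += diff * diff
    if denominator == 0.0:
        raise ValueError("source populations are identical at every marker")
    return numerator / denominator


# Synthetic allele frequencies (NOT real data) for two hypothetical sources.
p_ani = [0.80, 0.10, 0.55, 0.30]
p_asi = [0.20, 0.60, 0.25, 0.70]

# Build a synthetic admixed group with a known 60% contribution from source A.
alpha_true = 0.6
p_group = [alpha_true * a + (1 - alpha_true) * b for a, b in zip(p_ani, p_asi)]

alpha_hat = estimate_admixture(p_group, p_ani, p_asi)
print(round(alpha_hat, 3))
```

In practice the ancestral frequencies are not observed directly, so published analyses rely on more elaborate statistics than this sketch.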

In a more recent session paper, Moorjani et al. report that a "major ANI-ASI mixture occurred in the ancestors of both northern and southern Indians 1,200-3,500 years ago, overlapping the time when Indo-European languages first began to be spoken in the subcontinent".

Similarly, an earlier study conducted by Watkins et al. (2008) states:
"The historical record documents an influx of Vedic Indo-European-speaking immigrants into northwest India starting at least 3500 years ago. These immigrants spread southward and eastward into an existing agrarian society dominated by Dravidian speakers. With time, a more highly-structured patriarchal caste system developed. India is now broadly characterized by Indo-European (e.g. Hindi, Urdu, and Punjabi) speaking populations found in the central and northern regions and by Dravidian (e.g. Tamil, Telugu, and Kannada) speaking populations in the southern and southeastern regions. ... Although other interpretations may be possible, our data are consistent with a model in which nomadic populations from northwest and central Eurasia intercalated over millennia into an already complex, genetically diverse set of subcontinental populations. As these populations grew, mixed, and expanded, a system of social stratification likely developed in situ, spreading to the Indo-Gangetic plain, and then southward over the Deccan plateau. A strong patrilineal social structure, accompanied by a developing practice of caste endogamy, may have contributed to an asymmetric apportioning of Y-chromosome, autosomal, and to a lesser extent, mtDNA lineages."


The geneticist P. P. Majumder (2010) has argued that the findings of Reich et al. (2009) concerning Indo-Aryan expansion into the Indian subcontinent are in remarkable concordance with previous research using mtDNA and Y-DNA:
"Central Asian populations are supposed to have been major contributors to the Indian gene pool, particularly to the northern Indian gene pool, and the migrants had supposedly moved into India through what is now Afghanistan and Pakistan. Using mitochondrial DNA variation data collated from various studies, we have shown that populations of Central Asia and Pakistan show the lowest coefficient of genetic differentiation with the north Indian populations, a higher differentiation with the south Indian populations, and the highest with the northeast Indian populations. Northern Indian populations are genetically closer to Central Asians than populations of other geographical regions of India... . Consistent with the above findings, a recent study using over 500,000 biallelic autosomal markers has found a north to south gradient of genetic proximity of Indian populations to western Eurasians. This feature is likely related to the proportions of ancestry derived from the western Eurasian gene pool, which, as this study has shown, is greater in populations inhabiting northern India than those inhabiting southern India. In general, the Central Asian populations are genetically closer to the higher-ranking caste populations than to the middle- or lower-ranking caste populations... . Among the higher-ranking caste populations, those of northern India are, however, genetically much closer than those of southern India. Phylogenetic analysis of Y-chromosomal data collated from various sources yielded a similar picture. Higher-ranking caste populations have been the torch-bearers of the Hindu caste system that was formalized by the Indo-European immigrants. It is likely, therefore, that there was a greater proportion of admixture between higher-ranking caste populations and Indo-Europeans. 
The fact that high-ranked caste populations inhabiting southern India do not exhibit as much affinity with central Asian populations as those of northern India may be explained by the recent finding that the south Indian, Dravidian speaking, populations may have admixed with north Indian populations bearing ancestral signatures of the western Eurasian gene pool more recently."
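
The "coefficient of genetic differentiation" mentioned in the quotation can be made concrete with a small example. The quotation does not say which statistic was used; the sketch below assumes Nei's G_ST for a single biallelic marker, G_ST = (H_T - H_S) / H_T, with equally weighted populations. The function names and allele frequencies are hypothetical and do not come from any of the studies cited.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic marker with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def gst(freqs):
    """Nei's G_ST for one biallelic marker across equally weighted populations.

    freqs: list of allele frequencies, one per population.
    """
    if not freqs:
        raise ValueError("at least one population is required")
    # H_S: average within-population heterozygosity.
    h_s = sum(heterozygosity(p) for p in freqs) / len(freqs)
    # H_T: heterozygosity of the pooled population (mean allele frequency).
    p_bar = sum(freqs) / len(freqs)
    h_t = heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0  # marker is fixed everywhere, so there is no differentiation to measure
    return (h_t - h_s) / h_t


# Synthetic frequencies (NOT real data): a closely related pair and a more distant pair.
close_pair = gst([0.45, 0.55])
distant_pair = gst([0.20, 0.80])
print(close_pair < distant_pair)
```

A lower value indicates that the compared populations are genetically more similar at the marker, which is the sense in which the quotation ranks north, south and northeast Indian populations against Central Asia and Pakistan.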


The author summarizes his findings by stating that:
"Within India, consistent with social history, extant populations inhabiting northern regions show closer affinities with Indo-European speaking populations of central Asia than those inhabiting southern regions. Extant southern Indian populations may have been derived from early colonizers arriving from Africa along the southern exit route. The higher-ranked caste populations, who were the torch-bearers of Hindu rituals, show closer affinities with central Asian, Indo-European speaking, populations. ..."

See also

  • Y-DNA haplogroups in South Asian populations
  • Ethnic groups of South Asia
  • Early human migrations
  • Peopling of India
  • Archaeogenetics
  • Genetic history of indigenous peoples of the Americas

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 