Tuesday, July 29, 2014

Analysis of Upper Paleolithic Siberian forager Afontova Gora-2

Apparently, this 15,000-year-old genome from Central Siberia is heavily contaminated with modern DNA (see section SI 5.2.3 in Raghavan et al. 2013). However, apart from MA-1, it's the only Ancient North Eurasian (ANE) sample available right now, so I thought I'd take a closer look at it.

The shared drift statistics using f3(Mbuti;AG-2,Test) do suggest contamination from a present-day Eastern European source, with, for instance, Ukrainians from Lviv showing an unexpectedly strong signal (third on the list below just behind Pima Indians). This makes sense since AG-2 was probably mainly handled by Slavic-speaking Soviet archaeologists and museum staff.
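For readers unfamiliar with the statistic, here is a minimal sketch of how an outgroup f3 of the form f3(Mbuti;AG-2,Test) can be estimated from per-SNP allele frequencies. It is not the software used for the spreadsheet: the function name `outgroup_f3` and all of the frequencies are invented for illustration, and the estimator is the naive one, without the finite-sample correction or the standard errors that dedicated tools report.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Naive f3(outgroup; A, B): mean over SNPs of (o - a) * (o - b)."""
    products = [
        (o - a) * (o - b)
        for o, a, b in zip(outgroup, pop_a, pop_b)
        # Skip SNPs where any of the three populations has no call
        if None not in (o, a, b)
    ]
    if not products:
        raise ValueError("no SNPs with data in all three populations")
    return sum(products) / len(products)


# Toy allele frequencies (hypothetical values, not real data)
mbuti = [0.10, 0.80, 0.50, 0.30, None]
ag2 = [0.60, 0.20, 0.90, 0.30, 0.50]
test_near = [0.70, 0.10, 1.00, 0.40, 0.50]  # drifts away from Mbuti together with AG-2
test_far = [0.20, 0.70, 0.60, 0.30, 0.50]   # stays close to the outgroup

f3_near = outgroup_f3(mbuti, ag2, test_near)
f3_far = outgroup_f3(mbuti, ag2, test_far)
print(f"f3(Mbuti;AG-2,near) = {f3_near:.4f}")
print(f"f3(Mbuti;AG-2,far)  = {f3_far:.4f}")
```

A higher value means that AG-2 and the test population share more drift since splitting from the outgroup, which is why contamination from a present-day source can push that source up the list.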

Shared drift with AG-2 (spreadsheet)

Indeed, in the Eurogenes K15 test, the Baltic component is the most important for AG-2, and this component is modal among Balto-Slavic populations. However, AG-2 fails to register any Mediterranean-specific admixture. At the very least, this is interesting, because all present-day Europeans show this influence. In fact, out of the four K15 components typical of the Near East, only the West Asian component appears for AG-2. This component actually peaks in the Caucasus, where today ANE reaches its highest levels in West Eurasia.

Eurogenes K15 results for AG-2

North_Sea 11.3
Atlantic 0.01
Baltic 22.83
Eastern_Euro 20.53
West_Med 0
West_Asian 4.63
East_Med 0
Red_Sea 0
South_Asian 13.9
Southeast_Asian 0
Siberian 5.97
Amerindian 16.07
Oceanian 4.77
Northeast_African 0
Sub-Saharan 0

4 Ancestors Oracle results based on the K15 ancestry proportions suggest that AG-2 might simply be a more westerly ANE sample than MA-1, perhaps with some European forager ancestry. Below are a few examples of the best population approximations; note the strong showing by StoraFörvar11, a Mesolithic genome from near Gotland, Sweden. The full list can be seen here.

1 Brahmin_UP+North_Amerindian+StoraFörvar11+StoraFörvar11 @ 8.364493
2 Burusho+North_Amerindian+StoraFörvar11+StoraFörvar11 @ 8.411899
3 MA-1+MA-1+StoraFörvar11+Tatar @ 8.427561
4 Kshatriya+North_Amerindian+StoraFörvar11+StoraFörvar11 @ 8.437549
5 Gujarati+North_Amerindian+StoraFörvar11+StoraFörvar11 @ 8.45127
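The kind of search behind a list like this can be sketched as follows, assuming that the score after the @ sign is a Euclidean distance between the sample's ancestry proportions and the plain average of four reference populations. That assumption, the function names, the three-component toy proportions and the RefA/RefB/RefC labels are all hypothetical; this is not the actual Oracle code or its reference data.

```python
from itertools import combinations_with_replacement
from math import sqrt


def distance(target, mix):
    """Euclidean distance between two vectors of ancestry proportions."""
    return sqrt(sum((t - m) ** 2 for t, m in zip(target, mix)))


def four_ancestors(target, references, top=5):
    """Rank every set of four reference populations (repeats allowed)."""
    results = []
    for combo in combinations_with_replacement(sorted(references), 4):
        # Model the sample as the plain average of its four "ancestors"
        columns = zip(*(references[name] for name in combo))
        mix = [sum(values) / 4 for values in columns]
        results.append((distance(target, mix), "+".join(combo)))
    return sorted(results)[:top]


# Toy proportions in percent (hypothetical, three components only)
references = {
    "RefA": [100.0, 0.0, 0.0],
    "RefB": [0.0, 100.0, 0.0],
    "RefC": [0.0, 0.0, 100.0],
}
target = [50.0, 25.0, 25.0]

best = four_ancestors(target, references)
for rank, (dist, label) in enumerate(best, start=1):
    print(f"{rank} {label} @ {dist:.6f}")
```

Because repeats are allowed, a population that appears twice in a combination, like StoraFörvar11 in the list above, contributes half of the modelled ancestry.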

Keep in mind, however, that I was only able to use around 13K SNPs that overlapped with my dataset for all of the tests here. Perhaps these markers were much less affected by contamination than the rest? In any case, here are three Principal Component Analyses (PCA) to finish things off. Again, AG-2 basically looks like the genome of a late ANE survivor with a solid contribution from indigenous European foragers. Hopefully this can be confirmed or debunked in the near future with a much higher quality sequence of its genome.

See also...

Analysis of Mesolithic Swedish forager StoraFörvar11

Wednesday, March 26, 2014

The story of R1a: the academics flounder on

There's been a lot of horseshit published over the years about Y-chromosome haplogroup R1a, which just happens to be my haplogroup. That includes academic papers in journals like PLoS ONE and Nature. My advice is, take all of that stuff with a very large pinch of salt and just look here for updates.

Indeed, a new paper on the phylogeography of R1a appeared at the Nature website today: Underhill et al. 2014. It's actually a much better effort than anything else on the topic at academic level thus far, but certainly not without issues.

For instance, the authors failed to include two well known and very important R1a subclades in their analysis: the Northwest European-specific R1a-CTS4385 and the East and Central European-specific R1a-Z280. As a result, the former is lumped with R1a-M417* and the latter with R1a-Z282*. In fact, Z280 is shown to be above Z282 in the topology of R1a-M420 (see Figure 1 here), which is plain wrong. These are major oversights and mean that this study is not a very useful resource as far as the phylogeography of European R1a is concerned.

But the paper does show a couple of interesting things. For instance, the maps below offer the best illustration to date of the dichotomy between the European-specific R1a-Z282 and Asian-specific R1a-Z93.

However, these are very closely related subclades, sharing the Z645 mutation (unfortunately not mentioned in the paper), and both reaching high frequencies among Indo-European speakers. It's therefore plausible that groups carrying these markers expanded to the west and east from a zone between their current hotspots, possibly the Volga-Ural region, rather recently.

Indeed, these migrations had to have happened after 4800-6800 YBP, which is the age of R1a-M417 reported by Underhill et al., and backed up by estimates from genetic genealogists using, among other things, complete R1a sequences (see here). In other words, the rapid expansions of R1a-Z282 and R1a-Z93 appear to have taken place from more or less the same region during the generally accepted early Indo-European timeframe, making them excellent candidates for paternal markers of the early Indo-European dispersals.

At the same time, the paucity of R1a-Z93 and derived lineages in Europe, including Eastern Europe, suggests that historic migrations originating in East and Central Asia, like those of the early Turks, had a negligible effect on the paternal ancestry of modern Europeans. This shows very clearly on the PCA in Figure 4 (see here).


Underhill et al., The phylogenetic and geographic structure of Y-chromosome haplogroup R1a, European Journal of Human Genetics, advance online publication, 26 March 2014; doi:10.1038/ejhg.2014.50

See also...

R1a-Z93 from Bronze Age Mongolia

Afghan Hindu Kush: a genetic sink

Saturday, March 15, 2014

PCA of ancient European mtDNA

The recent Wilde et al. paper on the ancient DNA of Eastern European steppe nomads included mitochondrial DNA (mtDNA) data for just over 60 of the studied individuals. Below is a Principal Component Analysis (PCA) featuring these samples, marked collectively as KGU, alongside the dataset from last year's Brandt et al. study on the genetic origins of Central Europeans.

Note that KGU falls closest to the Bernburg (BEC) and Unetice (UC) samples from Neolithic and Bronze Age eastern Germany, respectively. This is probably because all of these groups have similar levels of mtDNA haplogroups U5a and H. Moreover, UC is thought to be an Indo-European archaeological culture with origins in Eastern Europe. On the other hand, Brandt et al. hypothesized that BEC might have been of Scandinavian origin.

The Central European metapopulation (CEM) is composed of present-day individuals from Austria, Germany, Poland and the Czech Republic. Its position on the PCA plot suggests to me that modern Central Europeans are largely derived from Kurgan nomads, Bell Beakers from Iberia (BBC), and remnants of Neolithic farmers from the Near East, at least in terms of maternal ancestry.

In other words, I'd say the result correlates well with the findings of Brandt et al., who posited that long-range migrations from eastern and western Europe into the heart of the continent, particularly during the late Neolithic, played an important role in the formation of the modern Central European mtDNA gene pool.

Citations and credits...

Thanks to Eurogenes Project member PL16 for the PCA

Wilde et al., Direct evidence for positive selection of skin, hair, and eye pigmentation in Europeans during the last 5,000 y, PNAS, Published online before print on March 10, 2014, DOI: 10.1073/pnas.1316513111

Guido Brandt, Wolfgang Haak et al., Ancient DNA Reveals Key Stages in the Formation of Central European Mitochondrial Genetic Diversity, Science 11 October 2013: Vol. 342 no. 6155 pp. 257-261 DOI: 10.1126/science.1241844

See also...

Extreme positive selection for light skin, hair and eyes on the Pontic-Caspian steppe...or not

Sunday, February 23, 2014

Genetic affinities of Estonian Poles

The Estonian Biocentre has a new genotype dataset available from the recently released "Khazar" preprint (see here). The samples include Poles from Estonia, so I ran a PCA to see whether there was a clear difference between them and their ethnic kin from Poland in terms of genome-wide genetic structure. This doesn't appear to be the case, except for a few individuals who probably have significant Estonian and/or northwest Russian ancestry (the several northernmost and easternmost Polish_Estonian samples on the plots below). It's an interesting result, considering that, as far as I know, most Estonian Poles are not of recent Polish origin, but have roots in the East Baltic dating back to the Polish-ruled Duchy of Livonia of the 1600s. Please note, the plots were rotated and stretched horizontally to fit with geography.


Behar, Doron M.; Metspalu, Mait; Baran, Yael; Kopelman, Naama M.; Yunusbayev, Bayazit; Gladstein, Ariella; Tzur, Shay; Sahakyan, Hovhannes; Bahmanimehr, Ardeshir; Yepiskoposyan, Levon; Tambets, Kristiina; Khusnutdinova, Elza K.; Kusniarevich, Aljona; Balanovsky, Oleg; Balanovsky, Elena; Kovacevic, Lejla; Marjanovic, Damir; Mihailov, Evelin; Kouvatsi, Anastasia; Triantaphyllidis, Costas; King, Roy J.; Semino, Ornella; Torroni, Antonio; Hammer, Michael F.; Metspalu, Ene; Skorecki, Karl; Rosset, Saharon; Halperin, Eran; Villems, Richard; and Rosenberg, Noah A., No Evidence from Genome-Wide Data of a Khazar Origin for the Ashkenazi Jews (2013). Human Biology Open Access Pre-Prints. Paper 41.

Monday, January 27, 2014

Poles more indigenous to Europe than Germans

This has actually been obvious for a while now, thanks to both modern and ancient DNA. But the figure below from the new Olalde et al. paper on the complete genome of a Mesolithic hunter-gatherer from Iberia illustrates it more effectively than anything else I've seen to date. Note that the Polish reference set (PL) shows significantly higher allele sharing with the ancient Iberian, La Brana 1, than do Germans (DE). In fact, only Swedes (SE) manage to better Poles in this regard. But it's also worth noting that Poles show the highest allele sharing with the two partial genomic sequences of Neolithic hunter-gatherers from Gotland, Ajv70 and Ajv52.

On the other hand, compared to Poles, Germans clearly show higher allele sharing with Gok4, the Neolithic farmer from Southern Sweden, and Otzi the Iceman from the Copper Age Tyrolean Alps. Unlike the hunter-gatherers, who are genetically more Northern European than any Europeans alive today, these ancient samples are more Mediterranean, and indeed more Near Eastern, than most present-day Europeans, which is something that can be seen clearly on the main Principal Component Analysis (PCA) from Olalde et al. below. This suggests that most of their ancestors arrived in Europe from the Near East during the Neolithic.

It's an intriguing outcome between these two large neighbouring European countries, but perhaps easily explained by geography and climate? Germany is situated west of Poland, so it has a warmer climate, and thus its territory was more heavily settled by early farmers from the Mediterranean Basin during the Neolithic. Moreover, much of what is now Germany was part of the Roman Empire, which might have facilitated gene flow between the ancestors of present-day Germans and southern Europeans.

Poles, on the other hand, show stronger genetic links to Baltic populations, especially Lithuanians and Estonians, who are arguably the most Mesolithic-like Europeans alive today (see here). In fact, if they were present on the graphs above, they'd probably easily top the allele-sharing list with La Brana 1 and all of the hunter-gatherers from Gotland. This might be due to the almost impenetrable primeval forests that once covered the areas just south and east of the Baltic, as well as the relatively cold climate in these regions.


Olalde et al., Derived immune and ancestral pigmentation alleles in a 7,000-year-old Mesolithic European, Nature (2014), doi:10.1038/nature12960

See also...

The really old Europe is mostly in Eastern Europe

Prehistoric Scandinavians genetically most similar to modern Poles

Mesolithic genome from Spain reveals markers for blue eyes, dark skin and Y-haplogroup C6

Tuesday, December 17, 2013

Near Eastern origin of Ashkenazi Levite R1a

Over at Nature Communications, Rootsi et al. report on a newly discovered Ashkenazi-specific subclade of R1a, defined by the M582 mutation. They argue that it's a marker of Near Eastern origin, and based on the comprehensive data in their paper, I'd say they're correct. However, it's important to note that this doesn't preclude an ultimate Eastern European or Central Asian source of M582 in the Near East. For instance, an R1a mutation ancestral to M582 might have been introduced by the proto-Iranians from the steppe into what is now Iran during the early Indo-European dispersals. Indeed, that's actually what Figure 1a from the study suggests (phylogenetic tree of R1a below). The paper is open access, but here are a few quotes anyway:

Haplogroup R1a-M582 was only sporadically observed in Europe, the Diaspora residence of Ashkenazi Jews. Notably, it was not identified among 2,149 samples (including 922 R1a-M198) of non-Jews from East Europe, where the Ashkenazi Jewish community flourished in recent centuries (Table 1).


Within 1,068 West/North European samples (106 R1a-M198), M582 was observed in just one German sample, and among 3,756 Central/South European samples (710 R1a-M198), it was found only in one Hungarian and one Slovakian sample (Table 1).


Among 3,739 Near Eastern samples (303 R1a-M198), R1a-M582 was identified in various populations, with the highest frequency occurring within Iranians collected from the southeastern Kerman population who self-identified as Persians, northwestern Iranian Azeri and in Cilician Anatolian Kurds, at 2.86%, 2.50% and 2.83%, respectively (Table 1). In contrast, among 2,164 samples from the Caucasus (211 R1a-M198), R1a-M582 was found in just one Nogay sample (Table 1).


Considering the historical records of Ashkenazi Jews, three potential geographic sources should be considered: the Near East, which was the geographic location for the ancient Hebrews; Europe, which was the residence of the Ashkenazi Jewish Diaspora and the region in which they evolved for nearly two millennia; and the region overlapping with the no longer extant mid-11th Century Khazarian Khaganate, whose ruling class has been suggested to have converted to Judaism. Our data render the latter source highly unlikely since the Khazarian Khaganate overlapped with the Northern Pontic-Caspian steppe and the North Caucasus region, in which just one Nogay sample carried the R1a-M582 haplogroup (Table 1). Furthermore, the Nogays, formerly a powerful Kipchak Turkic-speaking nomadic confederation, are relatively recent inhabitants of the Caucasus, and the STR haplotype of the sole R1a-M582 Nogay sample lies outside of the Levite cluster. Had the Caucasus region been the source for the Ashkenazi modal lineage, we likely would have found R1a-M582 Y-chromosomes in some of its 20 local populations examined in our sample of more than 2,000 Y-chromosomes (Table 1).


Near Eastern populations are the only populations in which haplogroup R1a-M582 was found at significant frequencies (Table 1). Moreover, the representative samples displayed substantial diversity even within this geographic region (Fig. 1b). Higher frequencies and diversities often suggest lineage autochthony.


Rootsi, S. et al. Phylogenetic applications of whole Y-chromosome sequences and the Near Eastern origin of Ashkenazi Levites. Nat. Commun. 4:2928 doi: 10.1038/ncomms3928 (2013).

See also...

R1a and R1b as markers of the Proto-Indo-European expansion: a review of ancient DNA evidence

Ancient Siberians carrying R1a1 had light eyes - take 2

New R1a1a tree