Thursday, September 4, 2014

Ancient North Eurasian (ANE) admixture across Europe & Asia


Another update: ANE is the primary cause of west to east genetic differentiation within West Eurasia

...

This is an update of a supervised ADMIXTURE analysis that I ran earlier this year looking at ANE levels throughout Asia, the results of which I posted at my other blog (see here). Anyone wanna make a map?

ANE admixture across Europe & Asia spreadsheet

My claim is that these estimates are more accurate than those we've seen recently in scientific literature. Obviously I'm referring here to Lazaridis et al. 2013/14 (see here). That's not to say that the authors of this paper don't know what they're doing. Clearly they do, but at the fine-scale there's usually room for improvement no matter who you are.

For instance, in their paper in table S14.9 they list the Basques (in fact, French Basques) as 11.4% ANE, which sounds reasonable, although perhaps a little too high considering they admit that this population can be modeled as 0% ANE. On the other hand, they estimate the "North Spanish" to be 16.3% ANE.

Now, this reference set is actually from the 1000 Genomes project, where it's listed as Spaniards from Pais Vasco (ie. Basque Country). Essentially, what this means is that these are Basques from Spain. So why would Basques from France carry only 11.4% ANE, and Basques from Spain a whopping 16.3%? Not only that, but according to Lazaridis et al., these "North Spanish" also can be modeled as 0% ANE.

Obviously, something's not quite right there. Indeed, in my spreadsheet, the very same French Basques are listed as 7.4% ANE, while the Pais Vasco Spaniards as just over 8%. Call me crazy, and many do, but I think these results actually make good sense.

By the way, I made ten synthetic samples from the ANE allele frequencies from this test, and remarkably, in all of the analyses I've run so far they behaved very much like MA-1 or Mal'ta boy, the main ANE proxy. Below, for example, is a Principal Component Analysis (PCA) of West Eurasia featuring these individuals. The result is very similar to the one I obtained with Mal'ta boy (see here).




The synthetic ANE samples are available here. Feel free to play around with them, and if you do, please let me know what you discover.
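
The post doesn't say how the synthetic samples were generated. As a minimal sketch of one plausible approach, the snippet below draws diploid genotypes independently at each marker from a list of allele frequencies. The function name, the toy frequencies and the independence assumption are all illustrative and not taken from the actual test.

```python
import random

def synthesize_samples(allele_freqs, n_samples=10, seed=42):
    """Draw synthetic diploid genotypes (0, 1 or 2 copies of an allele)
    from per-marker allele frequencies.

    Assumption: markers are sampled independently, two allele draws per
    marker. This is an illustration, not the method used for the post.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Two independent draws per marker, one for each chromosome copy.
        genotype = [int(rng.random() < p) + int(rng.random() < p)
                    for p in allele_freqs]
        samples.append(genotype)
    return samples

# Toy frequencies (hypothetical placeholders, not real ANE frequencies).
toy_freqs = [0.0, 0.25, 0.5, 1.0]
samples = synthesize_samples(toy_freqs)
```

Each synthetic individual is then just a vector of allele counts that can be fed into PCA or ADMIXTURE alongside real samples.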

As some regular visitors already know, I'm currently designing a new test for GEDmatch that will include various ancient components like ANE. Unfortunately, it might be a while before it's ready, simply because I want it to be as accurate as possible.

See also...

Eurogenes ANE K7

Corded Ware Culture linked to the spread of ANE across Europe


64 comments:

Davidski said...

Jackson, you got 15.72% ANE in this test.

Tesmos, you got 16.27%.

Shaikorth said...

Treemix picked a small MA-1 admixture edge to Japanese and this shows a small but consistent amount in them too. That's certainly encouraging.

Odd that North Han show none though, they should definitely have more than She.

Davidski said...

Some of the North Han do show a little ANE. Scroll down to the last few samples.

Shaikorth said...

Yeah that happened in the previous ANE test too. Maybe there are some southern Han in the sample or something else is going on. Even Dai could have a tiny amount of ANE but the regional drift may make that impossible to detect with ADMIXTURE.

South Asian numbers are high as expected, but I wonder what's the reason for Piramalai showing higher ANE than Selkups in ADMIXTURE even though they are not nearly as close in f3 or IBS.

Davidski said...

South Asian or maybe rather Southeast Asian ancestry (Malayan-like stuff) really seems to dampen f3 and IBS affinity to MA-1. Much like Middle Eastern ancestry.

I actually had an ASI-like cluster in this test, and the Piramalai scored about 27% in it, while the Malayans around 50%.

East Asian ancestry doesn't have the same effect.

Helgenes50 said...

I don't know whether this is correct, but by using the mean of Cantabrians and the mean of SE English I get, in my case, the same result as in the first calculator for ANE, i.e. 13.87%.

Davidski said...

I'll check your score tomorrow. But the problem with the 23andMe files after V1 and V2 is that they have poor marker overlap with most of my tests.

Tesmos said...

Davidski,

Awesome thank you for the results. What do Northern Dutch/Germans score on average?

Davidski said...

I'm not sure yet. I'll try and run more samples this weekend.

Tesmos said...

OK, btw 16.27% is roughly 1% higher than the ANE/WHG/EEF test for Europeans (I had 15.33%). Looks like this new test is more accurate. Belarusians now have significantly higher ANE than the ones in Lazaridis.

Davidski said...

Yeah, they show Belorussians as 15% and North Spanish as 16%.

I think they need to recalibrate their test.

Seinundzeit said...

David,

Wow, this seems very accurate. I honestly think you've latched onto solid ANE estimates, the results finally make sense for all of the various populations (and not only for South Asians, or only for Europeans, or only for Siberians, etc).

Out of curiosity, what did I score for the ASI-like cluster? Thanks in advance.

Davidski said...

The ASI was 14.77%. The Pathan average for that was 15%.

Seinundzeit said...

Cool!

One final question, what was Pashtun10_17Af's score for ASI?

jackson_montgomery_devoni said...

Wow this truly is amazing David!

You said this in your first post here.

''Jackson, you got 15.72% ANE in this test.''

Are you referring to me (CA1)?

barakobama said...

David do we just send you 23andme raw data to get an ANE score?

Davidski said...

Sein,

Pashtun10_17Af was 12.17% ASI. But he also had a couple per cent of East Asian admixture, slightly higher than other Pashtuns, including yourself (around 1%).

Jackson,

Yeah, I meant you CA1.

Barry,

I think it'll be better if I can get this running at GEDmatch.

jackson_montgomery_devoni said...

Thanks David. So it looks like when it comes to West Eurasians it is Sardinians who have pretty much zero ANE ancestry. Bedouins of course are not far behind. North Africans also do not really seem to have much at all.

Seinundzeit said...

David,

Thanks! This sample's results make perfect sense.

Seinundzeit said...

Also, this would be a great addition to GEDMatch.

DarthVadent2 said...

Davidski, when do you think the Eurogenes K = 6 will be available? Also, regarding ANE in Bedouins: do you think that the Bedouins inherited this ANE admixture indirectly through interbreeding with West Asians further north?

Davidski said...

There are two distinct Bedouin groups; one with low levels of ANE, and another with no detectable ANE.

I suppose the former group got their ANE from the same source as the Palestinians, Egyptians, Jordanians, and so on. But I don't know what this source was precisely. Maybe groups like the Hittites, Hyksos and Hurrians, and/or via population movements during the Muslim expansions?

We can only speculate, but it's interesting to note that religious minorities show lower ANE than Muslim groups. For instance, check out the differences between Iranians and Iranian Jews, and Lebanese Muslims and Lebanese Christians. They're small, but clearly not a coincidence.

Hopefully, the K6 will be ready early next week.

barakobama said...

Some of you might be interested in this.

http://www.theapricity.com/forum/showthread.php?138383-preliminary-Neolithic-Balkans-Y-DNA-results(Starcevo-culture)-and-other-ISBA-2014-symposium&p=2932797#post2932797

mtDNA and skin-eye color SNPs taken from east and west Europe ranging from the Upper Palaeolithic-bronze age. It's interesting that they mention a mtDNA divide between east and west Europe before the Neolithic.

They mentioned that known light skin alleles existed well before the Neolithic but were not dominant, and they seem to suggest west Europeans were very light eyed. This is already old news, but it brings the same info back several more thousand years. They mention that east Europeans (maybe some mostly ANE) had a higher percentage of light skin mutations and darker eyes than west Europeans. The eye thing is no surprise because of results from Copper Age Indo-Europeans, but the skin thing is.

Pigmentation in west Eurasia and central Asia probably didn't change at all from the Upper Palaeolithic-Neolithic. Then in the European bronze age a lot changed. West Asia though has probably been the same in that respect since the early Neolithic at least.

Matt said...

East Asian ancestry doesn't have the same effect.

Interesting in itself. Hard to see how this is possible unless East Asian shares more drift with ANE than ASI does in some way. Yet ANE and East Asian are supposed to be unadmixed descendants of different clades, and shared drift stats are supposed to be symmetrical for all ENA (whether ASI or East Asian). That's what it seems like from Onge, and the normal story is that ASI will turn out to be a member of a clade with Onge (with less drift) once all its admixtures are sorted out.

If you look at PCA with only Siberian, East Asian and Southeast Asian, does synthetic ANE sit at the same place as MA-1? Same question for a PCA of Siberian, East Asian, Southeast Asian and Native American.

Presumably the Malays with some trace amounts of ANE have an Indian line of ancestry (and Cambodians the same, except perhaps not so recent).

Population means of interest (to me) here: Ket - 30%, Selkup - 28%, Yakut - 18%, Japanese - 3%, Pathan 33%, Piramalai - 30%, Burusho - 35%, Tabassaran - 27%, Burmese - 6%, Cambodian - 2%. Koryaks and Chukchi could be interesting as well.

Davidski said...

The problem isn't ANE, it's ASI, because I don't have an ancient ASI genome to design the test with.

I think there's something Basal Eurasian, Mediterranean and/or African about it.

Davidski said...

Matt,

Chukchi are getting 25% and Koryaks 22%.

Helgenes50,

Your score is 13.66%.

Tesmos,

North Dutch are on average 16%, while South Dutch only 14%. Also, some Frisians are getting as high as 18%.

Shaikorth said...

So, it looks like populations that appear in this calculator as ANE + "East Asian" are closer to MA-1 in f3/IBS than populations with an equivalent amount of ANE but also ASI.

Do She, Han and Dai appear as ASI + East Asian hybrids and if they do, how far north does ASI extend?

Helgenes50 said...

Thanks for my result

By using percentages from the K13 oracle, as for the first calculator (WHG EEF ANE), my result was close: 13.87%. Can we use this method every time, or should we use the K15 oracle instead?
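
A rough sketch of this kind of estimate: weight each reference population's mean ANE by its share in the oracle mixture. The two population means below are the North Dutch and South Dutch averages quoted earlier in this thread. The 50/50 mixture and the function name are hypothetical, chosen only to show the arithmetic.

```python
def estimate_ane(mixture, population_ane):
    """Weighted mean of population ANE averages.

    mixture: {population: oracle share}, in any consistent unit (e.g. %).
    population_ane: {population: mean ANE %}.
    """
    total = sum(mixture.values())
    if total <= 0:
        raise ValueError("mixture shares must sum to a positive number")
    weighted = sum(share * population_ane[pop]
                   for pop, share in mixture.items())
    return weighted / total

# Population means quoted in this thread.
population_ane = {"North Dutch": 16.0, "South Dutch": 14.0}

# Hypothetical oracle result: an even two-way mixture.
mixture = {"North Dutch": 50.0, "South Dutch": 50.0}

estimate = estimate_ane(mixture, population_ane)  # 15.0
```

The estimate can only be as good as the reference means and the oracle fit behind it, so treat it as a ballpark figure rather than a substitute for running the test.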

Davidski said...

It looks like the ANE estimates arrived at via the K13 are very similar to these, give or take a percentage point. Very impressed with that actually.

As for the ASI component in this test, it's not really all that important. It was just there to help isolate ANE, along with a bunch of other clusters.

It's actually found all over Eurasia, but mostly at noise levels west of Iran and north of the Altai. In Japan it's at a steady 10%, which might be interesting.

Seinundzeit said...

David,

That's quite interesting. If possible, could you check to see what the Iranian and Kurdish averages were for ASI (just the general ballpark for these two groups)?

I guess that could map onto a Jomon-like signal for the Japanese? Very interesting stuff.

Seinundzeit said...

Also, since the ANE-cluster here is so solid/robust (acts like MA1 in PCA, appears in the right percentages for all populations), will you use the ANE allele frequencies from this run, in the new GEDMatch test?

Matt said...

Davidski: The problem isn't ANE, it's ASI, because I don't have an ancient ASI genome to design the test with.

I think there's something Basal Eurasian, Mediterranean and/or African about it.


Yeah, those are all interesting possibilities - a Basal Eurasian or African contribution would tend to depress shared drift, as long as it's an outgroup to WHG, ANE and ENA. It probably couldn't be too much like the Basal Eurasian we know of in the Middle East though, as that would increase shared drift with high BE populations.

Many people (like Maju?) suspect India to harbour either the most basal Eurasian ancestry (even more basal than what we've been calling basal in the Middle East) or a variety of basal Eurasian directly ancestral to East and West Eurasians (like node X in Lazaridis's model, a side branch leading towards which would still be an outgroup). Basically, a node off of "non_African" from Lazaridis, of some sort. This could be evidence in that direction.

In contradiction to this, ASI has previously appeared as having definite Onge / ENA affinities in Reich and Moorjani's previous work (so should be a member of the ENA clade).... but perhaps this is a consequence of ineffectively separating out West Eurasian by affinities from later admixture (maybe because the reference populations used for the model were inexact), causing a component with an even balance of East/West Eurasian affinities to spuriously lose some West Eurasian affinities and move in the East direction.

Tesmos said...

Davidski,

Thank you for the averages. Makes sense that some Frisians get 18%. They are very similar (barely distinguishable) to Danes.

Davidski said...

Sein,

The Kurds have just over 4% of this ASI, while the Iranians almost 6%.

Onur said...

David, what about Turks and Armenians?

barakobama said...

Davidski,

Maybe you should not include anyone with known South Asian ancestry in this new test (maybe that would be too exclusive and take out a lot of West Asians), since it's not known what creates South Asian-specific ancestry. I'm just worried ASI will screw up the results if people as un-South Asian as the Japanese are scoring 10%. You could also create similar tests for South Asians.

Davidski said...

Malayans are scoring around 50% of this ASI. So I'm not sure you can say that the Japanese shouldn't be showing any of it. Didn't some studies find southeast Asian admixture in Japan? I seem to remember that at least one recent study did.

Btw, Armenians are around 1-2%, which looks like noise. Turks range from 2-6%.

Seinundzeit said...

David,

Thanks! That's quite interesting, so there isn't a world of difference between western Iranians (Kurds, the various peoples of Iran) and eastern Iranians (Pashtuns, Baloch, the Pamiri peoples) when it comes to ASI. I guess the primary differentiating factor between eastern and western Iranians lies in the levels of ANE admixture.

Barry,

The cluster in question wasn't South Asian, it peaked in the Malay. It's more like Southeast Asian. Regardless, this admixture run was quite light on South Asian/South Central Asian samples. I think there were only around 10-15 South Asian/South Central Asian populations. By contrast, there were a lot of European, West Asian, Southwest Asian, North African, East African, West African, Siberian, northern Central Asian, and East Asian populations. Basically, there wasn't much room for South Asians to develop their own cluster, so your concern isn't that solid.

Gui S said...

Are you able to compute my ANE score? I am quite curious.
Also do you want this mapped?

Andrés said...

I've made a quick map.

The base map I took from Google - turned out to be a bad idea because of my total ignorance of Russian geography. I did my best to assign a dot to each ethnic group. I couldn't solve some things like the difference between "Kirgiz" people and "Kirghiz" people. But it is OK as a quick visualization.

https://docs.google.com/file/d/0B6-bZT4AwfTPMVFxa1FDZWNxcUE

All ethnic groups were sorted and divided into 4 quartiles. The upper 2 quartiles were divided again into halves. So the first two colors capture 25% each and the color scale is:

Yellow < 7.9% < light orange < 14.3% < dark orange < 17.9% < brown < 20% < scarlet < 23.2% < red
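
As a rough sketch, the colour scale above can be treated as a set of bin edges. The thresholds and colour names are the ones listed in this comment. The function name is made up here, and so is the handling of values that land exactly on a threshold (they go into the higher bin), since the comment doesn't say.

```python
from bisect import bisect_right

# Thresholds (ANE %) and colours as listed in the comment above.
THRESHOLDS = [7.9, 14.3, 17.9, 20.0, 23.2]
COLOURS = ["yellow", "light orange", "dark orange",
           "brown", "scarlet", "red"]

def colour_for(ane_percent):
    """Return the map colour for an ANE percentage.

    Assumption: a value equal to a threshold falls into the higher bin.
    """
    return COLOURS[bisect_right(THRESHOLDS, ane_percent)]

# Values quoted in the post and comments.
basque_colour = colour_for(7.4)    # French Basques
tesmos_colour = colour_for(16.27)  # Tesmos
chukchi_colour = colour_for(25.0)  # Chukchi
```

With these edges, the French Basques come out yellow, a score of 16.27% dark orange, and the Chukchi red.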

Davidski said...

Gui,

I'll post your score here later today.

And yes, a spatial map would be great.

Andres,

Thanks. The Kirgiz should be listed as Kirghiz. I'll fix that now.

Onur said...

One more question, David. Are you sure your ASI level estimates for Turks are accurate? Turks usually show noise levels or slightly above-noise levels of the South Asian component, and at levels lower than those of both Iranians and Kurds in ADMIXTURE analyses. Could the ASI levels you estimated for Turks actually be mostly capturing Mongoloid ancestry instead of ASI ancestry?

Davidski said...

There's no such thing as "Mongoloid" ancestry in this test. East Asian groups often show very different levels of ANE, ASI and ENA, with Mongolians showing all three at above noise levels.

Onur said...

So could you re-answer my question replacing my word "Mongoloid" with "ENA"?

Davidski said...

No, the test isn't confusing ASI for ENA in Turks.

Turks carry this particular ASI at above noise levels because they have ancestry from East Eurasia via Central Asia, where most groups are mixtures of ENA, ASI and also often ANE.

Onur said...

So most of the ASI you estimated for Turks is related to the East Eurasian components in ADMIXTURE analyses rather than to the South Asian components, right?

Davidski said...

That's right. The Turkish ASI in this test doesn't come from South India but from East Central Asia.

Onur said...

That's right. The Turkish ASI in this test doesn't come from South India but from East Central Asia.

That is the opposite of the situation in Iranians and Kurds.

Davidski said...

Right, because the so called ASI in Iranians is from South Asian tribals, and that in Turks from the Altai...or maybe even Japan?

Davidski said...

Gui,

Your ANE score in this test is 11.9%.

Matt said...

Woah, re: Malayans % of ASI, it'll be interesting to see the tree dendrogram of these components when the full test is done. An ASI component that peaks in Southeast Asia, is present in both India and sub-Altaic Northeast Asia and seems like it might fall as the outgroup to an East Asian component, ANE and WHG seems unusual (some echoes of Cavalli-Sforza's weird dendrograms showing a split with West Eurasians and Northeast Asians on one side and Southeast Asians and Australasians on the other).

AP said...

Aren't the Malayans from southern India?

Davidski said...

They're from Singapore.

Gui S said...

Hey, thanks for the results! Nothing unexpected. It also indicates that the excess Baltic and Eastern Euro I usually get in tests at higher Ks is not related to excess ANE.

I'll make a map very soon.

Just to clear up, the Malayans are actually not Singaporean Malays. Singaporean Malays are tagged either as MSI or Malays. Malayans are a South Indian group. http://en.wikipedia.org/wiki/Malayan_tribe

Davidski said...

OK, I see. Well, this ASI component peaks among the Malays from Singapore at 59%. But maybe it would reach almost 100% among aboriginal samples from Malaysia, Indonesia and the Philippines?

Gill said...

That might explain why South Indians are randomly getting above noise levels of Mediterranean or extra basal-type Caucasian (Abkhazian/Georgian-like or Bedouin/SW-Asian-like) in other calculators. But that corresponds to actual results in 23andMe's Countries of Ancestry: South/Southeast Indians with large segment matches with Irish people. I guess just coincidence?

barakobama said...

Davidski,

Should the test be ready by next Friday? What type of scores are ANF and WEF getting? Does ANF seem to pretty much be basal Eurasian?

Davidski said...

It'll be ready today, but I don't know when GEDmatch will get it working. If they're taking ages then I'll post the calc files online.

The ANE component is actually just called ANE, and very similar to the ANE in the spreadsheet above.

Davidski said...

Oh wait, you were asking about the farmer component. Yeah, it'll be very close to Basal Eurasian (ie. around 44% of EEF).

Gill said...

David, can you just post the calc files online? GEDmatch only allows 5 users at a time to run Admixture, it'll get pretty busy once it goes online. I have almost a dozen kits I can check on my computer at home with the calc files to spare them the trouble of using GEDmatch.

Davidski said...

OK, I'll post the calc files at my other blog when I finish the write up for this test.

Daniel Szelkey said...

I really need an ANE spreadsheet for various American tribes.

Davidski said...

Gui,

Actually, I'll put together some more stats for India and the Americas today for that spatial map, because the current spreadsheet is lacking them.

Davidski said...

Here are more results for populations from South India, Siberia and the Americas.

https://docs.google.com/spreadsheets/d/1JDiMSWEiVUgBttmxudWS3jJbJh6KnJlRkyour6bhLps/edit?usp=sharing