Friday, August 14, 2015

Testing for genetic continuity in Poland from the Bronze Age to the present

The recent Allentoft et al. paper on the ancient genomics of Eurasia featured an Early Bronze Age Corded Ware/proto-Unetice individual belonging to Y-haplogroup R1a. His remains came from a kurgan burial in present-day Greater Poland, or Wielkopolska, known as one of the four Pyramids of Wielkopolska.

Of course, R1a is by far the most common Y-haplogroup in Poland today, and Greater Poland is generally accepted to be the cradle of the Polish nation.

It's tempting to think that all of this isn't just a happy coincidence, and that this kurgan man and/or his close relatives are the ancestors of modern-day Poles. Considering that we have some of his genome, can we actually test this hypothesis?

Unfortunately, his sequence by itself is too limited to allow such a high resolution analysis. However, the Allentoft et al. dataset includes six other Bronze Age samples from Poland: one other Corded Ware individual from Greater Poland, and five Unetice individuals from Silesia. Thus, it's possible to combine these samples and at least run a preliminary analysis comparing them to present-day Europeans, including Poles, to test their affinities.

The only reliable way to do this is to use formal statistics, and specifically D-statistics. That's because, unlike model-based analyses, D-stats ignore recent genetic drift and, unlike f3-stats, they're able to discriminate correctly at a very fine scale between samples with somewhat different numbers of markers. Below are two sets of results of the form D(Outgroup, PopulationTest) (Population1, Population2).
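For anyone who hasn't met D-stats before, here's a minimal sketch of the general idea in Python. It's not the code behind the results in this post. It assumes the standard allele-frequency form of the statistic, with one frequency per marker for each of the four populations, and the function names and the equal-sized blocks are my own simplifications. Real tools run a weighted jackknife over genomic blocks. With the populations ordered as D(Outgroup, PopulationTest) (Population1, Population2), a positive value under this convention means that the test population shares more alleles with Population2 than with Population1.

```python
import math


def d_statistic(w, x, y, z):
    """D(W, X)(Y, Z) from per-marker allele frequencies (values in 0..1).

    Numerator:   sum of (w - x) * (y - z)
    Denominator: sum of (w + x - 2wx) * (y + z - 2yz)
    """
    num = 0.0
    den = 0.0
    for wi, xi, yi, zi in zip(w, x, y, z):
        num += (wi - xi) * (yi - zi)
        den += (wi + xi - 2 * wi * xi) * (yi + zi - 2 * yi * zi)
    return num / den if den else float("nan")


def jackknife_se(w, x, y, z, n_blocks=10):
    """Delete-one-block jackknife standard error of D.

    Assumption: markers are already in genomic order and are split into
    n_blocks contiguous, roughly equal blocks (a simplification).
    """
    n = len(w)
    size = n // n_blocks
    if size == 0:
        raise ValueError("need at least one marker per block")
    estimates = []
    for b in range(n_blocks):
        lo = b * size
        hi = n if b == n_blocks - 1 else (b + 1) * size
        keep = [i for i in range(n) if not lo <= i < hi]
        estimates.append(
            d_statistic(
                [w[i] for i in keep],
                [x[i] for i in keep],
                [y[i] for i in keep],
                [z[i] for i in keep],
            )
        )
    mean = sum(estimates) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((e - mean) ** 2 for e in estimates)
    return math.sqrt(var)
```

Dividing D by its jackknife standard error gives a Z-score, which is the usual way to judge whether a given D is significantly different from zero.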

D(Ju_hoan_North, Poland_Bronze_Age) (BedouinB, European)

D(Ju_hoan_North, Poland_Bronze_Age) (Polish, European)

Basically, what the results show is that western Poland was inhabited by a very northern people during the Bronze Age. They were similar to present-day Balts, Scandinavians, Irish, and Poles.

Indeed, in these sorts of tests small Northern European countries tend to get the best scores with most prehistoric Europeans. I believe that this isn't just because of shared ancestry, but also because of their relative isolation and homogeneity. So the fact that Poland is the only really big country at the top of the list above might be very important.

That's pretty much it for now. As far as I can see, there's nothing to suggest that present-day Poles can't be the direct descendants of these ancients. But as I said, this is a preliminary analysis and a work in progress. I'll revisit this issue when more samples come in. By the way, I also ran a bunch of other D-stats that might be of interest.

D(Ju_hoan_North, Bell_Beaker) (BedouinB, European)

D(Ju_hoan_North, Corded_Ware) (BedouinB, European)

D(Ju_hoan_North, EHG) (BedouinB, European)

D(Ju_hoan_North, Hungary_BA) (BedouinB, European)

D(Ju_hoan_North, Loschbour) (BedouinB, European)

D(Ju_hoan_North, Motala_HG) (BedouinB, European)

D(Ju_hoan_North, Stuttgart) (BedouinB, European)

D(Ju_hoan_North, Unetice_EBA) (BedouinB, European)

D(Ju_hoan_North, Yamnaya) (BedouinB, European)

It's useful to plot D-stats against each other when looking for patterns in the data. For instance, in the graphs below Basques and southern French often look like obvious outliers. What this means is that there's something peculiar about their genetic history. What might that be, I wonder? Any suggestions?
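If you'd rather flag outliers numerically than by eye, one simple option is to fit a straight line through two sets of D-stats and look for populations with unusually large residuals. The sketch below is just one way to do that, not how the graphs in this post were made. The function name, the dictionary inputs keyed by population label, and the threshold of two residual standard deviations are all my own assumptions.

```python
import math


def residual_outliers(d_x, d_y, threshold=2.0):
    """Flag populations that sit far from the least-squares line of d_y on d_x.

    d_x, d_y: dicts mapping population label -> D value from two different tests.
    Returns {population: residual} where |residual| > threshold * residual SD.
    """
    pops = sorted(set(d_x) & set(d_y))
    n = len(pops)
    if n < 3:
        raise ValueError("need at least three shared populations")
    xs = [d_x[p] for p in pops]
    ys = [d_y[p] for p in pops]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = {p: d_y[p] - (intercept + slope * d_x[p]) for p in pops}
    # Residual standard deviation with n - 2 degrees of freedom
    sd = math.sqrt(sum(r * r for r in residuals.values()) / (n - 2))
    if sd == 0:
        return {}
    return {p: r for p, r in residuals.items() if abs(r) > threshold * sd}
```

Keep in mind that a single strong outlier pulls the fitted line towards itself, so with only a handful of populations it's worth checking the flagged labels against the plot as well.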

The present-day Polish samples, eleven in all, came from here. Most of the other samples are from the Allentoft et al. (Rise Project), Haak et al. and Lazaridis et al. datasets, all of which are publicly available.

