Wednesday, January 31, 2018

Modern-day Poles vs Bronze Age peoples of the East Baltic

Below are three of my staple Principal Component Analyses (PCA) featuring Baltic Bronze Age (Baltic_BA) samples from the recent Mittnik et al. 2018 paper (open access here). On each of the plots I've also highlighted modern-day Balts and Poles. The latter two PCA also include most of the other ancients from the said paper (listed here). They're not highlighted, but all of the relevant datasheets are available here, here and here, and easy to plot with the Past software.

No doubt, these Bronze Age peoples of the East Baltic, and in particular the four individuals from Turlojiske, Lithuania, are very closely related to modern-day Balts and northern Slavs. They may well be our ancestors, or at least close relatives thereof. This is argued and demonstrated well enough by Mittnik et al., and it clearly shows in my PCA, especially the first one, which is designed to focus on ethno-linguistic-specific genetic drift in Northern Europe.

Nevertheless, overall, they do clearly show a higher cut of indigenous European Hunter-Gatherer ancestry relative to modern-day Northeast Europeans (note how in the second PCA the Baltic_BA samples pull towards the European Hunter-Gatherers compared to Balts and especially Poles). I'm not exactly sure what the explanation is for this yet. Indeed, there might be several different explanations. But generally speaking, it's probably in large part the result of post-Bronze Age gene flow into the Baltic region from Central Europe.

See also...

Early Baltic Corded Ware form a genetic clade with Yamnaya, but...

The genetic history of Northern Europe (or rather the South Baltic)

Genetic and linguistic structure across space and time in Northern Europe


Matt said...

(Comment based on the West Eurasia datasheet from the other post)

@All, for anyone who wants to check out plots of Davidski's PCA that includes the new Northern Europe samples, I plotted his datasheet in PAST3; see here:

Really check out the structure in Dimension 6, as there's some very nice structure there involving these new samples:

Few plots using the ancients only subset: (again check out Dimension 6).

Simple NJ dendrogram over all dimensions - (great structure, with NE Europeans clearly sitting with Baltic_BA, though less close on the dendrogram to Ukrainian and Russian HG. Baltic_BA+Hungary_BA may be a fairly nice model. Also mostly a split in structure between Narva and WHG. Narva don't sit closest to East Europe on this dendrogram, because Narva connects to WHG, which connects to Blatterhole, which connects to Iberia_MN - bridge effect).

@Davidski, nice PCA! Only point I wanted to make is it seems most of the dimensions are dominated by Levant Neolithic sample I1701. That's dimensions PC4, but especially PC7, PC8, PC9, where I1701 has an outlying position and there's nearly no scatter among the others.

When I put the samples in a distance dendrogram over all the dimensions, I1701 looks messed up - (insanely long self branch!).

To a lesser extent, there's some funny stuff up with the other Levant samples, but looks less important.

The informative dimensions are really 1, 2, 3 and 6.

Presuming this is a statistical artifact, might be worth rerunning without I1701 (unless there's anything obvious that's up with the sample that can be fixed?). Then you can fit more information in the first 10 PCA dimensions.

(Re: the .dat file, there's also something up with the new samples having 10 dimensions and other samples having 9?).

I removed I1701 and tried to reprocess the PCA data through the PCA function in PAST3:

(which restructures things a bit and pushes down the relevant dimensions to the first 4).

Davidski said...


Yeah, I ran 10 PCs instead of 9. So just take out the 10th column.

Also, I1701 was projected, so just remove this sample and plot the PCA again.
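
A minimal sketch of the cleanup described above, assuming each datasheet row is a sample label followed by its PC coordinates. The toy rows, the `clean_datasheet` helper and the row layout are illustrative assumptions, not the actual file; only the sample ID I1701 and the 9-versus-10 column mismatch come from the thread.

```python
# Toy rows standing in for a PCA datasheet: a label followed by PC coordinates.
# All coordinate values and the "sample_a"/"sample_b" labels are made up.
rows = [
    ["Baltic_BA:sample_a", 0.011, -0.020, 0.003, 0.001, 0.0, 0.002, 0.0, 0.0, 0.001, 0.004],
    ["Levant_N:I1701", 0.090, 0.050, 0.010, 0.300, 0.0, 0.001, 0.4, 0.5, 0.600, 0.000],
    ["Polish:sample_b", 0.009, -0.018, 0.002, 0.000, 0.0, 0.003, 0.0, 0.0, 0.000],
]


def clean_datasheet(rows, drop_sample="I1701", n_dims=9):
    """Drop the projected sample and keep only the first n_dims coordinates."""
    cleaned = []
    for label, *coords in rows:
        if drop_sample in label:
            continue  # skip the projected outlier entirely
        cleaned.append([label] + coords[:n_dims])  # trim the extra 10th column
    return cleaned


cleaned = clean_datasheet(rows)
for row in cleaned:
    print(row[0], len(row) - 1)  # label and number of dimensions kept
```

The cleaned rows can then be saved and re-plotted, or fed to a fresh PCA, so that every sample has the same number of dimensions.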

Arza said...

But generally speaking, it's probably in large part the result of post-Bronze Age gene flow into the Baltic region from Central Europe.

Well said.
Vector of this shift will point at the source population.
Interestingly it won't be Trzciniec (autosomally speaking).

Why Early Proto-Slavs?

Because to form Proto-Slavs proper we need at least one more major genetic admixture event. We need a large input from Hungary_BA(-like) population.


Halberstadt_LBA:I0099 69.8
Hungary_BA:I1504 30.2
distance % = 0.1343
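
For readers unfamiliar with this kind of output, here is a rough sketch of what a two-source fit amounts to: find the mixing weight that brings a weighted average of two source coordinates closest to the target, then report the leftover distance. This is not nMonte itself, and it does not reproduce how nMonte scales its "distance %" figure; the function name and the toy 2D coordinates are assumptions made for illustration.

```python
import math


def fit_two_way(target, source_a, source_b):
    """Return (weight_of_a, distance) for the best mix w*a + (1-w)*b."""
    diff = [a - b for a, b in zip(source_a, source_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("the two sources have identical coordinates")
    # Project the target onto the line through the two sources.
    w = sum((t - b) * d for t, b, d in zip(target, source_b, diff)) / denom
    w = min(1.0, max(0.0, w))  # weights must stay between 0 and 1
    mix = [w * a + (1 - w) * b for a, b in zip(source_a, source_b)]
    return w, math.dist(target, mix)


# Hypothetical coordinates, chosen only to make the arithmetic easy to follow.
weight_a, distance = fit_two_way([0.3, 0.4], [0.0, 0.0], [1.0, 0.0])
print(f"source A {weight_a * 100:.1f}%, source B {(1 - weight_a) * 100:.1f}%")
print(f"distance = {distance:.4f}")
```

A smaller distance means the mix sits closer to the target, but a good fit on its own does not show that the sources are the true ancestors.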

This Czech-, Slovak-, Slovenian-like Proto-Slavic population expanded and with this Hungary_BA shifted population people from the Baltic_BA cluster started to mix and formed a genetic continuum that resulted in a genetic and linguistic gradient between Latvia and the Moravian Gate (West Balts as a third "branch" of "Balto-Slavic", intermediate between Balts and Slavs).

This genetic gradient without a sign of a large-scale migration is visible here:

Halberstadt_LBA:I0099 44.5
Latvian 33.6
Hungary_BA:I1504 21.9
distance % = 0.1451

Latvian 82.05
Halberstadt_LBA:I0099 9.7
Hungary_BA:I1504 8.25
distance % = 0.3046

When and where did this shift from Trzciniec CWC-like Early Proto-Slavs to Hungary_BA-shifted Proto-Slavs happen?

Clearly in the Lusatian Culture. This culture evolved out of the western part of Trzciniec and it interacted with various Bronze Age cultures incoming from the South. It's confirmed by archaeology that "Bronze Age Hungarians" were settling even north of Carpathians.

Of course there is also very strong and very specific genetic signal linking BR2/I1504 with Balto-Slavs and especially Poles (Cassidy, 2016).

Additionally, to represent CWC-like input, the models above use a sample from the western fringes of the Lusatian Culture. PL_N17 seems to be "too eastern".

Davidski said...


By the way, what do you think of those early Baltic Corded Ware? Identical to Yamnaya or not quite?

Matt said...

@Davidski, ah, yeah, looking at where the moderns plot, the other dimensions there (other than 1, 2, 3, 6) do actually make sense as structure among moderns, looks like there's just ginormous projection on to some of ancient samples from the same group on the other dimensions (but not 1, 2, 3 and 6 for some reason)... (Compared to the "cleanness" of the Ancient 67 PCAs!).

With some experimenting with nMonte, nMonte3 seems to have a bit of a difficult time localizing ancestry on this datasheet in quite as precise a way as I'd expect. NE Europe gets appreciable amounts of Baltic_BA when given all ancient populations as input, but there's more difficulty for populations where there is a wider range of plausible LNBA Europe and Steppe MLBA candidates in Central and West Europe (e.g. NW Europeans, even Czech to a degree). Fits -

I guess projection, plus sheer volume of ancient samples in this datasheet, makes it more difficult to select most expected ancestors compared to the Ancient 67 datasheet, so people will have to pick and test sensible models rather than just put all the ancients in a hat and see what nMonte3 finds.

Early CWC from Baltic look distinct to me, mostly like Steppe_EMBA, mainly some extra HG related admixture (Baltic / Ukraine?) and maybe some low levels of Anatolian farmer related ancestry? At least more distinct than Afanasievo, Poltavka, Yamnaya_Kalmykia and Yamnaya_Samara do from each other.

With nMonte3

CWC_Baltic_early:Gyvakarai1 -
Steppe EMBA / Steppe CA: 52.8 (Afanasievo,16.8, Yamnaya_Samara,15.2, Poltavka,12.4, Yamnaya_Kalmykia,4, Samara_Eneolithic,4.4),
Extra SHG / EHG / ANE : 17.1 (EHG, 10.6, AfontovaGora3,2.4, SHG 2, Latvia_MN2,1)
Other Euro HG: 12.4 (Ofnet,3.6, Continenza,2.6, Bockstein,2.4, Ukraine_HG ,1.4, Narva_Estonia,1, Narva_Lithuania,0.6, Koros_HG,0.4, Rochedane,0.4)
Early European farmer: 8 (Levant_N,5.4, Boncuklu_N,1.2, Tisza_LN,0.8, Iberia_Chalcolithic,0.4, LBKT_MN,0.2)
Extra CHG / Iran_CA : 6.6 (Kotias: 5.2, Satsurblia,3.8, Iran_Chalcolithic,1.4)
Other: 0.4 (Armenia_Chalcolithic,0.2, Iran_N,0.2)

Steppe_EMBA / Steppe CA: 60.8 (Yamnaya_Samara,34, Poltavka,15.4, Yamnaya_Kalmykia,11.4)
Extra EHG / SHG / ANE: 31.2 (EHG,19.2, SHG,11.8, Ukraine_N1,0.2)
Early European Farmer: 7.2 (Portugal_LN,4.8, Levant_N,2.4)
Extra CHG: Kotias,0.8

Not to be taken literally, but looks like extra HG ancestry mainly?

rozenblatt said...

@Davidski Would it make sense to add Sunghir 6 to the plot? Or this sample is too late to be interesting?

Davidski said...


Yes, I think the early Baltic Corded Ware are being pulled west of Yamnaya a bit, probably due to admixture from eastern Ukrainian HGs.


Sunghir 6 is late and also highly contaminated.

Samuel Andrews said...

So Baltic BA were definitely early Balts or Balto-Slav something. I can see it in the mtDNA as well. Looking at WHG levels in Slavs, it would make sense that early Balto Slav communities had Narva admixture and therefore lived pretty far north.


Looks like Baltic Corded Ware is Yamnaya plus some UkraineHG. Interesting. This could have implications on the precise Steppe ancestry percentages in some modern eastern Europeans.

Samuel Andrews said...

In this PCA early Baltic Corded Ware comes out basically pure Yamnaya with some WHG admixture. However, it is important to remember HG admixture is exaggerated and EEF admixture underestimated. For example, Baltic BA and modern Balts come out almost as a two-way mixture between Yamnaya and LatviaHG.

Rob said...

With scaled dimensions and removing hunter-gatherers,

Baltic CWC:
"Yamnaya_Samara" 62.6
"Germany_MN" 37.4

Baltic LBA-IA (Trzciniec):

"Blatterhole_MN" 50.05
"Yamnaya_Samara" 49.95

So a gradual shift toward EEF, due to 'contacts', the exact specifics will come down to uniparentals.

Chad said...

Further complicating the picture from BA to modern will be the Siberian admixture. Some graphs including Finns or Saami might be interesting.

Matt said...

I checked out the Northern European specific PCA and merged in with the Welzin BA samples, few plots:

@David, do any of the CWC Baltic or CWC Baltic Early samples work on the Northern Europe specific PCA? I'd be interested to know if they just overlap with the same positions that the Steppe_EMBA / Steppe_MLBA / Corded Ware Germany have or whether they are distinct or fill in some of the "hole" in the plots I've posted between Steppe and Baltic BA? Or if they just overlap with NE Europe generally.

Matt said...

@Sam, I think you could well be right on the Euro HG being slightly compressed close on the West Eurasia PCA, though I haven't checked it out in depth.

Davidski said...


David, do any of the CWC Baltic or CWC Baltic Early samples work on the Northern Europe specific PCA?

I'll check.

Samuel Andrews said...

This is what Global10 gives....

Modern Lithuanian/Latvian
MN Farmer-30%
Narva HG-27%

Baltic BA-Kivutkalns
MN Farmer-17%
Narva HG-45%

Baltic BA-Turlojiske
MN Farmer-23%
Narva HG-29%

Baltic CWC-Plinkaigalis242
MN Farmer-3%
Narva HG-14%

Baltic CWC-Gyvakarai1
MN Farmer-13%
Comb Ceramic-14%

Baltic CWC-Plinkaigalis241
MN Farmer-32%
Narva HG-5%

Baltic CWC-Kunila2
MN Farmer-30%
WHG misc-3%

Samuel Andrews said...

Interesting stuff. A unique process happened in the east Baltic. HGs make up an important part of the formula that makes up Balts, unlike in northwestern Europe where it is just MN farmer+Steppe.

Don, don, don....

MN farmer-13%
Narva HG-31%

MN Farmer-29%
Narva HG-24%

Samuel Andrews said...

Modern similarities hide old distinctions. A single always gradually becomes more similar. The Baltic used to be even more of a WHG-rich outlier in the Bronze Age. Mycenaeans and BalticBA are more different than modern Balts and Greeks. It'd make no sense to put both Mycenaeans and BalticBA in the same Bronze Age European genetic category but sort of makes sense to put modern Balts and Greeks in the same European category.

Point is, although Finns look like Balts that similarity could have been created by a different process. They may not just be a recent import from mainland Europe with some Siberian admixture. Maybe SHGs & EHGs (outside of Yamnaya) made a contribution to Saami and Finns. Maybe Baltic Narva is being confused with HGs from Fennoscandia.

Simon_W said...

Makes sense that Baltic_BA is ethnically Baltic. These samples are already that late in absolute terms, and where else should the Balts have come from?

Also good to see that the odd position of RISE598 here was erroneous and apparently caused by the low coverage.

This map shows the Baltic tribes around 1200 AD, i.e. long after the Bronze Age, but at least we can say that the Turlojiske samples fall into the area that was Sudovian in Medieval times:

Matt said...

@Sam, I'm not sure it's too likely Finns could have been from a *wholly* different process, but have definitely had a distinct history since something like the early Bronze Age, I'd say. The population history looks quite different from others and like they've been at a small population size for a long time: - Using their IBD-based method IBDNe (Box 1), Browning and Browning [32] considered population growth in two European populations, 5,200 individuals from the UK and 5,402 Finnish individuals. While all the above results have been based on sequencing data in order to capture the statistics (mostly the SFS) with little ascertainment bias, IBD-based methods, such as in this study, can be based on genotype data. They estimated population size for each generation during the last 50 generations (∼1,250 years) and showed that the UK population had grown from ∼0.1 million to 27 (21–34) million individuals during this time. Similarly, the Finnish population grew from about 3,000 individuals to 0.38 (0.33–0.46) million, with the growth occurring mostly during the last 15 generations (375 years) [32], consistent with the re-population of Northern Finland about 300–500 years ago.

Or from Iain Mathieson's thesis "Genes in Space" (I couldn't find the thesis online, so tried to upload it to Scribd for posterity, as it's a good one to look at for anyone interested in population genetics): very restricted population size until recently.
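
The figures in the quoted passage above can be checked with simple arithmetic. The sketch below assumes 25 years per generation, which is what "50 generations (∼1,250 years)" implies, and uses only the point estimates given in the quote; the function names are made up for illustration.

```python
YEARS_PER_GENERATION = 25  # implied by 50 generations ~ 1,250 years in the quote


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a number of generations to calendar years."""
    return generations * years_per_generation


def fold_growth(start_size, end_size):
    """How many times larger the end population is than the start."""
    return end_size / start_size


# Point estimates from the quoted passage (confidence ranges are ignored here).
uk_fold = fold_growth(0.1e6, 27e6)
finnish_fold = fold_growth(3_000, 0.38e6)

print(generations_to_years(50), generations_to_years(15))
print(round(uk_fold), round(finnish_fold))
```

On these numbers both populations grew by more than a hundredfold, but the Finnish starting size is far smaller in absolute terms, which is the point being made about a long period at small population size.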

Simon_W said...

As a grandson of an East Prussian German I'm of course excited to have now, in the Turlojiske samples, a reasonable proxy for the old Prussians. So I quickly calculated the average of the two Turlojiske individuals in the Global10 data sheet and used it in nMonte to model my paternal grandmother's coordinates.

But as her position in the West Eurasian PCA shows, she deviates from the German average not into a properly Baltic direction, but towards something in between Baltic and BR1/Hungary_BA:I1502, see P_grandma here:

And indeed, according to nMonte she takes a lot of Hungary_BA:I1502 as well, besides the substantial Turlojiske-like admixture:

"Hungary_BA:I1502" 34.3
"Turlojiske:average" 30.3
"England_Anglo-Saxon" 17.7
"Halberstadt_LBA" 17.7
"Dutch" 0
"Nordic_IA" 0

The explanation of this isn't very obvious. I can just speculate that both her German and her old Prussian ancestors may have absorbed a lot of the BR1-like old South Baltic population exemplified by the Tollense valley samples.
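
The averaging step described above (taking the mean of the two Turlojiske individuals' coordinates and using it as a single source) could be sketched as follows. The coordinate values and the `average_coordinates` helper are hypothetical; real Global10 rows have ten dimensions rather than the three shown here.

```python
def average_coordinates(samples):
    """Return the per-dimension mean of equally long coordinate lists."""
    if not samples:
        raise ValueError("need at least one sample")
    n_dims = len(samples[0])
    if any(len(s) != n_dims for s in samples):
        raise ValueError("all samples must have the same number of dimensions")
    return [sum(s[i] for s in samples) / len(samples) for i in range(n_dims)]


# Made-up coordinates standing in for two individuals from the same site.
individual_1 = [0.02, -0.04, 0.010]
individual_2 = [0.04, -0.02, 0.030]

site_average = average_coordinates([individual_1, individual_2])
print("Turlojiske:average", site_average)
```

Averaging smooths over differences between individuals, so it is worth checking that the samples being averaged actually plot close together first.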

Simon_W said...

A thought just occurred to me about the north European PCA and David's ethnic interpretation thereof: Basically, what the PCA shows is the ancient continuum of the genetic landscape of the northern half of Europe. And now we subdivide it according to the *modern* ethnic affiliations of the populations included. As an inevitable logical consequence modern south Germans for example end up in the Germanic cluster, because they are ethnically Germanic. But I'm pretty sure they are not predominantly Germanic by descent. Just as holds true for the modern English, the majority of the ancestry of modern South Germans is derived from the pre-Germanic population which would be Celtic. This can be shown by modelling their Global10 data in nMonte, for example, though there are other ways to prove it. And so, while most of the few ancient samples with known ethnic affiliation so far fall into the correct clusters, it's not a safe inference to conclude from the PCA position of an ancient sample with controversial ethnic affiliation to its ethnic identity. For example, I predict that Iron Age samples from Austria would fall into the Germanic cluster. Yet we know sure as hell that they were not Germanic, they were Celtic Norici. Likewise it would be highly dubious to conclude from the position of the Tollense warriors that they had a Slavic-related language and ethnic affiliation. What's known is just that their DNA and the DNA of related tribes went into populations that in modern times are Slavic. This may be hard to accept for some people, but I for instance have no problem in admitting that the largest part of my ancestry is La Tène Celtic, I'm saying this proudly, even though my native language is German.

Onur Dincer said...

@Simon W

How do you explain the inflation or appearance of Y-DNA N haplogroups among Baltic speakers sometime after the Bronze Age (apparently with very little influence on autosomes)?

I would also like to hear Kristiina's opinion on this issue.

Matt said...

@Simon_W, I think it's a bit of a strong read of Davidski's position (though he can speak for himself of course!) that he has suggested that this is measuring drift that has only occurred post-origin of the Baltic-Slavic languages, and was restricted to populations of speakers of Baltic-Slavic languages, and that any ancient sample which displays this drift likely spoke an Indo-European form which was ancestral to those.

I think we're away from the idea that, say, the early Bell Beakers necessarily spoke forms of the early IE dialect continuum that were ancestral to later Celtic / Italic / Germanic, just because they seem to share some drift with present day and almost certain historical speakers of those languages.

Good comment though nonetheless, more of a thing of I don't think it's accurate to be posing this as an idea in opposition to others, than that I disagree with what you are saying.

Simon_W said...

@ Onur

Good point! Before these results were released I would have guessed that the N comes either from the Narva culture or from the Comb Ceramics culture (CCC). But apparently the Narva culture was dominated by other haplogroups. There isn't a lot of yDNA from the CCC available, but anyway, the numerous Baltic Bronze Age samples don't have a single N among them.

A possible scenario to explain this, IMHO: The Baltic languages are divided into West Baltic and East Baltic. However, West Baltic is extinct, and the two surviving modern Baltic languages belong to the East Baltic branch. If you look at the map I linked: Latvian comes from the Latgalians, and Lithuanian from the Lithuanians, and these were the East Baltic peoples. All the other peoples were West Baltic, and these were assimilated - by the Eastern Balts, and partly by the Germans, Poles and Belarusians. So apparently there was an east-west migration among the Balts that post-dated the Bronze Age. And the two Bronze Age sampling locations in this paper are rather western. So maybe N was more common among Bronze Age eastern Balts.

Simon_W said...

@ Matt

Fair enough. I also found it eye-catching how far from each other the two Bohemian Slavs plot, it's amazing if we bear in mind what a small place Bohemia is. I suspect one has more Proto-Slavic ancestry than the other one. And on top of that I'd bet that a Bohemian Celt from the La Tène culture of Iron Age Bohemia wouldn't plot with the Irish nor with the French; I'd expect them to have been a bit more eastern, but by how much I don't know.

Onur Dincer said...

Thank you, Simon! I would also like to hear Kristiina's opinion on this issue since she has a pretty good grasp of N subclades.

RKV said...

Matt, I'd like to speak to the Bohemian side. I and my several Czech cousins are R-YP1700 (under Z92). So 3000 years ago our ancestors were in southern Lithuania (Turlojiske 3 may well be a direct ancestor of ours). Given the paper trail genealogy, we’ve been south of Plzen for at least the last 400 years. Clearly Slavs went West. We're the living proof.

Queequeg said...

Re paternal N and Balts: TMRCA of the whole "Baltic" N-M2783 is just 2700 years and if I'm right, the only N-M2783* has so far been found in Finland, also the preceding mutation levels point to Northern Baltics, if not eastwards. So, I'd say that this lineage came to Baltic area in the Bronze Age, somewhere from the Volga area, speaking Uralic, see here:

Simon_W said...

As a grandson of an East Prussian German I also have to admit that the subjugation of the Prussian Balts by the knights of the Teutonic Order was neither fair nor easygoing. It was ruthless and brutal, and the Old Prussians had bad luck not being at the height of the technical developments of the time. So the new German tribe that was born out of the fusion of Baltic Prussians and German settlers was born out of tears, so to say. You can see that for yourselves in this Lithuanian historical movie about the great Prussian uprising led by Hercus Monte:
As far as I know it's the only historical movie dealing with the old Prussians.

weure said...

@Davidski it would be nice if you could confirm (or refute) the following case: modern NW Europeans vs. Polish-Silesian Unetice/Bronze Age people.

Rise 150 Unetice, Przeclawice, Poland, F999948 (gedmatch),
has a quite NW European profile, in Eurogenes K15 he scores:

Using 1 population approximation:
1 Danish @ 7.337605
2 Norwegian @ 8.207634
3 North_Dutch @ 8.347853
4 West_Norwegian @ 8.393320
5 West_Scottish @ 9.097486
6 Swedish @ 9.154801
7 Orcadian @ 9.568080
8 North_German @ 9.797962
9 Irish @ 10.019912
10 North_Swedish @ 10.076278
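
For anyone wondering how a ranking like this can be produced, here is a sketch that assumes the score is a plain Euclidean distance between the sample's admixture percentages and each reference population's averages. That is an assumption about the calculator, not a confirmed description of it, and the population names, component order and values below are all made up.

```python
import math

# Hypothetical admixture vectors (percent per component); not real K15 data.
reference_populations = {
    "Pop_A": [40.0, 30.0, 20.0, 10.0],
    "Pop_B": [35.0, 35.0, 20.0, 10.0],
    "Pop_C": [20.0, 20.0, 30.0, 30.0],
}
sample = [38.0, 32.0, 20.0, 10.0]


def rank_single_population(sample, references):
    """Sort reference populations by Euclidean distance to the sample."""
    distances = [(name, math.dist(sample, vec)) for name, vec in references.items()]
    return sorted(distances, key=lambda item: item[1])


ranking = rank_single_population(sample, reference_populations)
for position, (name, distance) in enumerate(ranking, start=1):
    print(f"{position} {name} @ {distance:.6f}")
```

Under this reading, a lower number after the "@" means a closer overall match in admixture proportions, and says nothing by itself about direct descent.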

Rise 98 Lilla Beddinge Scania, about 2000 BC has a match on 3cM on archaic matches on Gedmatch with Rise150 (on admixtures Rise98 comes close to Unetice and BB Germany).
My mother, who is modern-day North Dutch, also has a 3cM match with Rise 150 in the archaic matches on Gedmatch.

I guess that Unetice spread in (E)BA a bunch of genes to NW Europe/Southern Scandinavia. So Unetice had a severe impact on the (modern) NW European/ Southern Scandinavian populations.