Just recently we saw the discovery of a new haplogroup called R1a1a7, which I boldly christened a Slavic genetic marker (see here), because it appeared to be linked to Slavic expansions around Europe. Now it seems this haplogroup also harbors an STR cluster which could be a signal of a secondary demographic event roughly within the boundaries of modern-day Poland. Peter Gwozdz, the R1a enthusiast who first spotted the haplotype, and suitably named it the "P type", estimates its time of expansion to be less than 1500 years ago. This could mean it was one of the lineages intimately linked with the formation of the first Polish state. He's written a two-part article in the Journal of Genetic Genealogy touching on the subject...
Y-STR Mountains in Haplospace, Part I: Methods
Y-STR Mountains in Haplospace, Part II: Application to Common Polish Clades
It's a complex piece of work, and migraine-inducing if you're not into Y-STR based genealogy. For those interested in Polish ethnogenesis minus all the calculations and stuff, here are a couple of interesting quotes...
If the hypothetical P type is a valid clade, about 8% of Polish men belong to this clade, with confidence interval 6.4% to 9.6%. TMRCA is probably more than the raw ASD age of 1,601 years, perhaps more like 2,000 to 3,000 years ago. This is quite young for such a large clade, so P type must have significantly expanded in population, less than 1,601 years ago, perhaps 1,000 to 1,500 years ago. Because P type is isolated in Polish Y-DNA, it may represent an immigration from elsewhere, or it may represent an older population that almost went extinct. Either way, there seems to have been a small closely related founder population before the expansion. It has not escaped my attention that Poland as a nation appears in written history a little more than 1,000 years ago. It is fascinating to speculate that a rapid expansion of population, including P type, occurred shortly before the appearance of Poland as a nation, although further discussion is beyond the scope of this article.
...
As shown above, it seems P type, from Haplogroup R1a1, went through a rapid population expansion somewhat less than 1,500 years ago in the area that is now Poland. Y type is a type from Haplogroup I1 that is also young, perhaps younger than P type, and also concentrated in Poland. Table 1 provides hints that more Polish types will be identified soon, as more data accumulates in the Polish Project. It makes sense that a population expansion would not be confined to a single type, but might include other types and indeed other haplogroups, according to the population mix in the population experiencing the expansion. It is tempting to anticipate that additional data will point to multiple Polish types, of various sizes, with about the same population expansion time. If the data comes out that way, it will suggest the time of the growth of the Polish nation.
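For anyone wondering what a "raw ASD age" actually is, here's a rough sketch of the usual idea: take the average squared difference (ASD) in STR repeat counts between each sampled haplotype and the assumed founder haplotype, divide by a per-marker mutation rate to get generations, then multiply by years per generation. Everything specific below is made up for illustration: the four-marker haplotypes, the founder values and the 0.002 mutation rate are hypothetical, and this isn't necessarily the exact procedure used in the article.

```python
from statistics import mean


def asd_to_founder(haplotypes, founder):
    """Average squared difference in repeat counts from the founder
    haplotype, averaged over markers and then over samples."""
    per_sample = [
        mean((allele - f) ** 2 for allele, f in zip(hap, founder))
        for hap in haplotypes
    ]
    return mean(per_sample)


def asd_age_years(asd, rate_per_generation, years_per_generation=25):
    """Convert ASD to an age: generations = ASD / rate, then scale to years."""
    return asd / rate_per_generation * years_per_generation


# Hypothetical toy data: four markers, four sampled men.
founder = [13, 25, 16, 10]
haplotypes = [
    [13, 25, 16, 10],  # identical to the founder
    [13, 26, 16, 10],  # one step away at marker 2
    [14, 25, 16, 10],  # one step away at marker 1
    [13, 25, 15, 10],  # one step away at marker 3
]

asd = asd_to_founder(haplotypes, founder)
# 0.002 per marker per generation is an assumed rate, not a figure from the article.
print(asd, asd_age_years(asd, rate_per_generation=0.002))
```

The point to take away is that a tight cluster (low ASD) gives a young age whatever rate you plug in, which is why a large but low-diversity type like P looks like a recent expansion.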
Gwozdz also makes some useful comments about that recent Underhill et al. paper on R1a1a7. My opinion was that the authors overestimated the age of its defining mutation (M458) by about 3 times. However, he's of a somewhat different view.
Underhill reports that the highest coalescent time (age) for R1a1a7 is among Polish, 10,700 years. This calculation uses an average mutation rate 0.00069 per 25 years, which includes a factor of 1/3 to account for the stochastic reduction of variance in slowly growing populations, as demonstrated by Zhivotovsky (2006); see Part I section "Mutation Rates." Zhivotovsky is a co-author of the Underhill paper. Zhivotovsky (2006) shows that for large populations, or for rapid growth, the mutation rate factor approaches one, which is to say that the coalescent time comes out up to 3 times younger. The coalescence time for R1a1a7 including P and N types may well be as much as 10,700 years ago. This is because P type and N type have quite different STR values. But many of the modern carriers of M458 in Poland come from two population expansions that are much more recent, because each type is much less diverse (lower ASD) than the total. Since both are large young types, the simplest explanation is that each grew recently from a small founding population. On the other hand, the two types may represent immigration of two tribes from distinct regions. Or the situation may be more complicated, with any number of immigrations and any number of population expansions, at different times. The important observation from this example: a clade may be composed of one or more daughter clades that are much younger than the parent. Indeed this is no surprise, since M458 is much younger than the parent R1a1a (R1a-M17) clade.
Underhill tentatively identifies M458 as a mutation from the Mesolithic, a reasonable conclusion. However, the corresponding haplogroup may not have grown much at first, or may have grown and then dwindled over the millennia. It appears there was a recent resurgence, from two or more small founding populations. It will be interesting to identify the cultures of those founders.
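To see where the "up to 3 times younger" comes from, here's a quick back-of-the-envelope sketch using only the numbers in the quote above: 10,700 years, a rate of 0.00069 per 25 years, and the 1/3 factor built into that rate. It assumes the simple linear relationship age = variance / rate x 25 years; the function names are my own, and the real calculation in the paper may involve more than this.

```python
EFFECTIVE_RATE = 0.00069     # per marker per 25 years, includes the 1/3 factor
YEARS_PER_GENERATION = 25
REPORTED_AGE_YEARS = 10_700  # coalescent time reported for Poles


def implied_variance(age_years, rate):
    """STR variance that would produce the given age at the given rate
    (assumes the linear model age = variance / rate * 25 years)."""
    return age_years / YEARS_PER_GENERATION * rate


def age_from_variance(variance, rate):
    """Age in years for a given variance and per-generation rate."""
    return variance / rate * YEARS_PER_GENERATION


variance = implied_variance(REPORTED_AGE_YEARS, EFFECTIVE_RATE)

# Undo the 1/3 factor, then re-apply different factors between 1/3 and 1.
unscaled_rate = EFFECTIVE_RATE * 3
for factor in (1 / 3, 1 / 2, 1.0):
    age = age_from_variance(variance, unscaled_rate * factor)
    print(f"factor {factor:.2f}: about {age:,.0f} years")
```

With the factor at one, the same variance works out to roughly a third of 10,700 years, which is the "up to 3 times younger" case described in the quote.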
Peter Gwozdz, Y-STR Mountains in Haplospace, Part I: Methods, Journal of Genetic Genealogy, Volume 5, Number 2, Fall 2009.
Peter Gwozdz, Y-STR Mountains in Haplospace, Part II: Application to Common Polish Clades, Journal of Genetic Genealogy, Volume 5, Number 2, Fall 2009.
Peter A Underhill et al., Separating the post-Glacial coancestry of European and Asian Y chromosomes within haplogroup R1a, European Journal of Human Genetics advance online publication 4 November 2009; doi: 10.1038/ejhg.2009.194
Nature Publishing Group has just published a very interesting article, in the European Journal of Human Genetics, on the discovery of a new type of R1a1a, defined by the M458 marker. The data included in the report firmly put present-day Poland in the driving seat as the place of origin for this lineage, known as R1a1a7. Here's a nice map...

Peter A Underhill et al., Separating the post-Glacial coancestry of European and Asian Y chromosomes within haplogroup R1a, European Journal of Human Genetics advance online publication 4 November 2009; doi: 10.1038/ejhg.2009.194
However, as per above, the authors claim that R1a1a7 has an age of about 10.7KY. This, they say, makes it a signal of migrations carrying agriculture from Central-East Europe to present-day Ukraine and European Russia. Unfortunately, that doesn't make any sense, because M458 is very rare in Scandinavia, which was largely populated from North/Central Europe after the Ice Age. Recent work on the population movements around the Baltic has shown that both R1a1 and I1a moved up from Germany and Poland into Sweden. So why was only one case of M458 discovered up there in this study?

T. Lappalainen et al., Migration Waves to the Baltic Sea Region, Annals of Human Genetics, Volume 72 Issue 3, Pages 337 - 348, doi: 10.1111/j.1469-1809.2007.00429.x
My take on what's happened here is that the authors grossly overestimated the age of M458, by about three times. The real figure is probably somewhere between 3 and 4KY. So it's pretty obvious that what we're dealing with here are the various migrations of Slavs around Central and Eastern Europe, probably starting in the upper Vistula basin. These population movements took place well AFTER previous waves of R1a1 moved north and west from or via present-day Poland.
Based on their inflated age and expansion time estimates for M458, the authors also conclude that it's unlikely there were any major post-Ice Age movements from Eastern Europe to Asia. This implies they trust their own methodology more than the recent results of ancient DNA studies, which clearly showed that European groups carrying R1a1 migrated in a big way to South Siberia during the Chalcolithic and Bronze Age (see here). Indeed, the west to east movements of these Scytho-Siberians were also tracked by a recent cranial study of their remains (here). So well done on finding the new R1a1 marker, but geez, there's something not quite right there with those haplogroup age estimates again. When will that change I wonder?
I was lucky enough to get my hands on these PCAs recently as part of an analysis of my own genome at a university earlier this year. They show most of the HGDP samples and how they relate to each other across various dimensions. I'm the white square labeled 1. First, here's the key...

OK, in this view the Europeans are at the top, North and East Asians and Amerindians to the left, and Sub-Saharan Africans in the bottom right corner. I'm with the Orcadians (purple), while just below me are the Adygei from the Caucasus (aqua), and to my lower left the North Russians from Vologda (green). The streams running from Europe towards the Sub-Saharan Africans are made up of Middle Easterners and North Africans, as well as some South-Central Asians. This probably indicates admixture from south of the Sahara in those stretched out clusters.

On this plot, the Asians and Europeans move down, the Africans up, while the Amerindians veer off to the top left. The Vologda Russians are again just left of the main European cluster, while I'm still with the Orcadians.

Here it becomes clear why the North Russians appeared consistently to the left of the main European cluster on the other plots. Not only do most of them show influence from the southeast, which pulls them towards the South-Central Asians, but they're also all clearly attracted to the North and East Asians. I'm just beyond the Orcadian cluster now, a little closer to the Adygei and the South-Central Asians. But I guess this is in line with geography, with Poland being well southeast of the Orkney Isles.

What I find fascinating about this plot is the way the more northerly Eurasians, from Yakutia to Basque country, are pushed above most of the others. On the left, the Yakuts in green hover just above the East Asians, while in the European zone, the Vologda Russians barely stay within the plot. Interestingly, the French Basques (orange cluster) are positioned just above most of the French (red). I suspect something really ancient is causing this behavior.

Btw, here's the plot I posted here earlier this year. This is what the situation looks like once the Asians, Africans and others are dropped, paving the way for intra-European variation to assert itself.
I've just had a look at the new Relative Finder tool at 23andMe, which is still going through its beta testing phase. Apparently, there's one person in the database who shares a big enough chunk of DNA with me to qualify as a potential 4th cousin. Another 32 share segments large enough to be classed as possible 5th cousins. I have no idea who most of these people are at this stage, because they have to agree to take part in the beta test for me to contact them. But I can see some of their details, and it appears most come from various parts of Northern and Central Europe, while their Y-DNA haplogroups include I1*, R1a1, R1b, I2a2b and N1c1. My list also has many more "distant" cousins (for example, >10th), and at first glance their origins look even more random. It'll be interesting to see how this develops.
Edit: check out the Just What Is A Fifth Cousin? video on the Relative Finder page. Hilarious stuff.
Here's a random pic of a mummified Scythian chieftain from a burial mound in Tuva, Siberia (500 BC). As far as I know, his DNA was used in recent studies to establish the genetic ancestry and pigmentation traits of the ancient Scytho-Siberians. It turned out he carried R1a1, and an autosomal STR profile similar to modern Poles and Russians. For more info see here.

Btw, I have no idea where this photo comes from, only what it shows, so I can't source it. If there's a problem with that, please let me know and I'll take it down.
If well picked, even a small number of SNPs can betray bio-geographic ancestry at the intra-European level, especially if dealing with very closely related groups. This paper proves the point, in a way, pointing out that Lithuanians, Poles and Russians show a lot of kinship even when only 960 cancer susceptibility SNPs are considered. Having said that, some of the other results here see Jews clustering with Amerindians.
Using Bayesian clustering with a set of 960 single nucleotide polymorphisms (SNPs) we found evidence of population stratification in 864 individuals from New Hampshire that can be used to differentiate the population into six distinct genetic subgroups. We then correlated self-reported ancestry of the individuals with the Bayesian clustering results. Finnish and Russian/Polish/Lithuanian ancestries were most notably found to be associated with genetic substructure. The ancestral results were further explained and substantiated using New Hampshire census data from 1870 to 1930 when the largest waves of European immigrants came to the area.
Sloan CD, Andrew AD, Duell EJ, Williams SM, Karagas MR, et al. (2009) Genetic Population Structure Analysis in New Hampshire Reveals Eastern European Ancestry. PLoS ONE 4(9): e6928. doi:10.1371/journal.pone.0006928.
A new article published in Molecular Medicine includes these plots showing genetic substructure within Northern Europe. As far as I know there are 5 Polish Americans included in the sample set labelled EEUR here...

Chao Tian et al., European Population Genetic Substructure: Further Definition of Ancestry Informative Markers for Distinguishing Among Diverse European Ethnic Groups, Mol Med. Published online 2009 August 24. doi: 10.2119/molmed.2009.00094.