Wednesday, June 24, 2015

Improved resolution of E-M215 (aka E3b / E1b1b)

A new paper has appeared focusing on Haplogroup E, mostly on E-M215 and E-M35, with a moderate improvement in resolution over what we previously knew.

Basically, at first glance, the major novelty with respect to E-M215 is that all E-Z830 (x M123) lineages are united under a new mutation dubbed V1515, and that the former solo lineages of E-M35, i.e. E-V92 and E-V6, now have a home and are included within this unification. In addition, the above named unifying mutation, V1515, apparently has a bifurcated structure itself, with one younger branch having the sole representation in the Southern parts of Ethiopia and further South, and the more diverse (hence ancient) branch being represented in the Northern parts of Ethiopia and further North.

New basal haplogroup E mutations were also apparently found.

The paper is open access, and I will analyze it further in the coming days, but I just wanted to plot the Eastern African E-M215 variant frequencies for now.

UPDATE (6/26/15) - Added NAfrica E-M215 frequencies
UPDATE (6/26/15) - Added new mutation rate
The new fossil calibrated mutation rate has been added to the TMRCA Calculator. Unfortunately, 95% CI values have not been given (or at least I could not find where they have been given). In any event, this new mutation rate is a bit slower than the rates derived from the other ancient DNA calibrated sources, specifically ~4% and ~12% slower than Karmin (2015) and Fu (2014) respectively, so its central TMRCA estimates come out correspondingly older.
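
As a rough illustration of what a slower rate means for the dates: under a strict molecular clock, a TMRCA estimate scales inversely with the mutation rate. The sketch below is only that arithmetic. The function name and the 20 kya input are illustrative assumptions, not values taken from the TMRCA Calculator or from any of the papers.

```python
def rescale_tmrca(tmrca_kya, pct_slower):
    """Rescale a TMRCA computed with a reference rate to a rate that is
    `pct_slower` percent slower (strict-clock assumption: age ~ 1 / rate)."""
    if not 0 <= pct_slower < 100:
        raise ValueError("pct_slower must be in [0, 100)")
    return tmrca_kya / (1.0 - pct_slower / 100.0)


# Hypothetical node dated to 20 kya with each reference rate.
for source, pct in [("Karmin (2015)", 4), ("Fu (2014)", 12)]:
    print(f"{source}: 20.0 kya -> {rescale_tmrca(20.0, pct):.1f} kya")
```

Note that the shift in age is slightly larger than the shift in rate: a rate 12% slower multiplies the age by 1/0.88, i.e. roughly 14% older.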

UPDATE (6/27/15) - Comparison with YFull TMRCAs
I have created a table for the TMRCA of the major nodes in E-M215, in order to compare with YFull’s estimates so that we can ‘fill in the gaps’ for the nodes that have not been given estimates in Trombetta (2015). YFull uses a mutation rate that is almost identical to Fu (2014)’s Ust-Ishim calibrated rate, so naturally some of its TMRCAs are closer to the present than the Trombetta estimates, as pointed out above.
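
Even without YFull, an undated node can be bracketed from its neighbours: its TMRCA cannot be older than its nearest dated ancestor, nor younger than its oldest dated descendant. The sketch below applies that logic to the figures quoted in the comments (M35 ~25 kya, V68 ~20 kya, Z830 ~20 kya). The function and variable names are hypothetical, and the tree is a deliberately pruned subset of the phylogeny.

```python
def bracket_undated_nodes(parent_of, dated_kya):
    """Return {node: (lower, upper)} in kya for every node without a date.

    upper = age of the nearest dated ancestor (None if there is none)
    lower = age of the oldest dated descendant (None if there is none)
    """
    children = {}
    for child, parent in parent_of.items():
        children.setdefault(parent, []).append(child)

    def upper_bound(node):
        parent = parent_of.get(node)
        while parent is not None:
            if parent in dated_kya:
                return dated_kya[parent]
            parent = parent_of.get(parent)
        return None

    def lower_bound(node):
        oldest = None
        stack = list(children.get(node, []))
        while stack:
            desc = stack.pop()
            if desc in dated_kya and (oldest is None or dated_kya[desc] > oldest):
                oldest = dated_kya[desc]
            stack.extend(children.get(desc, []))
        return oldest

    nodes = set(parent_of) | set(parent_of.values())
    return {n: (lower_bound(n), upper_bound(n))
            for n in sorted(nodes) if n not in dated_kya}


# Pruned tree: child -> parent. Ages are the ones quoted in the comment thread.
parent_of = {"Z827": "M35", "V68": "M35", "Z830": "Z827", "V257": "Z827"}
dated_kya = {"M35": 25.0, "V68": 20.0, "Z830": 20.0}

brackets = bracket_undated_nodes(parent_of, dated_kya)
print(brackets)
```

With these inputs Z827 is bracketed between 20 and 25 kya, while V257 only gets an upper bound because none of its descendants are dated in this toy tree.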



  1. I don't think this study strengthens an East African origin of E-M35. I'd say it has made an East African origin a lot less likely, actually. This study has not only revealed that East African V42, V6 and M293 (previous studies described these as separate "basal" clades within M35) share a common East African ancestor, which they term E-V1515, but that all East African E-M35* paragroup samples now belong to this lineage.

    Here's the phylogeny to make it easier to follow:

    The most ancient split within M35 is E-V68 vs. the rest. E-V68 and its more successful descendant M78 are probably North African. The next to split off is V257 (ancestor of M81), which just like V68 is mainly North African. V68 and V257 both have some spillover into southern Europe. Finally comes Z830, which is the common ancestor of E-V1515 and E-M123. The origin of the latter lineage is unclear, as it's shared between West Asians and Ethiopians. But M123(xM34) has a notable presence only in Northern and Southern Egyptians, and wasn't found at all in the East African or West Asian samples, pointing once again in the same direction: toward North Africa.

    So if anything, this paper provides a very strong case for a North African origin of E-M35. The only thing East Africa has on the North is a very minor frequency of E-M215(xM35) (ancestor of M35). But E-M215 is very ancient, dated to 39 kya in this study, compared to 25 kya for all of E-M35. So from now on, I will advocate a North African origin for E-M35. I would like to see high resolution Omotic samples, though, it would be very interesting to see what their M35 lineages turn out to be. E-M215 as a whole still looks East African.

    Anyway, there are some interesting results for the East African samples, even within E-V1515, the only decisively East African subclade of M35.

    E-V1515*, the most ancient split within V1515, was only found in the Tigre and Nara of Eritrea (Nara live close to the more demographically significant Tigre), indicating a northern route for the spread of E-V1515, as was suggested by the authors.

    V42 seems Nilo-Saharan-centered, as it's mainly found in the Kunama and Nara, as well as the Beta Israel sample, which also has rather high A3b2.

    The large sample of Eritrean Sahos is almost 100% E, of which around 90% is E-V22, with some minor E-V6 (frequent in their Afar neighbors and linguistic cousins).

    The most common E lineage in Tigrinya speakers is V1785*, at around 30%. V1785 is also the ancestor of E-V6, which is very frequent in Afars and present in Sahos.

    This study has found East African brother clades of SE African ("South Cushitic") E-M293, descended from a common E-V2881 ancestor. Interestingly, Oromos, along with Somalis from Kenya/Ethiopia/Somalia, are the only ones with a high frequency of this lineage, at around 20%. Note that in the Somalia sample, it's assigned to a subclade (V1792). This supports the shared ancestry between East and South Cushites that has been suggested by linguists. Also, the shared high frequency of V32/V2881 lineages in Somalis and Oromos indicates common ancestry, just as linguists have suggested a shared ancestor between Somali and Oromo within East Cushitic.

    Moving onto the non-V1515 lineages... M34 is conspicuously low/absent among the northern Horner samples from Eritrea, including the Tigrinya from Eritrea/Ethiopia. However, it's more frequent farther south, in the Ethiopian highlands. This is not consistent with the old suggestion that M34 is a recent or "Semitic" introduction, which was already made less probable when Plaster found M34 at high frequency in southern Ethiopian Omotics (the main source of M34 in the Ethiopian highlands?). Considering the presence of M123(xM34) in Egyptians mentioned above, it may turn out to be from an ancient migration from North Africa into the heart of the Ethiopian highlands.

    Great study overall, very informative.

    1. That's my post from ABF:

      I would appreciate if you refer to the source (me) before copying my posts.

      And thanks to Ethio Helix for compiling the results.

    2. "The only thing East Africa has on the North is a very minor frequency of E-M215(xM35) (ancestor of M35)"

      False. It's all M281 there, and found also in Yemen and other Arabian countries. Di Cristofaro et al. (2013) instead has a Khorasanian M215*. He didn't test for M281, but I compared this haplotype to M281 ones and they were totally different. Plus there are other unresolved P2*/M215* in Russia (there are two P2+, M35- men in Vladivostok).

    3. @Lank, It was so in-depth I felt like sharing it here too.

    4. The African Scientist's analysis is good, but I am of the opinion that we need more samples from SW Asia/Levant/Arabian Peninsula analysed with NGS. That will probably pull the place of origin of E-M215 and E-M96 itself towards that macro-region

    5. It's not false at all. E-M215 without the M35 mutation (in this study, those samples all belonged to V16/M281) is found in East Africa and not the North, just like I said. This study only reports it in 3 Ethiopians and 1 Yemeni. It's completely absent in the thousands of E-M215 samples in this study from Europe, West Asia (excluding the single Yemeni sample) and North Africa.

      Yemen is not exactly a reservoir of E-M215 diversity, so the most parsimonious explanation is that it's a relatively recent lineage arriving from Africa.

    6. I wonder when we'll get some proper Omotic M215 samples at this resolution. The Wolayta don't count as they have been heavily influenced by Cushites.

      Some years back, a mtDNA study reported a relatively high frequency of mtDNA L6, which was supposedly "rare" in Africa. A few years later, however, high frequencies of mtDNA L6 were found in a couple of remote Ethiopian groups. So let's test the region properly before imagining non-African origins for these erratic lineages in the Middle East.

    7. It's entirely false, since you said that "E-M215(xM35) (ancestor of M35)" is found there, while now you acknowledge the fact that it is all M281+, so no "ancestor of M35" found in northern East Africa. Further, there's another error. Going by Y-STRs, Yemeni M281 is the oldest. Plaster (2011) has 4 M281 samples from Ethiopia, and they are all identical to each other. Since Ethiopia has the highest frequency of this haplogroup, its haplotypes represent the modal. The haplotypes of two Saudi Arabian M281 found in the M35 Phylogeny Project are almost identical to Ethiopian M281 haplotypes, but a Yemeni M281 haplotype in the same project shows even more differences. Therefore, Arabian Peninsula M281 is older than Ethiopian M281. You can see this on your own; the link to the project is here and the link to the Plaster thesis is somewhere here on the blog.

    8. The ancestor I'm referring to is E-M215, obviously not the modern M281 or M35 derivatives. I wasn't trying to say that modern lineages are the ancestors of their own ancient ancestor (dated to 39 kya in this study), lol.

      If you want to put so much stake in 1 Yemeni lineage, then that's up to you. I'll be awaiting more samples, particularly from southern Ethiopian Omotics.

    9. Oh, I put so much stake on 1 Yemeni lineage but you put a lot more on Ethiopian samples only because of your (and of a lot of other people around here) prejudices ("Yemen is not exactly a reservoir of E-M215 diversity")

    10. I now see some of the data I requested was right in front of us. The Maale Omotics from Plaster have 3/69 E-M281, the highest frequency in the world! That's extremely interesting. Clearly, these distinct peoples (Omotics) are of great importance for understanding the origin of E-P2 as a whole (Maale also have a lot of E-M329). I am very curious about their M281 diversity, and how their unresolved M35 lineages relate to the other branches.

      It shows how pointless it is to analyze stray E lineages in Eurasia, without extensive coverage of Y-DNA from different parts of East Africa. It's easy to imagine how e.g. Y-DNA M281 or mtDNA L6 could have reached Yemen, most recently with the slave trade. The minor M281 in Amharas and Oromos may very well derive from Omotic admixture.

    11. I didn't pay much attention at first (too complicated to follow because of nomenclature, scatter of distribution maps, etc.) but, after some correspondence made me consider all the factors, it does seem like E1b-M35 could well have originated rather towards Egypt than towards Ethiopia (but Sudan/Nubia is still a good candidate IMO).

      What I find most interesting anyhow is that, if we dare to recalibrate the age estimates provided in fig. 1 so the D/E split fits the archaeologically documented OoA, which is c. 125-100 Ka BP, then E1b-M35 has a realistic age of 36-45 Ka BP, which fits very well with the early Upper Paleolithic and related Late Stone Age (LSA, as it's called in Africa). So it seems very plausible to me that E1b-M35 would be the main Y-DNA lineage associated with this prehistorical process (along with less important Asian lineages like J1, T, maybe even R1b, of more limited outreach).

  2. @ Lank,

    "The most ancient split within M35 is E-V68 vs. the rest."

    - Not quite, M35 has 2 variants, Z827 and V68. The V68 node is dated to 20 KYA (BEAST). The Z827 node is not given a date in the supplemental material, but the Z830 node is given a date of 20 KYA, which would by necessity make the Z827 node at least 20 KYA old, though obviously not older than the upstream M35 node at 25 KYA.

    "The next to split off is V257 (ancestor of M81), which just like V68 is mainly North African."

    - The 'next' to split off within Z827 is V257 from Z830; however, the date of this split is not given, but like I pointed out above it must be between 20 and 25 KYA. The actual node TMRCA of E-V257 is not given. Also note the presence of E-V257* in the Borana, something which was actually observed in Trombetta (2011). Curiously, the authors did not mention much about either E-V257/M81 or E-M123/M34; it seems that they were more focused on cataloging which lineage was 'Subsaharan' African and which wasn't.

    "But M123(xM34) has a notable presence only in Northern and Southern Egyptians,"

    That could be E-M4145; see the comments in the previous Ethiopian Haplogroup blog post.

    "So if anything, this paper provides a very strong case for a North African origin of E-M35."

    -Not according to the Bayesian phylogeographic probabilities they report. By that metric, E-M35 has a 64% chance of being from East Africa and about a 30% chance of being North African. You are better off saying the paper makes a strong case for E-Z827 being North African: the paper reports a ~70% chance of Z827 being North African, but then it says E-Z830 has a 50% chance of being EA (40% NA), so it's all over the place. Unless E-M35 was born in East Africa, then travelled north giving birth to Z827 and V68, then travelled back south giving birth to Z830, and then back north in the form of E-M123, in addition to a separate migration of E-M78 sublineages back down south from the E-V68 forefather, these probabilities don't really make sense. I need to look into the software (Yu et al. 2010), but it looks to me like this software is taking a frequency centroid approach that doesn't fully take into account many of the variables that need to be accounted for in determining the origin of a haplogroup. IMO, it is just more parsimonious to infer that E-M215/M35/Z827/Z830 were all born in East Africa, with 2 separate migrations, one resulting in V257 going North West and another with E-M123 going North East; of course this is in addition to the separate northern migration of E-V68, and the back migration of its sub-variants (V22, V32 & V12*).

    "I would like to see high resolution Omotic samples, though, it would be very interesting to see what their M35 lineages turn out to be."

    -I can agree on this point; without the addition of diverse Omotic samples, as well as south highland Cushitic samples, this story cannot be completed.

    1. Z827 being North African is rather significant. The only other branch of M35 in this paper's updated phylogeny, V68 (as well as its successful M78 descendant), also appears North African in origin. The slightly higher M35 probability still being in East Africa is most likely due to the presence of M215(xM35) (i.e. V16) in East Africa. But the M215 node is so old (~39 kya) that its origin (which is still highly ambiguous considering it's less than 10 kya removed from the TMRCA of all of P2) is not that important for finding the origin of M35 (TMRCA ~25 kya).

      The probability for Z830 being EA showing as slightly higher than NA is probably because one of Z830's descendants, V1515, is only found in East Africa (and farther south), whereas the other branch, M123, is found in East Africa as well as North Africa, and beyond. But, as you rightly point out, Z830 is rather close in age to Z827, which is likely North African. By contrast, V1515, the definitively East African Z830 sublineage, is dated to 12 kya, just 60% of Z830's age, less than 50% of Z827's age. M123 is most likely North African*, with M34 being the main sublineage that made it to West Asia and East Africa. This pattern strongly suggests a North African origin for Z830 as well, with V1515 being a relatively young lineage within Z830, or rather all of Z827.

      So IMO, the ancestor of M215 splits off from the rest of P2 ~48 kya, perhaps in the East African vicinity. M215 coalesces ~39 kya, with the ancestor of M35 migrating to North Africa at some point between 25-40 kya. In North Africa, the ancestors of Z827 and V68 diverge ~25 kya. In the next few thousand years, still within North Africa, the ancestors of Z830 and V257 start diverging from one another. The ancestor of M78 also starts diverging from other V68 lineages, about 20 kya, in North Africa. At some point, the ancestor of V1515 makes it to East Africa, and V1515 coalesces there 12 kya. V12 and V22, which have TMRCA dates of ~8-12 kya, most likely arrive in East Africa fully formed (V12 mainly as V32). Minor V1083(xV22) survives among the Saho.

      *I have some major doubts about Pagani's small sample of WGS Ethiopians having such high frequencies of M4145. This new study by Trombetta didn't find any M123(xM34) in East Africans, nor was it found in the high resolution data you reported from Plaster's thesis.

    2. Unfortunately, there is no age estimate for M34 or M123 in this paper. YFull estimates 15 kya for the TMRCA of M34. It would be nice to know the TMRCA of East African M34 lineages. If they turn out to be older than the other M35 lineages in the region (V1515/V12/V22), it's possible East African M34 lineages may be associated with the spread of Omotic languages, from a source around Egypt (Ehret's proto-Afroasiatics?). That would fit with the East African M34 being concentrated in the southern-central parts of the Horn.

    3. I forgot about YFull, thanks for reminding me, I have used it in the past and it has reliable TMRCA estimates, I have updated the blog post to reflect YFull’s estimates for the major nodes. What we can see in strict temporal terms is that from the E-M35 trunk, E-Z827 branches off first, in fact very close in time to when the parental E-M35 node itself occurred, then ~ 5 KYA later E-V68 branches off, and another 5-7 KYA later, probably the ‘incubation’ stage, E-M78 branches off from E-V68. Meanwhile, E-Z830 branches off first from the E-Z827 branch only ~5 KYA after its formation, and another 5 KYA later E-V257 branches off from E-Z827. Within the E-Z830 branch it seems that E-M34 branches off first and then E-V1515 follows. So that is the actual temporal sequence of events.

      What does this tell us in spatial terms? Z827 likely occurred very close to where M35 occurred, since they have practically the same TMRCA, and likely the only reason the Bayesian phylogeographic analysis is pointing to a high probability of Z827 occurring in North Africa is the high frequency of V257 (really M81) found in Berbers. However, we can see here that V257 actually branched off after E-Z830 did. I tried accessing the Yu et al. 2010 paper but it was closed access, but I don’t think these phylogeographic probabilities are to be fully trusted. For instance, it is saying that E-V13 has a 100% chance of being European, and not only that, the node which unites it with V22 has a 64% chance of being European in origin. I just think that is nonsense.

      With respect to Afroasiatic, like I speculated well over 2 years ago when the new E-M35 substructure started to emerge, the major variants of E-M35 overlap better with Lionel Bender’s classification of Afroasiatic than they do with Christopher Ehret’s. More specifically, Bender’s Macro-Cushitic sub-phylum, which includes Berber, Cushitic and Semitic, parallels well with the E-Z827 family, whereas Egyptian matches with E-V68, and the independent Omotic branch would be matched well with isolated lineages like E-M329, E-M281, and *possibly* the unclassified E-M35 variants found in the Maale. My E-M34 analysis, while showing TMRCAs comparable to or greater than those of non-Ethiopian E-M34 haplotypes overall, showed clearly that the E-M34 found in the Maale was much younger than those found anywhere else, including in the ‘Amhara’. This young age of M34 in the Maale does not correspond well with the ancient status Omotic has within Afroasiatic.

      Finally, the reason I pointed out E-M4145 is not necessarily because I believe Pagani’s results (I have my doubts as well), but to show that those E-M123(xM34) Egyptian haplotypes need to be tested for E-M4145 instead of automatically assuming some type of ‘ancestral’ relationship to E-M34.

    4. M329 and M281 are simply way too old to be associated with early Afroasiatics. It would be informative to analyze their "M35*" lineages, though, within this updated phylogeny. Also to understand how their M34 and J lineages (interestingly, J(xJ1,J2) was found in the Maale) relate to other Afroasiatic speakers.

    5. E-M329 and E-M281 would not be too old for Afroasiatic *if* Afroasiatic originated in the same region where these lineages putatively arose; they would simply have been left behind (with Omotic) and not taken part in the migration that took Afroasiatic (x Omotic) to other parts of Africa and the Levant.

      How haplogroup J fits into all this is still a largely unexplored (but important) area. Several STR-based analyses I have done (see here, here and here) of haplogroup J in Ethiopia point to a TMRCA of ~15 KYA, so perhaps non-Afroasiatic speaking peoples brought J into Ethiopia at a very early stage, mingled with pre-existing proto-Afroasiatic peoples within Ethiopia and continued to spread the phylum further outside of Ethiopia. This is pure speculation; however, it can gain traction if J variants are ever found outside of Ethiopia that are downstream from Ethiopian J variants.

    6. Why do you think J was seemingly lacking in South Cushites? South Cushites seem like they were mainly E-M293*, E-V32, E-V22 & T, based on the uniparental data of South Cushitic speakers like the Iraqw and substantially South Cushitic admixed populations like the Maasai, Tutsi or Kikuyus. None of these groups have so far shown a hint of Haplogroup J from what I know, yet J can be found all over the Horn: in East Cushites, Omotics and Ethio-Semites alike. It's rather puzzling...

    7. That's a very good question. Firstly, looking at the paper of this blog post, and even before this paper really, we have known that only a small portion of the diversity of E-M35 that is present in Ethiopia is found further south from Ethiopia, namely, as you mention, E-M293 to a large extent and E-M78 (in the forms of E-V32 and E-V22) to a smaller extent. The question then boils down to: if haplogroup J was already in the Horn by the time these migrations further south occurred, why didn’t J participate? You could, however, really ask the same question about the other subclades of E-M35 that did not participate in this migration further south.

      There are three options really: the first is that J did not exist in the Horn when these migrations started to occur; the second is that the populations in which haplogroup J was prevalent were not the ones directly related to these southern migrations; and the last is that J really did take part, but later on drifted out of the southern Cushitic populations via bottlenecks and whatnot. To me, the second option is the most likely.

      For any more clarity, haplogroup J in Ethiopia has to be very closely studied; unfortunately, the only sufficient data to date is from the Plaster thesis. I have conducted ASD-based TMRCA estimates on this rather low resolution Y-STR data, and interestingly, haplogroup J is more diverse in Omotic speakers (Shekecho, Gamo and Kefa) and central Semitic speakers (Gurage) than it is in Northern Ethiopian Semitic and Cushitic speakers (Amhara/Tigray and Agew), whereas the J found in the Afar is of an even lower diversity type. If J entered Ethiopia from the north, that is certainly not reconcilable with my estimates. Moreover, the TMRCAs found are also very ancient, certainly older than the putative migration of Cushitic or ancestral Cushitic speaking peoples to the south eastern parts of Africa, traditionally held at ~5 KYA.

    8. This comment has been removed by the author.

    9. My apologies for the abysmally late reply. At any rate, I for the most part find the second option most plausible as well, based on what you've said about how Haplogroup J stands in Ethiopia / the Horn of Africa. The idea that J coincidentally drifted out of Southern Cushites is possible, but I find it somewhat implausible. And again, based on what you've said and what I already knew about the Haplogroup J diversity in the Horn, it's hard for me to stomach that it didn't exist in the region when South Cushites started departing around 2 to 4 kya.

      Thank you for the detailed reply and sorry for taking so long to reply myself.
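
For readers unfamiliar with the ASD-based TMRCA estimates mentioned in the haplogroup J discussion above, here is a minimal sketch of the general idea, not the exact procedure used for the estimates in this thread. It assumes a stepwise mutation model, uses the modal haplotype as a stand-in for the ancestral one, and divides the average squared distance (ASD) from that haplotype by a per-locus mutation rate. The haplotypes, the rate of 0.002 per locus per generation and the 30-year generation time are all made-up illustrative values.

```python
from statistics import mean


def modal_haplotype(haplotypes):
    """Most common repeat count at each locus (ties -> smallest value)."""
    modal = []
    for alleles in zip(*haplotypes):
        counts = {}
        for allele in alleles:
            counts[allele] = counts.get(allele, 0) + 1
        top = max(counts.values())
        modal.append(min(a for a, c in counts.items() if c == top))
    return modal


def asd_tmrca_generations(haplotypes, mutation_rate, ancestral=None):
    """TMRCA in generations = ASD to the ancestral haplotype / mutation rate.

    ASD is the squared difference in repeat counts between each sampled
    haplotype and the ancestral one, averaged over samples and loci.
    """
    if ancestral is None:
        ancestral = modal_haplotype(haplotypes)  # assumption: modal ~ ancestral
    per_locus = [
        mean((h[i] - ancestral[i]) ** 2 for h in haplotypes)
        for i in range(len(ancestral))
    ]
    return mean(per_locus) / mutation_rate


# Four hypothetical 3-locus Y-STR haplotypes (repeat counts).
haps = [[13, 24, 10], [13, 25, 10], [14, 24, 10], [13, 24, 10]]
generations = asd_tmrca_generations(haps, mutation_rate=0.002)
years = generations * 30  # assumed generation time
print(round(generations, 1), round(years))
```

Because the estimate depends linearly on the assumed mutation rate and on how well the modal haplotype approximates the true ancestor, low resolution Y-STR data gives wide uncertainty, which is worth keeping in mind when comparing diversity between groups.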