We’re excited to announce the release of Genetic Groups, a long-awaited enhancement of ethnicity results on MyHeritage DNA. With this very exciting addition, the resolution of MyHeritage’s ethnicity breakdown increases dramatically to 2,114 geographic regions, providing more depth and resolution than any other DNA test available today, and complementing the current 42 top-level ethnicities. This is a huge milestone for MyHeritage and a great step for millions of people fascinated by family history and curious to learn more about their origins.
This post describes the new release and is full of interesting real examples of Genetic Groups that our users have received.
Genetic Groups have been added for free to the ethnicity results for anyone who has already purchased a MyHeritage DNA test, and will be added to the results going forward for any newly purchased test. They are also available to anyone who has uploaded DNA data from another service, or who uploads it now.
For each Genetic Group MyHeritage provides a drill-down page with detailed genealogical insights. It includes a description of the group, a heatmap showing the top places where the group’s members lived during different time periods (indicating the group’s migration patterns), common ancestral surnames and given names in the group, the most prevalent ethnicities among the group’s members, and other Genetic Groups that have close affinity to the group.
Users can review this information not only for Genetic Groups that they have been found to be members of, but also for any of the other Genetic Groups on MyHeritage.
As the most popular DNA test internationally, particularly in Europe, MyHeritage has the unique advantage of having tested millions of users around the world who still live in their original communities (and have not migrated to other countries and married into totally different populations). This unparalleled data set allowed MyHeritage researchers to create and classify foundational Genetic Groups that can then be used to predict membership and accurately reveal the origins of millions of other users. By applying machine learning and other algorithms to the DNA data and the family trees associated with those DNA kits, our research teams were able to identify the distinct Genetic Groups and reveal the story of each one.
Genetic Groups are fascinating and can offer insights into the paths your ancestors traveled over generations. They can point to a province, district, or region that your family originated from, or show migration from one location to another, and often tell you more about the family story.
For example, one of the 42 ethnicities available in MyHeritage DNA’s Ethnicity Estimate is Scandinavian. Until now, a user could have received ethnicity results showing a certain percentage of Scandinavian, and that’s it. But now, with Genetic Groups, results become much more granular and can pinpoint this user’s origins more specifically to Sweden, Norway, Denmark, or Iceland, and go deeper to pinpoint specific provinces and towns where the user’s ancestors lived, from among 100 Genetic Groups from Sweden, 103 from Norway, 49 from Denmark, and 4 from Iceland.
Genetic Group Founder Populations
Genetic Group Founder Populations are groups of people who lived in the same area for centuries, only marrying within that group, or perhaps migrating as a group. Over time, they formed unique genetic signatures.
The descendants of these founder populations have telltale DNA segments that they inherited from the group’s founding fathers and mothers. The MyHeritage DNA test detects these signatures, and by identifying millions of DNA microsegments and pinpointing to which Genetic Groups those segments belong, our research teams have been able to associate MyHeritage DNA users with those Genetic Groups.
Genetic Groups differ from Ethnicity Estimates in that a particular group can comprise one or several ethnicities. Members of a group share geographic origins, but they may come from diverse ethnic backgrounds. For example, a Genetic Group in Australia may consist of individuals whose ancestors originated in England and Germany but who, after migrating to the same city, blended together and formed a new group of their own.
Being a member of a Genetic Group does not come with a percentage; you are either a member of the group or you are not. If you are a member of a Genetic Group, it means that we found a sufficient number of segments in your DNA that associate you with that group.
Each Genetic Group of which you are considered a member is assigned a confidence level based on your DNA segment information. Membership in a Genetic Group with low confidence can still be very interesting; it means that you are more distantly descended from that group.
You may be a member of several Genetic Groups because you may have inherited DNA from the founding fathers or mothers of multiple Genetic Groups. While the vast majority of our users will see one or more Genetic Groups in their results following this release, some users won’t receive any Genetic Groups in their results, either because they are descendants of ethnicities or groups that rarely use MyHeritage, or because their membership confidence levels were below our thresholds. This is our first release of Genetic Groups and we plan on improving them significantly in the future, adding more data, and providing more resolution, so there will be more groups of higher quality later on.
How we did it
The release of Genetic Groups would not have been possible without the huge efforts of the MyHeritage Science, R&D, and Product teams, led by Ran Snir, with incredible contributions by Ronnie Harpaz, Uri Gonen, MyHeritage’s CEO Gilad Japhet, and many others. The science behind this feature was developed by Regev Schweiger, Alon Diament Carmel, and Tal Shor, and additional scientists under the guidance of Yoav Naveh and Prof. Yaniv Erlich.
In a process that took three years, our teams aggregated a massive data set of DNA kits on MyHeritage and used it to create algorithms that determine the Genetic Groups. These algorithms were applied to a huge reference set of 1.7 million DNA kits to cluster the kits into groups based on millions of microsegments that were inherited from each group’s founding mothers and fathers.
We looked for connections between the members of a group to reveal and document each group’s unique story. We did this by leveraging aggregated ancestral metadata from the family trees associated with the DNA kits in the reference set, together with the genetic ethnicity results of the group members. The end result is that we merged some of the groups that were very similar into larger groups, and split some groups into smaller ones. This process included multiple iterations of fine-tuning, and at its completion more than 2,100 distinct Genetic Groups emerged.
We then examined an additional two million DNA kits that were not part of the original reference set to validate the accuracy of the Genetic Groups, and performed final calibration.
Like father, like son
We then researched each and every one of the 2,114 groups, examining geography, history, and migration patterns spanning centuries. We curated the groups and gave each group a meaningful name and description that reflects the story of that group. This meticulous task was led by our Senior VP of Product Management, Uri Gonen, who was one of the earliest employees of MyHeritage. Uri played a crucial role in bringing this feature to life, dedicating his work on this project to his late father, Amiram Gonen, who was a professor of geography. Prof. Gonen authored a pioneering book, Encyclopedia of the People of the World, published 27 years ago, which very much resembles the work that Uri has now completed in identifying the Genetic Groups of the world.
Genetic Groups at MyHeritage are Unrivaled
- MyHeritage has the largest number of Genetic Groups offered by any consumer DNA testing company, providing the greatest depth and resolution for ethnicity results.
- The accuracy of our Genetic Groups is very high. While Ethnicity Estimates are tied to a much smaller reference panel, Genetic Groups on MyHeritage are based on a reference panel of 1.7 million DNA kits, which allows for far finer resolution.
- As the international market leader in DNA testing, MyHeritage has sold more DNA kits outside of the United States, and particularly throughout Europe, than any other service. This has enabled us to test many elderly people in Europe who still live in the same town as their ancestors. By testing populations that have not migrated for generations, we were able to identify the unique DNA segments in founder populations and define Genetic Groups with greater precision and accuracy.
- MyHeritage is used by millions of people who have not only tested their DNA with MyHeritage, but who have also built family trees on our platform. This combination of DNA tests and family trees facilitated the research necessary to identify and label the Genetic Groups.
- Genetic Groups can be identified based on the native region of a specific population, for example “Venice, Italy”, or based on a migration pattern such as “German Settlers in Georgia, USA”. This can give users a better idea of their origins dating back generations, allowing them to trace their family history back to founder populations from many centuries ago, and not only limit them to more recent migration events.
Unique Genetic Groups on MyHeritage
The outstanding resolution of Genetic Groups and the innovative technology that powers this feature mean that MyHeritage is now able to identify many populations that have never before been detected by any consumer DNA test.
Here are a few examples of interesting and unique Genetic Groups that we now offer:
- Norway (Kvam and Bergen) — Norwegians in Norway (Kvam and Bergen, Vestland) and some of their descendants in the United States (Minnesota)
- Volga Germans in USA (Ellis, Kansas) — Volga German settlers from Southwestern Germany in Russia (Samara) and their descendants in the United States (Ellis County, Kansas)
- Acadian settlers in USA (Aroostook, Maine) and in Canada (Madawaska, New Brunswick, and Quebec) — Acadian settlers in the United States (Aroostook County, Maine) and in Canada (Madawaska County, New Brunswick, and some in Quebec)
- Italy (Potenza) — Italians in Italy (Potenza, Basilicata) and some of their descendants in the United States
- Netherlands (Friesland) — Dutch in Netherlands (Friesland)
- Ireland (Galway and Mayo) — Irish in Ireland (County Galway and County Mayo) and their descendants in the United States (Western Pennsylvania)
- Brazil (São Paulo) — Portuguese and Italian settlers from Portugal and Northeastern Italy in Brazil (São Paulo)
- Philippines (Ilocos and Central Luzon) — Filipinos in the Philippines (Ilocos and Central Luzon) and some of their descendants in the United States (Hawaii and California)
- Māori in New Zealand and in England — Māori in New Zealand and some in England
- Louisiana and Southeastern Texas — African Americans in the United States (Louisiana and some in Southeastern Texas)
- Eastern Morocco and Western Algeria — Moroccans and Algerians in Eastern Morocco and some in Western Algeria
- Sri Lanka — Sri Lankans in Sri Lanka
Accessing Genetic Groups on MyHeritage
To see your Genetic Groups, go to the “Ethnicity Estimate” section under the DNA tab on the navigation bar. We will also send an email announcement to every user with a DNA kit on MyHeritage, inviting them to view their Genetic Groups.
Genetic Groups are part of the Ethnicity Estimate in the DNA results, complementing the 42 high-level ethnicities with percentages. The high-level ethnicities have not changed in this update, but the Ethnicity Estimate page itself has a new and improved user interface. If you have Genetic Groups, they will appear nested under your ethnicities in the left-hand panel.
If a Genetic Group seems to have a prominent ethnicity for its members and the user has that ethnicity prominently too, the Genetic Group will be nested below that ethnicity in the results. Otherwise it will be listed separately at the bottom under the heading “Additional Genetic Group(s)”.
If you manage multiple DNA kits, you can easily switch from one to the other by using the drop-down menu next to your name at the top of the page.
Genetic Groups are displayed on the map as polygons with a thin border around them (as opposed to the ethnicities which are displayed without borders). In the example below, the user received an ethnicity result of 97.7% “Japanese and Korean”, but one of the Genetic Groups under it provides much higher resolution.
You may be a member of several Genetic Groups. If you hover over the names of the Genetic Groups on the left panel, or the polygon for that group on the map, you will also see a description and the confidence level for that group.
The new interface includes a helpful confidence slider above the list of ethnicities that allows you to select the confidence level of the Genetic Groups displayed for your DNA kit.
When you choose a level, we will display the Genetic Groups whose confidence is equal to or above that level. There are three confidence levels: low, medium, and high. The default confidence level is the highest confidence level that displays at least three Genetic Groups.
If you choose low, you will see all Genetic Groups you are a member of, regardless of confidence. If you choose medium, you will see Genetic Groups that you are a member of with medium to high probability. If you choose high, then you will only see Genetic Groups that you have a high probability of being a member of. In the above example, the default level of medium shows 4 out of 6 Genetic Groups. Moving the confidence slider to low shows all 6 groups that this user belongs to, and moving it to high shows only 1 of the 6 groups.
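To make the slider rules concrete, here is a minimal Python sketch of the filtering and default-level logic as described above. It is an illustration only and not MyHeritage’s actual code: the names `Confidence`, `visible_groups`, and `default_level` are hypothetical, the group names are placeholders, and the fallback to low when no level shows three groups is an assumption, since that case is not described here.

```python
from enum import IntEnum


class Confidence(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def visible_groups(groups, level):
    """Return the names of groups whose confidence is at or above `level`."""
    return [name for name, conf in groups.items() if conf >= level]


def default_level(groups, minimum=3):
    """Pick the highest level that still displays at least `minimum` groups.

    Falling back to LOW when no level reaches `minimum` is an assumption;
    the post does not describe that case.
    """
    for level in sorted(Confidence, reverse=True):
        if len(visible_groups(groups, level)) >= minimum:
            return level
    return Confidence.LOW


# Hypothetical kit mirroring the example: 1 high, 3 medium, 2 low.
example = {
    "Group A": Confidence.HIGH,
    "Group B": Confidence.MEDIUM,
    "Group C": Confidence.MEDIUM,
    "Group D": Confidence.MEDIUM,
    "Group E": Confidence.LOW,
    "Group F": Confidence.LOW,
}

print(default_level(example).name)                       # MEDIUM
print(len(visible_groups(example, Confidence.MEDIUM)))   # 4
```

With these placeholder values, high shows 1 group, medium shows 4, and low shows all 6, which matches the counts in the example.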
The confidence level depends on the quality and quantity of the segments that you share with the genetic data that represents each group. We have algorithms in place that assign each group in your results a confidence level reflecting the probability that you are part of that group. Since this analysis is segment based, we are contemplating showing the segments that associate you with a particular Genetic Group in the Chromosome Browser on MyHeritage in the future.
In some cases, where all Genetic Groups have the same confidence level, we will not display the confidence slider. For example, if you have four Genetic Groups, and all are medium confidence level, then we will not display the confidence slider above your ethnicity results, as it will be redundant.
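The rule for hiding the slider can be expressed as a one-line check. This is a sketch under the assumption that each group carries exactly one of the three level labels; the function name `show_confidence_slider` is hypothetical.

```python
def show_confidence_slider(confidence_levels):
    """Return True only when the groups span more than one confidence level.

    If every group has the same level, moving the slider could not change
    which groups are displayed, so the slider would be redundant.
    """
    return len(set(confidence_levels)) > 1


# Four groups, all medium: the slider is hidden.
print(show_confidence_slider(["medium", "medium", "medium", "medium"]))  # False
# Mixed levels: the slider is shown.
print(show_confidence_slider(["low", "medium", "high"]))                 # True
```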
Hierarchy of Genetic Groups and Ethnicities
Genetic Groups utilize a personalized hierarchy. This means that each Genetic Group that you are a member of will be nested under an ethnicity that best fits that Genetic Group and your own DNA results.
For example, the Genetic Group of Malta has a mix of Italian and North African heritage. If you are a member of the Maltese group and have a large percentage of Italian ethnicity, the Maltese group will be nested under your Italian ethnicity. However, if you are a member of the Maltese group and have North African heritage but no Italian origins, the same Genetic Group will be nested under your North African ethnicity.
In the example below, 5 high-confidence Genetic Groups were nested under the ethnicity Irish, Scottish, and Welsh, which is this user’s primary ethnicity:
If there is no predominant ethnicity that you share with a Genetic Group, it will be placed under a new category at the bottom of your Ethnicity Estimate results, called “Additional Genetic Groups” (or “Genetic Groups” if all your groups are listed there).
We will only place your Genetic Groups under a specific ethnicity if we are confident that it belongs there. For example, if you are part of the Maltese group, and you have very similar ethnicity percentages for both Italian and North African ethnicity (which are both common to this Genetic Group), then we cannot determine which one is more appropriate for the Genetic Group to be placed under, and instead your Maltese Genetic Group will appear under “Additional Genetic Groups”.
Genetic Groups are tied to a population rather than to any one ethnicity, and within that population there can be multiple ethnicities. Please note that the placement of a Genetic Group does not necessarily explain why you are a member of that group, and once ethnicities on MyHeritage are refined (see more on that below), the placement of the Genetic Group might change in your results.
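The placement rules in this section can be sketched roughly as follows. This is a simplified illustration and not the production logic: the function `place_group`, the `min_margin` threshold of 5 percentage points, and the sample percentages are all assumptions, and the sketch treats any non-zero percentage as a candidate rather than modeling what counts as a “prominent” ethnicity.

```python
ADDITIONAL = "Additional Genetic Groups"


def place_group(group_ethnicities, user_percentages, min_margin=5.0):
    """Choose the ethnicity heading under which a Genetic Group is nested.

    group_ethnicities: ethnicities prominent among the group's members.
    user_percentages: the user's Ethnicity Estimate as {ethnicity: percent}.
    min_margin: hypothetical lead (in percentage points) that the best
        candidate needs over the runner-up to count as a confident fit.
    """
    # Rank the group's ethnicities by how much of each the user has.
    candidates = sorted(
        ((user_percentages.get(e, 0.0), e) for e in group_ethnicities),
        reverse=True,
    )
    candidates = [(pct, e) for pct, e in candidates if pct > 0]
    if not candidates:
        return ADDITIONAL  # the user shares no ethnicity with the group
    if len(candidates) > 1 and candidates[0][0] - candidates[1][0] < min_margin:
        return ADDITIONAL  # two ethnicities are too close to call
    return candidates[0][1]


maltese = ["Italian", "North African"]
print(place_group(maltese, {"Italian": 60.0, "Greek": 40.0}))          # Italian
print(place_group(maltese, {"North African": 35.0, "Iberian": 65.0}))  # North African
print(place_group(maltese, {"Italian": 30.0, "North African": 28.0}))  # Additional Genetic Groups
```

The three calls correspond to the three Maltese scenarios described above: mostly Italian, North African without Italian, and near-equal percentages of both.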
Additional information about each Genetic Group
In addition to listing the Genetic Groups to which you belong, MyHeritage provides a drill-down page with additional genealogical insights about each group, including the top places its members lived during different time periods throughout the past few centuries, common ancestral surnames and given names, common ethnicities among the group’s members, and more.
Click on any Genetic Group’s name in the left-hand panel or on the polygon of the Genetic Group on the map, and a detailed page will display more information about that group. The default time period is 1900–1950 and you can select other time periods as you wish.
Use the arrows above the group name to move from group to group in your results.
Let’s drill down further to see what information is included in this page.
About this group
Under the group name, the description of the group is displayed.
You will also see the number of DNA kits used to form the group, and the number of DNA kits which were linked to family trees. The number of reference kits used to define each Genetic Group can help you understand which groups are more common among MyHeritage users. While some groups are large and based on thousands of kits, others are more exotic and are based on a small number of kits. Both can tell an equally compelling story, and be interesting to examine further.
Under that, the confidence level for this group in your results will be listed.
Check the “Expand all” box to show all the data that we have for each of the categories.
Top places
This section shows countries and cities around the world where members of the Genetic Group lived during a specific time period. You can drill down to view the group’s precise whereabouts during different time periods from the 17th century until today. This section corresponds with the map on the right which displays these places.
Use the “time period” selector above the map to see different distributions of places, both in the list on the left and on the map, for different time periods.
We plan to improve the list of Top places in the next few weeks so that it will have a more organized hierarchy and countries will receive more accurate “votes” from the cities beneath them, so that this section in the page will provide more useful information.
Common ancestral surnames
Members of Genetic Groups often have distinctive ancestral surnames that tell a lot about the group’s identity. If you have any of those ancestral surnames in your family tree, it may further corroborate your link to the group. The bars indicate how common these surnames are in the family trees used to form this group.
Common given names
Members of Genetic Groups can often have distinctive given names. The bars indicate how common these given names are in the family trees used to form this group.
Common ethnicities
Each Genetic Group consists of individuals from one or more ethnicities. It can be interesting to see which groups contain fewer ethnicities and are more homogeneous, and which are more mixed. The bars indicate the ethnicities common among the DNA kits used to form this group.
Related groups
These are groups that have a strong affinity to the Genetic Group you are currently viewing. Each related group may have genetic similarities to your group due to geographic proximity and potential marriage between members of the two groups over generations.
Any related groups listed that you are also a member of are displayed with a colored icon. The related groups that you are not a member of will be shown with a grey icon.
Exploring all Genetic Groups
You can explore all 2,114 Genetic Groups on MyHeritage, including all those that you are not a member of. To do so, click “Show all available regions” at the bottom of the left-hand panel in the Ethnicity Estimate page. MyHeritage will then list all available Genetic Groups beneath an ethnicity hierarchy, displaying how many Genetic Groups are found under each ethnicity.
Next, click any of the ethnicities listed. MyHeritage displays it in an enhanced page that includes a representative image (this is new in this update). The page first lists the Genetic Groups you have under this ethnicity (if any), and then all Genetic Groups under that ethnicity. Click any of them to explore it. You can also hover the mouse over a group to get a description tooltip and see its main polygon on the map.
You can click the sound icon to play a short and sweet original tune that we composed for each ethnicity to get a taste of its culture.
Genetic Groups are available for free to anyone who has already tested with MyHeritage DNA. They have been rolled out to all our users in the Ethnicity Estimate page.
If you tested with another service and uploaded your DNA results to MyHeritage before December 16, 2018, you were grandfathered in and have free access to advanced DNA features forever, and you will be able to view Genetic Groups for free under your Ethnicity Estimate.
If you uploaded your raw DNA data to MyHeritage from another testing service after that date, and do not currently have access to the Ethnicity Estimate, you may pay a one-time unlock fee of $29 per kit and gain full access to advanced DNA features (AutoClusters, Theory of Family Relativity, Chromosome Browser, Shared ancestral places, and more), including Genetic Groups. Alternatively, you may purchase a MyHeritage Complete plan, which grants you access to advanced DNA features (including Genetic Groups) for all DNA kits that you manage, as well as many other benefits such as unlimited family tree size, access to MyHeritage’s 12.7 billion historical records, automatic Record Matches and Smart Matches, and much more. Learn more about our subscription plans here.
If you haven’t yet taken a MyHeritage DNA test, order your kit today.
If you have tested your DNA with another service and have not yet uploaded it to MyHeritage, you may upload it now. Then you can unlock your Genetic Groups, which will be calculated for you overnight.
More detailed Genetic Groups
We plan to continue to enrich Genetic Groups on MyHeritage over time, and add new information about each Genetic Group using the wealth of information on MyHeritage. More importantly, we plan to re-calculate Genetic Groups in the future with even more reference DNA kits, perhaps all 4.5 million DNA kits that are currently in our database – which means that the resolution and the number of Genetic Groups supported will increase and improve.
But first, we would like to receive the community’s feedback and make improvements to the first release. If you have suggestions for improving one or more of your Genetic Groups, for example if you believe it is not accurately named or described, we’d love to hear from you. Please contact us at email@example.com and send us your feedback.
Better Ethnicity Estimates
The Genetic Groups update does not modify the top-level Ethnicity Estimates on MyHeritage, that is, the 42 ethnicities with percentages that we released 4 years ago. The ethnicity calculation is done separately from Genetic Groups, using older technology that is in need of an overhaul. We are working on replacing the Ethnicity Estimates right now, and in 2021 we plan to roll out much better ones based on totally new technology. Once we do that, Genetic Groups will improve too, as their descriptions will become more accurate. The number of ethnicities will change and the percentages will become much more accurate. Stay tuned!
Genetic Groups are fascinating and can enhance genealogical research, giving us a renewed appreciation for our ancestors, the paths they traveled, and the lives they lived.
The Genetic Groups feature took three years of development to come to fruition. With a reference set of 1.7 million DNA kits and a validation set of 2 million DNA kits, along with complex algorithms, we were able to identify and document 2,114 distinct groups, and then predict which groups each DNA kit on MyHeritage belongs to. This feature makes the ethnicity reports on MyHeritage more useful, helping you zoom in with greater resolution on your ancestors’ geographic locations and migration patterns.
Our work is far from complete, and our next target within the realm of ethnicity is to replace the high-level percentage-based Ethnicity Estimates with better technology.
This work would not have been possible without you – our users. We look forward to serving you even better in the future.
Source: My Heritage