Researchers from the University of Oxford's Big Data Institute have taken a major step towards mapping the entirety of genetic relationships among humans. The study has been published.

The past two decades have seen incredible advances in human genetic research, generating genomic data for hundreds of thousands of individuals, including thousands of prehistoric people. This raises the possibility of tracing the origins of human genetic diversity to create a complete map of how individuals across the world are related to each other.

Until now, the main challenges to this vision were working out a way to combine genome sequences from many different databases and developing a way to handle data of this size. A new method published today by researchers from the University of Oxford's Big Data Institute can easily combine data from multiple sources.

The researchers have built a huge family tree, a genealogy for all of humanity that models as exactly as possible the history that generated all the genetic variation found in humans today. This genealogy shows how any one person's genetic sequence relates to everyone else's.

The ancestry of each point on the genome can be thought of as a tree, since each individual genomic region is inherited from only one parent. The set of trees along the genome is known as a tree sequence or an ancestral recombination graph.
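As a rough illustration of the idea, the Python sketch below stores a toy tree sequence as a list of edges, each valid over an interval of the genome, and recovers the local tree at any position. The `Edge` class, the node numbers and the helper functions are hypothetical and invented for this example; this is not the researchers' software, only a minimal sketch of the data structure described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    left: float    # start of the genomic interval (inclusive)
    right: float   # end of the genomic interval (exclusive)
    parent: int    # id of the ancestor node
    child: int     # id of the descendant node


# Toy data (made up): samples 0, 1, 2 and ancestors 3, 4, 5 on a genome of length 10.
EDGES = [
    Edge(0, 10, 3, 0),  # samples 0 and 1 descend from ancestor 3 everywhere
    Edge(0, 10, 3, 1),
    Edge(0, 5, 4, 2),   # left half: ancestor 4 joins sample 2 with ancestor 3
    Edge(0, 5, 4, 3),
    Edge(5, 10, 5, 2),  # right half: a different ancestor, 5, joins them
    Edge(5, 10, 5, 3),
]


def local_tree(edges, position):
    """Return the tree at `position` as a {child: parent} mapping.

    Each child has exactly one parent here, which is what makes the
    ancestry at a single point a tree.
    """
    return {e.child: e.parent for e in edges if e.left <= position < e.right}


def lineage(tree, node):
    """List `node` followed by its ancestors, up to the root."""
    path = [node]
    while node in tree:
        node = tree[node]
        path.append(node)
    return path


def common_ancestor(tree, a, b):
    """Return the most recent common ancestor of nodes a and b, or None."""
    seen = set(lineage(tree, a))
    for node in lineage(tree, b):
        if node in seen:
            return node
    return None


if __name__ == "__main__":
    for position in (2, 7):
        tree = local_tree(EDGES, position)
        print(position, tree, common_ancestor(tree, 0, 2))
```

In this toy example, samples 0 and 2 trace back to ancestor 4 on the left half of the genome and to ancestor 5 on the right half, while the two edges below ancestor 3 are stored only once even though they belong to both trees. Sharing edges between neighbouring trees in this way is one reason such a representation can stay compact.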

The lead author of the research is a researcher at the Broad Institute of MIT and Harvard, who undertook the research as part of his PhD at the Big Data Institute. The approach reconstructs the genomes of our ancestors and makes it possible to estimate when and where these ancestors lived. Its power is that it makes very few assumptions about the underlying data and can include both modern and ancient DNA samples.

The study integrated data on modern and ancient human genomes from eight different databases and included a total of 3,609 individual genome sequences from 215 populations. The ancient genomes included samples from all over the world. The method predicted where common ancestors must be present in the evolutionary trees, and the resulting network contained almost 27 million ancestors.

The authors then used the network to estimate where the predicted common ancestors had lived. The results recaptured key events in human evolutionary history, including the migration out of Africa.

The research team plans to make the genealogy map even more comprehensive by incorporating genetic data as it becomes available. The dataset could easily accommodate millions of additional genomes because of the efficient way in which tree sequences store data.

This study is laying the groundwork for the next generation. As the quality of genome sequence from modern and ancient DNA samples improves, the trees will become even more accurate and we will eventually be able to generate a single, unified map that explains the descent of all the human genetic variation we see today.

While humans are the focus of the study, the method is valid for most living things. It could be beneficial in medical genetics in distinguishing true associations between genetic regions and diseases from spurious connections arising from our shared ancestral history.

The materials are provided by the University of Oxford. Content can be edited for style and length.