Scientists at Meta, the parent company of Facebook and Instagram, have used an artificial intelligence model to predict the structures of hundreds of millions of proteins.

The program used a model originally designed for decoding human languages to make accurate predictions of the twists and turns that proteins take as they fold into their 3D structures. The predictions, which were compiled into the open-source ESM Metagenomic Atlas, could be used to help develop new drugs.

It's not the first program to make such predictions. AlphaFold, a program from DeepMind, was able to decipher the shapes of 200 million proteins. Meta's program, ESMFold, is reported to be 60 times faster than DeepMind's program, but the results have not yet been peer reviewed.

Scientists from DeepMind won a $3 million 'breakthrough prize' for their artificial intelligence.

The ESM Metagenomic Atlas will allow scientists to search and analyze the structures of metagenomic proteins at the scale of hundreds of millions of proteins. Researchers can use this to find structures that have not been characterized before, search for distant evolutionary relationships, and discover new proteins that can be useful in medicine and other applications.

Proteins, the building blocks of all living things, are made up of long, winding chains of amino acids that snap together in a variety of combinations to form the protein's 3D shape.

Knowing a protein's shape is the best way to understand its function, but there are a vast number of ways the same combination of amino acids, arranged in different sequences, can take shape. The number of possible configurations is around 10^300. The gold-standard way to determine a protein's structure is X-ray crystallography, but the method is slow and painstaking. More than 100,000 structures have been deciphered using X-ray crystallography.

To find a way around this problem, the Meta researchers turned to a computer model designed to decode and make predictions about human languages, and used it to make predictions about the language of protein sequences.

The researchers used a form of self-supervised learning called masked language modeling to train a language model. This approach requires the model to fill in the blanks in a passage of text. In this case, the language model was trained to fill in the blanks in protein sequences, across millions of different proteins. Information about the structure and function of proteins emerged from this training.
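To make the fill-in-the-blanks idea concrete, here is a minimal sketch of how masked training examples could be built from a protein sequence. It is not Meta's code: the function names, the "_" placeholder, the 15% default mask rate and the toy sequence are all assumptions chosen for illustration, and no actual model is trained.

```python
import random

MASK = "_"  # placeholder for a hidden residue (an assumption of this sketch)


def mask_sequence(sequence, mask_rate=0.15, seed=None):
    """Hide a random subset of residues.

    Returns (masked_sequence, targets), where targets maps each hidden
    position to the residue a model would be asked to recover.
    """
    if not sequence:
        return "", {}
    rng = random.Random(seed)
    # Mask at least one position, and never more than the sequence length.
    n_masked = min(len(sequence), max(1, round(len(sequence) * mask_rate)))
    positions = sorted(rng.sample(range(len(sequence)), n_masked))
    targets = {i: sequence[i] for i in positions}
    masked = "".join(MASK if i in targets else aa
                     for i, aa in enumerate(sequence))
    return masked, targets


def fill_in(masked, targets):
    """Put the hidden residues back; a trained model would predict them instead."""
    return "".join(targets.get(i, ch) for i, ch in enumerate(masked))


if __name__ == "__main__":
    toy_protein = "GLSKKEAAHYLG"  # made-up sequence, for illustration only
    masked, targets = mask_sequence(toy_protein, mask_rate=0.25, seed=0)
    print(masked)    # the sequence with three residues replaced by "_"
    print(targets)   # the hidden residues, keyed by position
    print(fill_in(masked, targets) == toy_protein)
```

In real training, the model sees only the masked sequence and is scored on how well it guesses the hidden residues; repeating this over millions of sequences is what pushes it to learn patterns that relate to structure and function.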

The scientists used a database of metagenomic DNA taken from places as diverse as soil, seawater and the human gut and skin. In just two weeks, the researchers predicted the structures of over 617 million proteins with the help of the ESMFold program.

That's over 400 million more than AlphaFold had deciphered when DeepMind said it had figured out the structure of almost every known protein. Many of these proteins have never been seen before because they come from unknown organisms. The program has been able to predict the shapes with an accuracy down to the level of atoms.

The researchers hope to use the program for more protein-focused work. Meta says it is studying how language models can be used to help solve challenges in health, disease and the environment.