What new cell biology can AI reveal just by looking at images? A lot!
AI learned how to recognize and classify different dog breeds from images. A new machine learning method from CZ Biohub now makes it possible to classify and compare different human proteins from fluorescence microscopy images. Credit: CZ Biohub

Humans like to look at images and find patterns. If you look at a collection of dog photos, you can sort them by ear size, face shape, and so on. Is it possible to compare them in a quantitative way? Is it possible for a machine to extract meaningful information from images that humans cannot?

A group of scientists at CZ Biohub has developed a machine learning method to answer those questions. According to their report in Nature Methods, the algorithm, dubbed "cytoself," provides rich, detailed information on the location and function of the cell's proteins. This capability could be used to speed up drug discovery and drug screening.

"We're applying artificial intelligence to a new kind of problem and still recovering everything that humans know, plus more," said Royer. The same approach might work for other kinds of images in the future, which opens up a lot of possibilities.

The power of machine learning has been demonstrated by the insights it has generated into cells, the basic building blocks of life. The cell is much more organized than we thought. It's an important biological result about how the human cell is wired.

The open source nature of the tool makes it accessible to all. Leonetti hopes that it's going to inspire a lot of people to solve their own image analysis problems.

Machines are capable of learning on their own.

Cytoself is an example of what is known as self-supervised learning, as opposed to supervised learning, in which humans label the images the algorithm learns from. Hirofumi Kobayashi, lead author of the study, said supervised learning is a lot of work and very tedious. Bias can also be introduced into the system if the machine is limited to certain categories.

The information was already in the pictures. We wanted to find out what the machine could do on its own.

The team, which also included a software engineer, was surprised by the amount of information the program was able to extract.

Leonetti's group is developing tools and technologies for understanding cell architecture. Each image is transformed into a mathematical vector, so images can be ranked by how similar they look. It was kind of surprising that we were able to predict with high specificity the proteins that work together in the cell just by looking at their images.
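The ranking idea can be pictured with a small sketch. It assumes each image has already been reduced to a short list of numbers and uses cosine similarity to compare them. The protein names, the numbers, and the choice of cosine similarity are all made up for illustration; this is not cytoself's actual code or data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length lists of numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, representations):
    """Return all other names, ordered from most to least similar to `query`."""
    scores = {
        name: cosine_similarity(representations[query], vec)
        for name, vec in representations.items()
        if name != query
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical three-number representations of four protein images.
# Real image representations would have many more dimensions.
representations = {
    "protein_A": [0.9, 0.1, 0.0],
    "protein_B": [0.8, 0.2, 0.1],
    "protein_C": [0.0, 0.1, 0.9],
    "protein_D": [0.1, 0.9, 0.1],
}

ranking = rank_by_similarity("protein_A", representations)
print(ranking)
```

In this toy example, the image that ranks closest to protein_A is the one whose numbers point in nearly the same direction, which is the sense in which images that look alike end up near each other.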

In this rotating 3D UMAP image, each point represents a single protein image, colored according to protein localization categories. Collectively it forms a highly detailed map of the full diversity of protein localizations. Credit: CZ Biohub

It's the first of its kind.

It is the first time that self-supervised learning has been used successfully on such a large dataset of over 1 million images.

The images were part of a project led by Leonetti that aims to create a complete map of the human cell. The first 1,310 proteins they characterized were published earlier this year in Science.

All of the images at opencell.czbiohub.org provide very detailed and quantitative information on the structure and function of the proteins.

Royer said that the question of what are all the possible ways a protein can be localized within a cell is fundamental. Over the years, biologists have tried to establish all the possible places and structures within a cell, but it has always been humans looking at the data. How much have human biases made this process flawed?

Royer said that machines can do it better than humans can. They can see the differences in the images that are very fine.

The next goal for the team is to track how small changes in the structure of the cell can be used to distinguish between normal and cancer cells. This could be the key to better understanding of many diseases.

Drug screening is largely trial and error. This is a big jump because you won't need to do as many experiments with the same number of proteins. It's a low-cost way to increase research speed.

More information: Hirofumi Kobayashi et al, Self-supervised deep learning encodes high-resolution features of protein subcellular localization, Nature Methods (2022). DOI: 10.1038/s41592-022-01541-z Journal information: Science, Nature Methods