Ginormous New 'Index' Shares Data From 100 Million Science Papers For Free

Mike Letterman

Oct 28, 2021, 7:15 PM

There is a lot of research available, and the volume keeps growing every day. But there is a problem: much of the literature is hidden behind paywalls, making it difficult to parse and make sense of in a complete, logical manner. What academic papers need is something like a super-smart version of Google.

The General Index is a database that holds 38 terabytes of uncompressed data drawn from 107.2 million journal articles. It contains more than 355 billion rows of text, and each row includes a key word or phrase taken from a published paper.

Carl Malamud, an archivist, described the Index as a "look-up tool," a dictionary of knowledge and a "map to knowledge," calling it "a tool we consider essential for science practice in modern times."

Although we have mentioned Google, the General Index is not a search engine. Scientists who want to use it will need to build their own search tools on top of it. It is better described as a structured, carefully organized catalog covering decades of scientific research.

Its primary function is to aid text mining, in which computers quickly scan millions of data points to locate and cross-link specific references. A computer program linked to the General Index can read and extract key pieces of information from millions of journal articles, something humans cannot do at that scale.
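To make the idea of cross-linking references concrete, here is a minimal sketch of a look-up built from keyword rows. It assumes each row pairs a key phrase with an article identifier. The row layout, the sample values and the function names are all hypothetical, chosen for illustration, and are not the General Index's actual file format or tooling.

```python
from collections import defaultdict

# Hypothetical rows: (key phrase, article identifier). The real General Index
# files may use a different layout; these values are made up for illustration.
ROWS = [
    ("chlorophyll", "article-001"),
    ("terpene synthase", "article-001"),
    ("terpene synthase", "article-002"),
    ("arabidopsis thaliana", "article-002"),
    ("chlorophyll", "article-003"),
]


def build_lookup(rows):
    """Map each lowercased key phrase to the set of articles that mention it."""
    lookup = defaultdict(set)
    for phrase, article_id in rows:
        lookup[phrase.lower()].add(article_id)
    return lookup


def articles_mentioning_all(lookup, phrases):
    """Return the articles that mention every phrase in `phrases`."""
    sets = [lookup.get(p.lower(), set()) for p in phrases]
    return set.intersection(*sets) if sets else set()


lookup = build_lookup(ROWS)
print(sorted(articles_mentioning_all(lookup, ["chlorophyll", "terpene synthase"])))
```

With the sample rows above, only one article mentions both phrases. At the scale the article describes, an in-memory dictionary like this would not be enough, and a real tool would more likely sit on top of a database.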

Other scientists have also expressed positive reactions. Gitanjali Yadav, a computational biologist at the University of Cambridge, UK, believes that the new database will help to solve the problem of restricted access.

Yadav told Nature that there is no way for her, or anyone else, to experimentally analyze or measure the chemical fingerprint of every plant species. "Much of what we need is already in the published literature."

Although the General Index is intended for searching terms such as plants, chemicals, genes, proteins and materials, the team behind it says that it still needs to be improved and expanded.

The data can be downloaded and used for free at the General Index portal, with no copyright and no restrictions. The Index contains only snippets from papers, not the papers themselves. To make sense of the information, however, you will need to be able to code.

Unlike the controversial Sci-Hub portal, the Index doesn't host papers in their entirety. Even so, questions have been raised about the legality of the project. Malamud believes that it is within legal limits.

Malamud stated that he is confident that the work he did was legal. "We aren't doing this to cause a lawsuit. We are doing it to advance scientific knowledge."
