Meta AI unlocks hundreds of millions of proteins to aid drug discovery

Facebook parent company Meta Platforms Inc. has developed a tool that uses artificial intelligence to predict the structure of hundreds of millions of proteins. Researchers say the tool promises to deepen scientists’ understanding of biology and perhaps speed up drug discovery.

Meta’s research arm, Meta AI, used a new AI-based computer program called ESMFold to create a public database of 617 million predicted protein structures. Proteins are the building blocks of life and of many drugs, and they are needed for tissues, organs and cells to function.

Drugs based on proteins are used to treat heart disease, certain types of cancer and HIV, among other diseases, and many pharmaceutical companies have started to develop new drugs with artificial intelligence. Using AI to predict protein structures should not only increase the effectiveness of existing drugs and drug candidates, but also help discover molecules that could treat diseases for which a cure has so far proved elusive.

With ESMFold, Meta is competing against another protein-prediction computer model, AlphaFold, from DeepMind Technologies, a subsidiary of Google parent company Alphabet Inc. DeepMind said last year that the AlphaFold database contains 214 million predicted protein structures that could help speed drug discovery.

According to Meta, ESMFold is 60 times faster than AlphaFold but less accurate. The ESMFold database is larger because it includes predictions made from genetic sequences that had not previously been examined.

Predicting a protein’s structure can help scientists understand its biological function, according to Alexander Rives, co-author of a study published Thursday in the journal Science and a researcher at Meta AI. Meta previously published the paper describing ESMFold on a preprint server in November 2022.

“Often proteins with similar structures have similar biological functions,” said Dr. Rives. “And if you can have a really high-resolution structure, then you can start thinking about what the actual biochemical function of these proteins is.”

About a third of the protein structures in the ESMFold database were predicted with high confidence, according to Meta.

The quest to predict protein structure and then function has been ongoing for the past decade. Because proteins constantly fold and refold before forming their final structure, determining protein structures has been difficult and costly for scientists. Instead of relying on microscopes that can image protein structures at the atomic level, the new AI models predict protein shapes in hours or days rather than months or years.

Meta researchers generated the predictions using a form of AI known as a large language model, which can predict text from just a few letters or words. It’s the same technology that enables OpenAI’s ChatGPT to generate human-like responses.



The Meta scientists gave the ESMFold program series of letters representing the amino acids specified by a protein’s genetic code. The AI model then learned how to fill in sections of the sequence that were blank or hidden. Once it could complete sequences, ESMFold learned the relationship between known protein sequences and structures already well understood by scientists, and used it to predict the structures of new protein sequences.
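The fill-in-the-blank step described above can be illustrated with a minimal Python sketch. It is not Meta’s code: the example sequence is made up, and the `_` placeholder, the 15% masking rate and the function names are assumptions chosen for illustration. The sketch only shows how a training example might be built by hiding some letters of a sequence and keeping the hidden letters as the answers a model would be asked to recover.

```python
import random

MASK = "_"  # hypothetical placeholder marking a hidden amino acid


def mask_sequence(sequence, mask_fraction=0.15, seed=0):
    """Hide a random subset of letters.

    Returns the masked string and a dict mapping each hidden
    position to the letter that was removed. The 15% default is
    an assumption, not a figure from the article.
    """
    rng = random.Random(seed)
    n_masked = max(1, round(len(sequence) * mask_fraction))
    positions = rng.sample(range(len(sequence)), n_masked)
    targets = {i: sequence[i] for i in sorted(positions)}
    masked = "".join(
        MASK if i in targets else letter
        for i, letter in enumerate(sequence)
    )
    return masked, targets


def fill_in(masked, answers):
    """Put letters back at the hidden positions (what a trained model would guess)."""
    letters = list(masked)
    for position, letter in answers.items():
        letters[position] = letter
    return "".join(letters)


# A made-up sequence of one-letter amino acid codes, for illustration only.
example = "MKTAYIAKQRQISFVKSHFSRQ"
masked, targets = mask_sequence(example)
print(masked)   # the sequence with a few letters hidden
print(targets)  # the hidden letters a model would learn to predict
```

During training, a language model would see only the masked string and be scored on how well it guesses the hidden letters; here `fill_in` simply restores the known answers to show that the pair carries enough information to rebuild the original sequence.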

Meta scientists say the power of ESMFold lies in the speed with which it can predict protein structures, allowing researchers to search through large genetic databases to find possible applications in medicine, health, nutrition and the environment.

“This is a major achievement, but a lot depends on previous work,” said Olexandr Isayev, a computational biologist at Carnegie Mellon University who was not involved in the study.

A biotech executive says he prefers AlphaFold to ESMFold for accuracy. “The bottleneck isn’t computation, so faster isn’t better, better is more accurate,” said Chris Bahl, chief scientific officer and co-founder of AI Proteins, a Boston-based startup that uses artificial intelligence tools to develop synthetic proteins.

Dr. Rives said ESMFold is already being used by several academic research groups and biotech companies.

The ESMFold model has been downloaded about 250,000 times per month since its release in 2022, predicting 1,000 protein structures every hour, according to a Meta spokeswoman.

Since AlphaFold was first released in 2021, more than a million researchers and biologists in over 190 countries have used the database to view three million protein structures, according to DeepMind.

“From what we’re seeing right now, protein language models like ESMFold aren’t quite accurate yet, yielding lower accuracy than models like AlphaFold,” said a spokeswoman for DeepMind. “However, we expect that in many cases there will be good predictions in the ESMFold database.”

The predictive models from DeepMind and Meta AI each have their strengths and will lead to new discoveries, according to Andrew Ferguson, co-founder of Chicago-based biotech company Evozyne and associate professor of molecular engineering at the University of Chicago.

“They complement each other,” said Dr. Ferguson, adding that the Meta AI model “was a really elegant idea.”

Evozyne partnered with technology company Nvidia Corp. to develop a proprietary language model that skips predicting a protein’s structure and instead predicts its biological function. Evozyne then used this model to design two proteins, according to a paper posted to a preprint server in January.

Write to Eric Niiler at

Copyright ©2022 Dow Jones & Company, Inc. All rights reserved.
