
Google DeepMind researchers have discovered 2.2mn crystal structures that open potential progress in fields from renewable energy to advanced computation, and show the power of artificial intelligence to discover novel materials.

The trove of theoretically stable but experimentally unrealised combinations, identified using an AI tool known as GNoME, is more than 45 times larger than the number of such substances unearthed in the history of science, according to a paper published in Nature on Wednesday.

The researchers plan to make 381,000 of the most promising structures available to fellow scientists to make and test their viability in fields from solar cells to superconductors. The venture underscores how harnessing AI can shortcut years of experimental graft — and potentially deliver improved products and processes.

AI tool GNoME finds 2.2 million new crystals, including 380,000 stable materials that could power future technologies.

Modern technologies, from computer chips and batteries to solar panels, rely on inorganic crystals. To enable new technologies, crystals must be stable; otherwise they can decompose. Behind each new, stable crystal can be months of painstaking experimentation.

Today, in a paper published in Nature, we share the discovery of 2.2 million new crystals – equivalent to nearly 800 years’ worth of knowledge. We introduce Graph Networks for Materials Exploration (GNoME), our new deep learning tool that dramatically increases the speed and efficiency of discovery by predicting the stability of new materials.

“In your machine-learning project, how much time will you typically spend on data preparation and transformation?” asks a 2022 Google course on the Foundations of Machine Learning (ML). The two choices offered are either “Less than half the project time” or “More than half the project time.” If you guessed the latter, you would be correct; Google states that it takes over 80 percent of project time to format the data, and that’s not even taking into account the time needed to frame the problem in machine-learning terms.

“It would take many weeks of effort to figure out the appropriate model for our dataset, and this is a really prohibitive step for a lot of folks that want to use machine learning or biology,” says Jacqueline Valeri, a fifth-year PhD student of biological engineering in Collins’s lab who is first co-author of the paper.

BioAutoMATED is an automated machine-learning system that can select and build an appropriate model for a given dataset and even take care of the laborious task of data preprocessing, whittling down a months-long process to just a few hours. Automated machine-learning (AutoML) systems are still at a relatively nascent stage of development; current usage is focused primarily on image and text recognition, and the systems remain largely unused in subfields of biology, points out first co-author and Jameel Clinic postdoc Luis Soenksen PhD ’20.
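
To make the idea of automated model selection concrete, here is a minimal Python sketch of what an AutoML-style loop does in principle: preprocess the data, fit several candidate models, score each by cross-validation, and keep the best one. This is not BioAutoMATED's code or interface. The candidate models, the function names (`select_model`, `cv_error`, `standardize`) and the single-feature setup are all assumptions chosen for illustration.

```python
from statistics import mean

# Hypothetical candidate models. Each takes training data and returns
# a prediction function. None of these names come from BioAutoMATED.

def fit_mean(xs, ys):
    """Baseline: always predict the average training target."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Ordinary least squares for a single feature."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_nearest(xs, ys):
    """1-nearest-neighbour: copy the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

CANDIDATES = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}

def standardize(xs):
    """Preprocessing step: rescale the feature to zero mean, unit variance."""
    m = mean(xs)
    sd = (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5 or 1.0
    return [(x - m) / sd for x in xs]

def cv_error(fit, xs, ys, k=5):
    """Mean squared error over k interleaved cross-validation folds."""
    n = len(xs)
    errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        predict = fit(train_x, train_y)
        errors.extend((predict(xs[i]) - ys[i]) ** 2 for i in test_idx)
    return mean(errors)

def select_model(xs, ys, k=5):
    """Preprocess, score every candidate, and return the best name and all scores."""
    xs = standardize(xs)
    scores = {name: cv_error(fit, xs, ys, k) for name, fit in CANDIDATES.items()}
    best = min(scores, key=scores.get)
    return best, scores

if __name__ == "__main__":
    # Toy data: a noisy straight line (illustrative only).
    xs = [float(i) for i in range(40)]
    ys = [3.0 * x + 2.0 + 0.1 * ((i % 3) - 1) for i, x in enumerate(xs)]
    best, scores = select_model(xs, ys)
    print("selected:", best)
    for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
        print(f"  {name}: cross-validated MSE = {score:.4f}")
```

A real AutoML system searches far larger spaces of architectures and hyperparameters, but the select-by-validation-score loop is the same basic pattern.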