A new artificial neural-network architecture opens a window into the workings of a tool previously regarded as a black box.
Thanks to the extremely large datasets and computing power that have become available in recent years, a new paradigm in scientific discovery has emerged. This new approach is purely data driven, using large amounts of data to train machine-learning models, typically neural networks, to predict the behavior of the natural world [1]. The most prominent achievement of this new methodology has arguably been the AlphaFold model for predicting protein folding (see Research News: Chemistry Nobel Awarded for an AI System That Predicts Protein Structures) [2]. But despite such successes, these data-driven approaches suffer from a major drawback: They are generally “black boxes” that offer no human-accessible understanding of how they make their predictions. This shortcoming also extends to the models’ inputs: It is often desirable to incorporate existing domain knowledge into these models, but the purely data-driven approach excludes that option.