Statistical physics is shedding light on how network architecture and data structure shape the effectiveness of neural-network learning.
Machine-learning technologies have profoundly reshaped many technical fields, with sweeping applications in medical diagnosis, customer service, drug discovery, and beyond. Central to this transformation are neural networks (NNs), models that learn patterns from data by combining many simple computational units, or neurons, linked by weighted connections. Acting collectively, these neurons can process data to learn complex input–output relationships. Despite their practical success, the fundamental mechanisms by which NNs learn remain poorly understood at a theoretical level. Statistical physics offers a promising framework for exploring central questions in machine-learning theory, potentially clarifying how learning depends on the layout of the network—the NN architecture—and on the statistics of the data—the data structure (Fig. 1).
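To make this basic picture concrete, here is a minimal sketch of such a network in Python, using only NumPy; the layer sizes, random weights, and tanh nonlinearity are illustrative assumptions, not a model taken from any of the papers discussed here.

```python
import numpy as np

def layer(x, W, b):
    # One layer: weighted connections (W, b) feed each neuron,
    # which applies a simple nonlinearity to its summed input.
    return np.tanh(W @ x + b)

rng = np.random.default_rng(0)

# A toy two-layer network: 3 inputs -> 5 hidden neurons -> 1 output.
# Sizes and random initial weights are arbitrary, purely for illustration.
W1, b1 = rng.normal(size=(5, 3)), np.zeros(5)
W2, b2 = rng.normal(size=(1, 5)), np.zeros(1)

x = rng.normal(size=3)           # an input pattern
hidden = layer(x, W1, b1)        # collective response of the hidden neurons
output = layer(hidden, W2, b2)   # the network's output
```

Training consists of adjusting the weights (here W1, b1, W2, b2) so that the outputs match known targets on example data; questions about architecture and data structure concern how such adjustment plays out.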
Three recent papers in a special Physical Review E collection (see Collection: Statistical Physics Meets Machine Learning — Machine Learning Meets Statistical Physics) provide significant insights into these questions. Francesca Mignacco of the City University of New York and Princeton University and Francesco Mori of the University of Oxford in the UK derived analytical results for the optimal fraction of neurons that should be active at a given time [1]. Abdulkadir Canatar and SueYeon Chung of the Flatiron Institute in New York and New York University investigated how the precision with which a network is “trained” affects the amount of data the NN can reliably decode [2]. Francesco Cagnetta of the International School for Advanced Studies in Italy and colleagues showed that NNs whose structure mirrors that of the data learn faster [3].
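As a loose illustration of the quantity studied in Ref. [1], the sketch below applies a random mask so that only a fraction f of hidden neurons is active at a given time; this dropout-style mask and all parameter choices are assumptions made for illustration, not the analytical setup of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def sparse_layer(x, W, f):
    # Hidden layer in which each neuron is active with probability f,
    # so on average a fraction f of the neurons fires at a given time.
    h = np.tanh(W @ x)
    mask = rng.random(h.shape) < f   # illustrative dropout-style mask
    return h * mask

W = rng.normal(size=(100, 10))   # 100 hidden neurons, 10 inputs (arbitrary)
x = rng.normal(size=10)

for f in (0.1, 0.5, 1.0):
    h = sparse_layer(x, W, f)
    print(f, np.count_nonzero(h))  # roughly f * 100 neurons are active
```

In a setup like this one could train networks at different values of f and compare their learning curves; Ref. [1] instead derives the optimal fraction analytically.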