Machine learning models are designed to take in data, find patterns or relationships within those data, and use what they have learned to make predictions or create new content. The quality of those outputs depends not only on the details of a model’s inner workings but also, crucially, on the information fed into the model.
Some models follow a brute-force approach, essentially feeding every bit of data related to a particular problem into the model and seeing what comes out. But a sleeker, less energy-hungry approach is to determine which variables are vital to the outcome and provide the model with information about only those key variables.
Now, Adrián Lozano-Durán, an associate professor of aerospace at Caltech and a visiting professor at MIT, and MIT graduate student Yuan Yuan have developed a theorem that takes any number of possible variables and whittles them down, leaving only those that are most important. In the process, the method removes all units, such as meters and feet, from the underlying equations, making them dimensionless, a property scientists require of equations that describe the physical world. The work can be applied not only to machine learning but to any mathematical model.
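A rough sense of how stripping units can shrink a variable list comes from classical dimensional analysis, known as the Buckingham Pi theorem. The sketch below is a minimal illustration of that classical idea only, not the authors’ new theorem; the pendulum example, the variable names, and the null-space computation via sympy are assumptions chosen for illustration.

```python
# A minimal sketch of classical dimensional analysis (Buckingham Pi),
# showing how removing units collapses several variables into fewer,
# dimensionless ones. Illustrative example: a pendulum's period t,
# length l, gravitational acceleration g, and mass m.
from sympy import Matrix

# Each column holds one variable's exponents in the base units
# (mass M, length L, time T). Columns: t, l, g, m.
dim_matrix = Matrix([
    [0, 0,  0, 1],   # mass exponents
    [0, 1,  1, 0],   # length exponents
    [1, 0, -2, 0],   # time exponents
])

# Each null-space vector of this matrix gives exponents whose combination
# of variables carries no units at all: a dimensionless group.
for vec in dim_matrix.nullspace():
    print(dict(zip(["t", "l", "g", "m"], vec)))
# Prints {'t': 2, 'l': -1, 'g': 1, 'm': 0}: the lone dimensionless group is
# t**2 * g / l. Mass drops out entirely, so a model of the pendulum's period
# needs one dimensionless input instead of four raw ones.
```

In this classical setting, the null space of the unit-exponent matrix enumerates every independent dimensionless combination, which is why the reduced variable count is guaranteed rather than guessed.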