Optical computing has emerged as a powerful approach for high-speed, energy-efficient information processing. Diffractive optical networks, in particular, enable large-scale parallel computation using passive, structured phase masks and free-space light propagation. However, one major challenge remains: systems trained in model-based simulations often fail to perform optimally in real experimental settings, where misalignments, noise, and model inaccuracies are difficult to capture.
In a new paper published in Light: Science & Applications, researchers at the University of California, Los Angeles (UCLA) introduce a model-free in situ training framework for diffractive optical processors, driven by proximal policy optimization (PPO), a reinforcement learning algorithm known for its stability and sample efficiency. Rather than relying on a digital twin or an approximate physical model, the system learns directly from real optical measurements, optimizing its diffractive features on the hardware itself.
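To make the idea concrete, the sketch below shows how such a PPO-driven, experiment-in-the-loop optimization could look in principle. It is not the authors' implementation: the hardware interface `measure_reward`, the feature count `N`, and all hyperparameters are hypothetical placeholders, and the optical readout is replaced by a toy objective so the example runs on its own.

```python
# Minimal sketch (assumptions noted in comments) of PPO-style in situ
# optimization of a diffractive phase mask. `measure_reward` stands in for
# displaying a candidate mask on the physical processor and reading out a
# scalar performance score.

import numpy as np

rng = np.random.default_rng(0)
N = 64 * 64          # trainable phase features per layer (assumed size)
SIGMA = 0.05         # fixed exploration std in radians (assumption)
CLIP_EPS = 0.2       # standard PPO clipping parameter
LR = 0.01            # learning rate for the policy mean
BATCH = 32           # candidate masks measured per update
EPOCHS = 4           # PPO epochs reusing the same measurements

def measure_reward(phases: np.ndarray) -> float:
    """Hypothetical hardware call: run optical inference with `phases`
    and return a scalar score (e.g., negative loss on a probe batch).
    Replaced here by a toy quadratic objective for runnability."""
    target = np.linspace(0, np.pi, phases.size)
    return -np.mean((phases - target) ** 2)

mu = np.zeros(N)  # policy mean = current phase-mask parameters

for update in range(200):
    # 1) Sample candidate masks from a Gaussian policy and measure each one.
    actions = mu + SIGMA * rng.standard_normal((BATCH, N))
    rewards = np.array([measure_reward(a) for a in actions])
    # Log-probabilities under the sampling (old) policy, up to a constant.
    logp_old = -np.sum((actions - mu) ** 2, axis=1) / (2 * SIGMA**2)
    # 2) Advantages: normalized rewards (no critic in this bandit setting).
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # 3) PPO clipped-surrogate updates, reusing the same real measurements.
    for _ in range(EPOCHS):
        logp_new = -np.sum((actions - mu) ** 2, axis=1) / (2 * SIGMA**2)
        ratio = np.exp(logp_new - logp_old)
        clipped = np.clip(ratio, 1 - CLIP_EPS, 1 + CLIP_EPS)
        # Gradient of min(ratio*adv, clipped*adv) w.r.t. mu is zero
        # wherever the clipped term is active and saturated.
        active = ratio * adv <= clipped * adv
        grad_logp = (actions - mu) / SIGMA**2            # shape (BATCH, N)
        grad = np.mean((active * ratio * adv)[:, None] * grad_logp, axis=0)
        mu += LR * grad  # gradient ascent on the surrogate objective
```

One reason PPO suits this setting is visible in the inner loop: the clipped objective lets each batch of physical measurements be reused over several update epochs, keeping the number of hardware readouts per parameter update small.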
“Instead of trying to simulate complex optical behavior perfectly, we allow the device to learn from experience or experiments,” said Aydogan Ozcan, Chancellor’s Professor of Electrical and Computer Engineering at UCLA and the corresponding author of the study. “PPO makes this in situ process fast, stable, and scalable to realistic experimental conditions.”