This approach significantly improves performance on Atari video games and on a range of other tasks in which each decision can lead to several possible outcomes.
“They basically asked what happens if rather than just learning average rewards for certain actions, the algorithm learns the whole distribution, and they found it improved performance significantly,” explained Professor Drugowitsch.
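To make the contrast in that quote concrete, here is a minimal Python sketch. It is an illustration only, not the algorithm used in the Atari work or in the study: the function names, learning rates, quantile levels, and the two toy reward streams are all assumptions made for this example. One learner tracks only the average reward, while the other uses a simple quantile-style update to track several points of the reward distribution.

```python
import random


def learn_mean(rewards, lr=0.01):
    """Track only the average reward with an incremental update."""
    estimate = 0.0
    for r in rewards:
        estimate += lr * (r - estimate)
    return estimate


def learn_quantiles(rewards, taus=(0.1, 0.5, 0.9), lr=0.1):
    """Track one estimate per quantile level tau (hypothetical levels)."""
    estimates = [0.0] * len(taus)
    for r in rewards:
        for i, tau in enumerate(taus):
            # Move up with weight tau when the reward exceeds the estimate,
            # otherwise move down with weight (1 - tau).
            if r > estimates[i]:
                estimates[i] += lr * tau
            else:
                estimates[i] -= lr * (1 - tau)
    return estimates


# Two toy actions with the same average reward (assumed values):
# "safe" always pays 5, "risky" pays 10 or 0 with equal probability.
rng = random.Random(0)
safe = [5.0 for _ in range(5000)]
risky = [10.0 if rng.random() < 0.5 else 0.0 for _ in range(5000)]

print("mean      safe:", learn_mean(safe), " risky:", learn_mean(risky))
print("quantiles safe:", learn_quantiles(safe))
print("quantiles risky:", learn_quantiles(risky))
```

In this toy setup the two averages come out close to each other, so a learner that keeps only the mean can barely tell the actions apart. The quantile estimates separate them: they stay bunched together for the safe action and spread toward 0 and 10 for the risky one.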
In the latest study, Drugowitsch collaborated with Naoshige Uchida, a professor of molecular and cellular biology at Harvard University. Their goal was to better understand how the brain weighs the potential risks and rewards of a decision.