While early language models could only process text, contemporary large language models now perform highly diverse tasks on different types of data. For instance, LLMs can understand many languages, generate computer code, solve math problems, or answer questions about images and audio.
MIT researchers probed the inner workings of LLMs to better understand how they process such assorted data, and found evidence that the models share some similarities with the human brain.
Neuroscientists believe the human brain has a “semantic hub” in the anterior temporal lobe that integrates semantic information from various modalities, like visual data and tactile inputs. This semantic hub is connected to modality-specific “spokes” that route information to the hub.

The MIT researchers found that LLMs use a similar mechanism by abstractly processing data from diverse modalities in a central, generalized way. For instance, a model that has English as its dominant language would rely on English as a central medium to process inputs in Japanese or to reason about arithmetic, computer code, and other data types.

Furthermore, the researchers demonstrate that they can intervene in a model’s semantic hub by using text in the model’s dominant language to change its outputs, even when the model is processing data in other languages.
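One way to get an intuition for this finding (a minimal sketch, not the researchers’ own code or exact method) is a “logit lens”-style probe: decode each layer’s hidden state through the model’s output embedding and check whether an English-dominant model’s middle layers surface English-like tokens even when the input is in another language. The model name, Japanese prompt, and the "model.norm" submodule path below are illustrative assumptions for a Llama-style model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumptions: an English-dominant, Llama-style causal LM; the model name,
# prompt, and final-norm path are illustrative, not taken from the paper.
model_name = "meta-llama/Llama-2-7b-hf"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

prompt = "猫はミルクが好きです。"  # Japanese input: "Cats like milk."
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs)

unembed = model.get_output_embeddings().weight    # (vocab_size, hidden_dim)
final_norm = model.get_submodule("model.norm")    # Llama-style final RMSNorm (assumed path)
last = inputs["input_ids"].shape[1] - 1           # final token position

# Decode each layer's hidden state at the last position through the output
# embedding; in an English-dominant model, middle layers often map most
# strongly onto English tokens even for non-English input.
for layer, h in enumerate(out.hidden_states):
    logits = final_norm(h[0, last]) @ unembed.T
    top_token = tok.decode([int(logits.argmax())])
    print(f"layer {layer:2d}: {top_token!r}")
```

A probe along these lines only inspects intermediate representations; the intervention described above would additionally involve nudging those representations using text in the model’s dominant language, and the researchers’ exact procedure may differ from this sketch.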