As we listen to speech, our brains actively compute the meaning of individual words. Inspired by the success of large language models (LLMs), we hypothesized that the brain employs vectorial coding principles, such that meaning is reflected in the distributed activity of single neurons. We recorded responses of hundreds of neurons in the human hippocampus, which has a well-established role in semantic coding, while participants listened to narrative speech. We find encoding of contextual word meaning in the simultaneous activity of neurons whose individual selectivities span multiple unrelated semantic categories. Like embedding vectors in semantic models, the distance between neural population responses correlates with semantic distance; however, this effect was observed only for contextual embedding models (such as BERT) and was reversed for non-contextual embedding models (such as Word2Vec), suggesting that the semantic distance effect depends critically on contextualization. Moreover, for the subset of highly semantically similar words, even contextual embedders showed an inverse correlation between semantic and neural distances; we attribute this pattern to the noise-mitigating benefits of contrastive coding. Finally, in further support of the critical role of context, we find that the range of neural responses covaries with lexical polysemy. Ultimately, these results support the hypothesis that semantic coding in the hippocampus follows vectorial principles.
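The comparison between neural and semantic distances can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study. It assumes one population response vector per word (for example, firing rates across neurons) and one embedding vector per word, and it assumes Euclidean distance for neural responses, cosine distance for embeddings, and a rank (Spearman-style) correlation. All function names are hypothetical.

```python
import numpy as np


def pairwise_distances(x, metric):
    """Return the distances between all unique pairs of rows of x."""
    x = np.asarray(x, dtype=float)
    if metric == "euclidean":
        diff = x[:, None, :] - x[None, :, :]
        d = np.sqrt((diff ** 2).sum(axis=-1))
    elif metric == "cosine":
        unit = x / np.linalg.norm(x, axis=1, keepdims=True)
        d = 1.0 - unit @ unit.T
    else:
        raise ValueError(f"unknown metric: {metric}")
    # Keep each unordered pair once (upper triangle, no diagonal).
    i, j = np.triu_indices(x.shape[0], k=1)
    return d[i, j]


def ordinal_ranks(v):
    """Ordinal ranks of v; ties are broken by position (assumed rare here)."""
    order = np.argsort(v, kind="stable")
    ranks = np.empty(len(v), dtype=float)
    ranks[order] = np.arange(len(v))
    return ranks


def neural_semantic_correlation(neural, embeddings):
    """Rank correlation between neural and semantic pairwise distances.

    neural:     (n_words, n_neurons) array, one response vector per word
    embeddings: (n_words, n_dims) array, one embedding vector per word
    Assumed metrics: Euclidean (neural), cosine (embeddings).
    """
    neural_d = pairwise_distances(neural, "euclidean")
    semantic_d = pairwise_distances(embeddings, "cosine")
    return np.corrcoef(ordinal_ranks(neural_d), ordinal_ranks(semantic_d))[0, 1]
```

Under these assumptions, a positive value means that words farther apart in embedding space also evoke more dissimilar population responses, and a negative value corresponds to the reversed relationship. Running the function once with contextual embeddings and once with non-contextual embeddings for the same words would give the two quantities being contrasted.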
The authors have declared no competing interest.