{"id":174750,"date":"2023-10-26T01:23:46","date_gmt":"2023-10-26T06:23:46","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2023\/10\/human-like-systematic-generalization-through-a-meta-learning-neural-network"},"modified":"2023-10-26T01:23:46","modified_gmt":"2023-10-26T06:23:46","slug":"human-like-systematic-generalization-through-a-meta-learning-neural-network","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2023\/10\/human-like-systematic-generalization-through-a-meta-learning-neural-network","title":{"rendered":"Human-like systematic generalization through a meta-learning neural network"},"content":{"rendered":"<p><a class=\"aligncenter blog-photo\" href=\"https:\/\/lifeboat.com\/blog.images\/human-like-systematic-generalization-through-a-meta-learning-neural-network2.jpg\"><\/a><\/p>\n<p>The power of human language and thought arises from systematic compositionality\u2014the algebraic ability to understand and produce novel combinations from known components. Fodor and Pylyshyn1 famously argued that artificial neural networks lack this capacity and are therefore not viable models of the mind. Neural networks have advanced considerably in the years since, yet the systematicity challenge persists. Here we successfully address Fodor and Pylyshyn\u2019s challenge by providing evidence that neural networks can achieve human-like systematicity when optimized for their compositional skills. To do so, we introduce the meta-learning for compositionality (MLC) approach for guiding training\u2026 More.<\/p>\n<hr>\n<p>Over 35 years ago, when Fodor and Pylyshyn raised the issue of systematicity in neural networks<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 1\" title=\"Fodor, J. A. & Pylyshyn, Z. W. Connectionism and cognitive architecture: a critical analysis. Cognition 28, 3&ndash;71 (1988).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR1\" id=\"ref-link-section-d262338427e1294\">1<\/a><\/sup>, today\u2019s models<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 19\" title=\"Brown, T. B. et al. Language models are few-shot learners. In Proc. Advances in Neural Information Processing Systems 33 (NeurIPS) (eds Larochelle, H. et al.) 1877&ndash;1901 (Curran Associates, 2020).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR19\" id=\"ref-link-section-d262338427e1298\">19<\/a><\/sup> and their language skills were probably unimaginable. As a credit to Fodor and Pylyshyn\u2019s prescience, the systematicity debate has endured. Systematicity continues to challenge models<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Lake, B. M. & Baroni, M. Generalization without systematicity: on the compositional skills of sequence-to-sequence recurrent networks. In Proc. International Conference on Machine Learning (ICML) (eds. Dy, J. & Krause, A.) 2873&ndash;2882 (PMLR, 2018).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR11\" id=\"ref-link-section-d262338427e1302\">11<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Ettinger, A., Elgohary, A., Phillips, C. & Resnik, P. Assessing composition in sentence vector representations. In Proc. 
7th International Conference on Computational Linguistics, (COLING 2018) 1790&ndash;1801 (Association for Computational Linguistics, 2018).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR12\" id=\"ref-link-section-d262338427e1302_1\">12<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Bahdanau, D. et al. CLOSURE: assessing systematic generalization of CLEVR models. In Proc. NAACL Workshop on Visually Grounded Interaction and Language (ViGIL) (2019).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR13\" id=\"ref-link-section-d262338427e1302_2\">13<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Keysers, D. et al. Measuring compositional generalization: a comprehensive method on realistic data. In Proc. International Conference on Learning Representations (ICLR) (2019).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR14\" id=\"ref-link-section-d262338427e1302_3\">14<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Yu, L. & Ettinger, A. Assessing phrasal representation and composition in transformers. In Proc. Conference on Empirical Methods in Natural Language Processing (EMNLP) 4896&ndash;4907 (Association for Computational Linguistics, 2020).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR15\" id=\"ref-link-section-d262338427e1302_4\">15<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Kim, N. & Linzen, T. COGS: a compositional generalization challenge based on semantic interpretation. In Proc. Conference on Empirical Methods in Natural Language Processing (EMNLP) 9087&ndash;9105 (2020).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR16\" id=\"ref-link-section-d262338427e1302_5\">16<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Hupkes, D., Dankers, V., Mul, M. & Bruni, E. Compositionality decomposed: how do neural networks generalize? J. Artif. Int. Res. 67757&ndash;795 (2020).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR17\" id=\"ref-link-section-d262338427e1302_6\">17<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Press, O. et al. Measuring and narrowing the compositionality gap in language models. Preprint at https:\/\/arxiv.org\/abs\/2210.03350 (2022).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR18\" id=\"ref-link-section-d262338427e1305\">18<\/a><\/sup> and motivates new frameworks<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Collins, A. G. E. & Frank, M. J. Cognitive control over learning: creating, clustering, and generalizing task-set structure. Psychol. Rev. 120190&ndash;229 (2013).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR34\" id=\"ref-link-section-d262338427e1309\">34<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Chen, X., Liang, C., Yu, A. W., Song, D. & Zhou, D. 
Compositional generalization via neural-symbolic stack machines. In Proc. Advances in Neural Information Processing Systems 33 (eds Larochelle, H. et al.) 1690&ndash;1701 (Curran Associates, 2020).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR35\" id=\"ref-link-section-d262338427e1309_1\">35<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Russin, J., Jo, J., O\u2019Reilly, R. C. & Bengio, Y. Systematicity in a recurrent neural network by factorizing syntax and semantics. In Proc. 42nd Annual Meeting of the Cognitive Science Society (eds Denison, S. et al.) (Cognitive Science Society. 2020).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR36\" id=\"ref-link-section-d262338427e1309_2\">36<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Liu, Q. et al. Compositional generalization by learning analytical expressions. Adv. Neural Inf. Proces. Syst. 33, 11416&ndash;1142 (2020).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR37\" id=\"ref-link-section-d262338427e1309_3\">37<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Nye, M. I., Solar-Lezama, A., Tenenbaum, J. B. & Lake, B. M. Learning compositional rules via neural program synthesis. In Proc. Advances in Neural Information Processing Systems (NeurIPS) 33 (eds Larochelle, H. et al.) (Curran Associates, 2020).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR38\" id=\"ref-link-section-d262338427e1309_4\">38<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Singh, G., Deng, F. & Ahn, S. Illiterate DALL-E learns to compose. In Proc. ICLR https:\/\/openreview.net\/group?id=ICLR.cc\/2022\/Conference (2022).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR39\" id=\"ref-link-section-d262338427e1309_5\">39<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Smolensky, P., McCoy, R. T., Fernandez, R., Goldrick, M. & Gao, J. Neurocompositional computing: from the central paradox of cognition to a new generation of AI systems. AI Mag. (2022).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR40\" id=\"ref-link-section-d262338427e1309_6\">40<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 41\" title=\"Zhou, D. et al. Least-to-most prompting enables complex reasoning in large language models. In Proc. ICLR https:\/\/openreview.net\/group?id=ICLR.cc\/2023\/Conference (2023).\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#ref-CR41\" id=\"ref-link-section-d262338427e1312\">41<\/a><\/sup>. Preliminary experiments reported in Supplementary Information <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"https:\/\/www.nature.com\/articles\/s41586-023-06668-3#MOESM1\">3<\/a> suggest that systematicity is still a challenge, or at the very least an open question, even for recent large language models such as GPT-4. 
To resolve the debate, and to understand whether neural networks can capture human-like compositional skills, we must compare humans and machines side-by-side, as in this Article and other recent work<sup>7,42,43<\/sup>. In our experiments, we found that the most common human responses were algebraic and systematic in exactly the ways that Fodor and Pylyshyn<sup>1<\/sup> discuss. However, people also relied on inductive biases that sometimes support the algebraic solution and sometimes deviate from it; indeed, people are not purely algebraic machines<sup>3,6,7<\/sup>. We showed how MLC enables a standard neural network optimized for its compositional skills to mimic or exceed human systematic generalization in a side-by-side comparison. MLC shows much stronger systematicity than neural networks trained in standard ways, and shows more nuanced behaviour than pristine symbolic models. MLC also allows neural networks to tackle other existing challenges, including making systematic use of isolated primitives<sup>11,16<\/sup> and using mutual exclusivity to infer meanings<sup>44<\/sup>.<\/p>
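\n<p>To make the episode structure concrete, here is a minimal sketch (hypothetical words, outputs and function names, not the authors\u2019 released code) of how a few-shot compositional episode can be flattened into a single source sequence for a standard sequence-to-sequence transformer, which is then trained to emit the query\u2019s output:<\/p>\n<pre>
# Minimal sketch of an MLC-style episode (hypothetical names and data).
# The study examples and the query are concatenated into one source
# sequence, so the network must infer each word's meaning in context.

def build_episode(study_examples, query):
    source = []
    for command, outputs in study_examples:
        source += command.split() + ['->'] + outputs + ['|']
    source += query.split()  # the query whose output must be produced
    return source

study = [
    ('dax', ['RED']),          # a novel primitive word...
    ('wif', ['GREEN']),
    ('dax fep', ['RED'] * 3),  # ...and a novel function applied to it
]
query = 'wif fep'              # systematic answer: GREEN GREEN GREEN

print(build_episode(study, query))
<\/pre>\n<p>Because each training episode pairs fresh words with fresh meanings, the network is rewarded for acquiring the compositional skill itself rather than memorizing any particular vocabulary.<\/p>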
\n<p>Our use of MLC for behavioural modelling relates to other approaches for reverse engineering human inductive biases. Bayesian approaches enable a modeller to evaluate different representational forms and parameter settings for capturing human behaviour, as specified through the model\u2019s prior<sup>45<\/sup>. These priors can also be tuned with behavioural data through hierarchical Bayesian modelling<sup>46<\/sup>, although the resulting set-up can be restrictive. MLC shows how meta-learning can be used like hierarchical Bayesian models for reverse-engineering inductive biases (see ref. 47 for a formal connection), although with the aid of neural networks for greater expressive power. Our research adds to a growing literature, reviewed previously<sup>48<\/sup>, on using meta-learning for understanding human<sup>49,50,51<\/sup> or human-like behaviour<sup>52,53,54<\/sup>. In our experiments, only MLC closely reproduced human behaviour with respect to both systematicity and biases, with the MLC (joint) model best navigating the trade-off between these two blueprints of human linguistic behaviour. Furthermore, MLC derives its abilities through meta-learning, where both systematic generalization and the human biases are not inherent properties of the neural network architecture but, instead, are induced from data.<\/p>
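\n<p>A rough sketch of how behavioural data could enter the meta-training signal, illustrating the idea of jointly optimizing for systematicity and human biases under assumed data structures (this is not the paper\u2019s implementation):<\/p>\n<pre>
# Hypothetical sketch: mix algebraic targets with targets sampled from
# human responses, so the meta-learner absorbs human inductive biases.
import random
from collections import namedtuple

Episode = namedtuple('Episode', ['query', 'algebraic_answer'])

def sample_target(episode, human_responses, p_behaviour=0.5):
    # With probability 1 - p_behaviour, train on the fully systematic
    # answer; otherwise train on a response drawn from participants.
    if random.random() > p_behaviour:
        return episode.algebraic_answer
    return random.choice(human_responses[episode.query])

ep = Episode(query='wif fep', algebraic_answer=['GREEN'] * 3)
human = {'wif fep': [['GREEN'] * 3, ['GREEN']]}  # mostly, not always, systematic
print(sample_target(ep, human))
<\/pre>\n<p>Tuning such a mixture is one way to navigate the trade-off between the two blueprints of behaviour described above.<\/p>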
\n<p>Despite its successes, MLC does not solve every challenge raised in Fodor and Pylyshyn<sup>1<\/sup>. MLC does not automatically handle unpractised forms of generalization or concepts outside the meta-learning distribution, reducing the scope of entirely novel structures it can correctly process (compare the encouraging results on learning novel rules reported in Supplementary Information 1 with its failure on the SCAN and COGS productivity splits). Moreover, MLC fails to generalize to nuances in inductive biases that it was not optimized for, as we explore further through an additional behavioural and modelling experiment in Supplementary Information 2. In the language of machine learning, we conclude that the meta-learning strategy succeeds when generalization makes a new episode in-distribution with respect to the training episodes, even when the specific test items are out-of-distribution with respect to the study examples in the episode. However, meta-learning alone will not allow a standard network to generalize to episodes that are in turn out-of-distribution with respect to the ones presented during meta-learning. The current architecture also lacks a mechanism for emitting new symbols<sup>2<\/sup>, although new symbols introduced through the study examples could be emitted through an additional pointer mechanism<sup>55<\/sup>. Last, MLC is untested on the full complexity of natural language and on other modalities; therefore, whether it can achieve human-like systematicity, in all respects and from realistic training experience, remains to be determined. Nevertheless, our use of standard transformers will aid MLC in tackling a wider range of problems at scale. For example, a large language model could receive specialized meta-training<sup>56<\/sup>, optimizing its compositional skills by alternating between standard training (next-word prediction) and MLC meta-training that continually introduces novel words and explicitly improves systematicity (Fig. 1). For vision problems, an image classifier or generator could similarly receive specialized meta-training (through current prompt-based procedures<sup>57<\/sup>) to learn how to systematically combine object features or multiple objects with relations.<\/p>
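\n<p>The suggested alternation between ordinary language-model training and compositional meta-training could look roughly like the loop below; this is a sketch assuming a generic model interface, with method names that are placeholders rather than any real library\u2019s API:<\/p>\n<pre>
# Hypothetical training loop: interleave next-word prediction on a text
# corpus with MLC-style episodes that keep introducing novel words.
def train(model, corpus_batches, make_episode, steps=10000):
    for step in range(steps):
        if step % 2 == 0:
            loss = model.next_word_loss(next(corpus_batches))
        else:
            episode = make_episode()  # fresh words, fresh compositional rules
            loss = model.seq2seq_loss(episode.source, episode.target)
        model.update(loss)  # for example, one optimizer step on the loss
<\/pre>\n<p>Whether such a schedule improves systematicity at the scale of a large language model remains, as the discussion above suggests, to be determined.<\/p>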
\n","protected":false},"excerpt":{"rendered":"<p>The power of human language and thought arises from systematic compositionality\u2014the algebraic ability to understand and produce novel combinations from known components. Fodor and Pylyshyn<sup>1<\/sup> famously argued that artificial neural networks lack this capacity and are therefore not viable models of the mind. 
Neural networks have advanced considerably in the years since, yet the systematicity [\u2026]<\/p>\n","protected":false},"author":359,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[],"class_list":["post-174750","post","type-post","status-publish","format-standard","hentry","category-robotics-ai"],"_links":{"self":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/174750","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/users\/359"}],"replies":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/comments?post=174750"}],"version-history":[{"count":0,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/174750\/revisions"}],"wp:attachment":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/media?parent=174750"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/categories?post=174750"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/tags?post=174750"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}