At a conference at New York University in March, philosopher Raphaël Millière of Columbia University offered yet another jaw-dropping example of what LLMs can do. The models had already demonstrated the ability to write computer code, which is impressive but not too surprising because there is so much code out there on the Internet to mimic. Millière went a step further and showed that GPT can execute code, too, however. The philosopher typed in a program to calculate the 83rd number in the Fibonacci sequence. “It’s multistep reasoning of a very high degree,” he says. And the bot nailed it. When Millière asked directly for the 83rd Fibonacci number, however, GPT got it wrong: this suggests the system wasn’t just parroting the Internet. Rather it was performing its own calculations to reach the correct answer.
Although an LLM runs on a computer, it is not itself a computer. It lacks essential computational elements, such as working memory. In a tacit acknowledgement that GPT on its own should not be able to run code, its inventor, the tech company OpenAI, has since introduced a specialized plug-in—a tool ChatGPT can use when answering a query—that allows it to do so. But that plug-in was not used in Millière’s demonstration. Instead he hypothesizes that the machine improvised a memory by harnessing its mechanisms for interpreting words according to their context—a situation similar to how nature repurposes existing capacities for new functions.
This impromptu ability demonstrates that LLMs develop an internal complexity that goes well beyond a shallow statistical analysis. Researchers are finding that these systems seem to achieve genuine understanding of what they have learned. In one study presented last week at the International Conference on Learning Representations (ICLR), doctoral student Kenneth Li of Harvard University and his AI researcher colleagues—Aspen K. Hopkins of the Massachusetts Institute of Technology, David Bau of Northeastern University, and Fernanda Viégas, Hanspeter Pfister and Martin Wattenberg, all at Harvard—spun up their own smaller copy of the GPT neural network so they could study its inner workings. They trained it on millions of matches of the board game Othello by feeding in long sequences of moves in text form. Their model became a nearly perfect player.