Microsoft’s Massive New Language AI Is Triple the Size of OpenAI’s GPT-3

Microsoft’s blog post on Megatron-Turing says the algorithm is skilled at tasks like completion prediction, reading comprehension, commonsense reasoning, natural language inference, and word sense disambiguation. But stay tuned: more skills will likely join that list once the model is widely used.

GPT-3 turned out to have capabilities beyond what its creators anticipated, like writing code, doing math, translating between languages, and autocompleting images (oh, and writing a short film with a twist ending). This led some to speculate that GPT-3 might be the gateway to artificial general intelligence. But the algorithm’s talents, while unexpected, mostly fell within the language domain (including programming languages), so that speculation is a bit of a stretch.

However, given the tricks GPT-3 had up its sleeve with 175 billion parameters, it’s intriguing to wonder how the Megatron-Turing model may surprise us with 530 billion. The algorithm likely won’t be commercially available for some time, so it’ll be a while before we find out.
