Serving GPT-2 at scale – Lifeboat News: The Blog

Over the last few years, the size of deep learning models has increased at an exponential pace (famously among language models):

And in fact, this chart is out of date. As of this month, OpenAI has announced GPT-3, a 175-billion-parameter model, which is roughly ten times the height of this chart.

As models grow larger, they introduce new infrastructure challenges. For my colleagues and me building Cortex (open-source model-serving infrastructure), these challenges are front and center, especially as the number of users deploying large models to production increases.