{"id":108115,"date":"2020-06-03T09:03:11","date_gmt":"2020-06-03T16:03:11","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2020\/06\/openai-announces-gpt-3-ai-language-model-with-175-billion-parameters"},"modified":"2020-06-03T09:03:11","modified_gmt":"2020-06-03T16:03:11","slug":"openai-announces-gpt-3-ai-language-model-with-175-billion-parameters","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2020\/06\/openai-announces-gpt-3-ai-language-model-with-175-billion-parameters","title":{"rendered":"OpenAI Announces GPT-3 AI Language Model with 175 Billion Parameters"},"content":{"rendered":"<p><a class=\"aligncenter blog-photo\" href=\"https:\/\/lifeboat.com\/blog.images\/openai-announces-gpt-3-ai-language-model-with-175-billion-parameters2.jpg\"><\/a><\/p>\n<p>A team of researchers from OpenAI recently <a href=\"https:\/\/arxiv.org\/abs\/2005.14165\">published a paper<\/a> describing GPT-3, a deep-learning model for natural-language with 175 billion parameters, 100x more than the previous version, GPT-2. The model is pre-trained on nearly half a trillion words and achieves state-of-the-art performance on several NLP benchmarks without fine-tuning.<\/p>\n<p>In paper published on arXiv, a team of over 30 co-authors described the model and several experiments. The researchers\u2019 goal was to produce an NLP system that performs well on a variety of tasks with little or no fine-tuning, and previous work had indicated that larger models might be the solution. To test that hypothesis, the team increased the size of their previous model, <a href=\"https:\/\/openai.com\/blog\/better-language-models\/\">GPT-2<\/a>, from 1.5 billion parameters to 175 billion. For training, the team collected several datasets, including the <a href=\"https:\/\/commoncrawl.org\/\">Common Crawl<\/a> dataset and the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Main_Page\">English-language Wikipedia<\/a>. The model was evaluated against several NLP benchmarks, matching state-of-the-art performance on \u201cclosed-book\u201d question-answering tasks and setting a new record for the <a href=\"https:\/\/arxiv.org\/abs\/1606.06031v1\">LAMBADA<\/a> language modeling task.<\/p>\n<p>OpenAI made headlines last year with GPT-2 and their decision not to release the 1.5 billion parameter version of the trained model due to \u201cconcerns about malicious applications of the technology.\u201d GPT-2 is one of many large-scale NLP models based on the <a href=\"https:\/\/www.infoq.com\/news\/2020\/02\/google-reformer-deep-learning\/\">Transformer<\/a> architecture. These models are <em>pre-trained<\/em> on large text corpora, such as the contents Wikipedia, using self-supervised learning. In this scenario, instead of using a dataset containing inputs paired with expected outputs, the model is given a sequence of text with words \u201cmasked\u201d and it must learn to predict the masked words based on the surrounding context. After this pre-training, the models are then fine-tuned with a labelled benchmark dataset for a particular NLP task, such as question-answering.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A team of researchers from OpenAI recently published a paper describing GPT-3, a deep-learning model for natural-language with 175 billion parameters, 100x more than the previous version, GPT-2. The model is pre-trained on nearly half a trillion words and achieves state-of-the-art performance on several NLP benchmarks without fine-tuning. 