{"id":124254,"date":"2021-06-25T10:23:45","date_gmt":"2021-06-25T17:23:45","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2021\/06\/google-trains-two-billion-parameter-ai-vision-model"},"modified":"2021-06-25T10:23:45","modified_gmt":"2021-06-25T17:23:45","slug":"google-trains-two-billion-parameter-ai-vision-model","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2021\/06\/google-trains-two-billion-parameter-ai-vision-model","title":{"rendered":"Google Trains Two Billion Parameter AI Vision Model"},"content":{"rendered":"<p><a class=\"aligncenter blog-photo\" href=\"https:\/\/lifeboat.com\/blog.images\/google-trains-two-billion-parameter-ai-vision-model2.jpg\"><\/a><\/p>\n<p>Researchers at <a href=\"https:\/\/research.google\/teams\/brain\/\">Google Brain<\/a> announced a deep-learning computer vision (CV) model containing two billion parameters. The model was trained on three billion images and achieved 90.45% top-1 accuracy on ImageNet, setting a new state-of-the-art record.<\/p>\n<p>The team <a href=\"https:\/\/arxiv.org\/abs\/2106.04560\">described the model and experiments<\/a> in a paper published on arXiv. The model, dubbed ViT-G\/14, is based on Google\u2019s recent work on <a href=\"https:\/\/ai.googleblog.com\/2020\/12\/transformers-for-image-recognition-at.html\">Vision Transformers<\/a> (ViT). ViT-G\/14 outperformed previous state-of-the-art solutions on several benchmarks, including <a href=\"https:\/\/www.image-net.org\/\">ImageNet<\/a>, <a href=\"https:\/\/imagenetv2.org\/\">ImageNet-v2<\/a>, and <a href=\"https:\/\/ai.googleblog.com\/2019\/11\/the-visual-task-adaptation-benchmark.html\">VTAB-1k<\/a>. On the few-shot image recognition task, the accuracy improvement was more than five percentage points. 
The researchers also trained several smaller versions of the model to investigate a scaling law for the architecture, noting that performance follows a power-law function, similar to Transformer models used for natural language processing (NLP) tasks.<\/p>\n<p>First described by Google researchers in 2017, the <a href=\"https:\/\/dl.acm.org\/doi\/10.5555\/3295222.3295349\">Transformer architecture<\/a> has become the leading design for NLP deep-learning models, with OpenAI\u2019s <a href=\"https:\/\/www.infoq.com\/news\/2020\/06\/openai-gpt3-language-model\/\">GPT-3<\/a> being one of the most famous. Last year, OpenAI published a paper describing <a href=\"https:\/\/www.infoq.com\/news\/2020\/04\/scaling-laws-language-models\/\">scaling laws<\/a> for these models. By training many similar models of different sizes and varying the amount of training data and computing power, OpenAI determined a power-law function for estimating a model\u2019s accuracy. In addition, OpenAI found that not only do large models perform better, but they are also more compute-efficient.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Researchers at Google Brain announced a deep-learning computer vision (CV) model containing two billion parameters. The model was trained on three billion images and achieved 90.45% top-1 accuracy on ImageNet, setting a new state-of-the-art record. The team described the model and experiments in a paper published on arXiv. 
The model, dubbed ViT-G\/14, is based on [\u2026]<\/p>\n","protected":false},"author":396,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,1491],"tags":[],"class_list":["post-124254","post","type-post","status-publish","format-standard","hentry","category-robotics-ai","category-transportation"],"_links":{"self":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/124254","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/users\/396"}],"replies":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/comments?post=124254"}],"version-history":[{"count":0,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/124254\/revisions"}],"wp:attachment":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/media?parent=124254"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/categories?post=124254"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/tags?post=124254"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}