{"id":123371,"date":"2021-06-03T19:22:20","date_gmt":"2021-06-04T02:22:20","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2021\/06\/chinas-gigantic-multi-modal-ai-is-no-one-trick-pony"},"modified":"2021-06-03T19:22:20","modified_gmt":"2021-06-04T02:22:20","slug":"chinas-gigantic-multi-modal-ai-is-no-one-trick-pony","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2021\/06\/chinas-gigantic-multi-modal-ai-is-no-one-trick-pony","title":{"rendered":"China\u2019s gigantic multi-modal AI is no one-trick pony"},"content":{"rendered":"<p><a class=\"aligncenter blog-photo\" href=\"https:\/\/lifeboat.com\/blog.images\/chinas-gigantic-multi-modal-ai-is-no-one-trick-pony.jpg\"><\/a><\/p>\n<p>When <a href=\" https:\/\/www.engadget.com\/tag\/OpenAI\">Open AI\u2019s<\/a> GPT-3 model made its debut in May of 2020, its performance was widely considered to be the literal state of the art. Capable of generating text indiscernible from human-crafted prose, GPT-3 set a new standard in deep learning. But oh what a difference a year makes. Researchers from the <a href=\" https:\/\/www.baai.ac.cn\/\" target=\"_blank\">Beijing Academy of Artificial Intelligence<\/a> announced on Tuesday the release of their own generative deep learning model, Wu Dao, a mammoth AI seemingly capable of doing everything GPT-3 can do, and more.<\/p>\n<p>First off, Wu Dao is flat out enormous. It\u2019s been trained on 1.75 trillion parameters (<a href=\" https:\/\/towardsdatascience.com\/neural-networks-parameters-hyperparameters-and-optimization-strategies-3f0842fac0a5\" target=\"_blank\">essentially, the model\u2019s self-selected coefficients<\/a>) which is a full ten times larger than the 175 billion GPT-3 was trained on and 150 billion parameters larger than Google\u2019s <a href=\" https:\/\/towardsdatascience.com\/the-switch-transformer-59f3854c7050\" target=\"_blank\">Switch Transformers<\/a>.<\/p>\n<p>In order to train a model on this many parameters and do so quickly \u2014 Wu Dao 2.0 arrived just three months after <a href=\" https:\/\/medium.com\/syncedreview\/chinas-gpt-3-baai-introduces-superscale-intelligence-model-wu-dao-1-0-98a573fc4d70\" target=\"_blank\">version 1.0\u2019s release in March<\/a> \u2014 the BAAI researchers first developed an open-source learning system akin to Google\u2019s <a href=\" https:\/\/research.google\/pubs\/pub45929\/\" target=\"_blank\">Mixture of Experts<\/a>, dubbed <a href=\" https:\/\/arxiv.org\/abs\/2103.13262\" target=\"_blank\">FastMoE<\/a>. This system, which is operable on <a href=\" https:\/\/www.engadget.com\/2018-05-02-facebook-open-source-ai-pytorch-tools-development.html\">PyTorch<\/a>, enabled the model to be trained both on clusters of supercomputers and conventional GPUs. This gave FastMoE more flexibility than Google\u2019s system since FastMoE doesn\u2019t require proprietary hardware like Google\u2019s TPUs and can therefore run on off-the-shelf hardware \u2014 supercomputing clusters notwithstanding.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When Open AI\u2019s GPT-3 model made its debut in May of 2020, its performance was widely considered to be the literal state of the art. Capable of generating text indiscernible from human-crafted prose, GPT-3 set a new standard in deep learning. But oh what a difference a year makes. 
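FastMoE builds on the mixture-of-experts idea: a gating network routes each token to one (or a few) of many expert sub-networks, so total parameters grow with the number of experts while per-token compute stays roughly flat. Below is a deliberately simplified, hypothetical top-1 ("switch"-style) MoE layer in PyTorch; it illustrates the routing pattern only and is not FastMoE's actual implementation (see https://arxiv.org/abs/2103.13262 for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoE(nn.Module):
    """Toy mixture-of-experts layer with top-1 routing.

    A gating network scores each token, and each token is dispatched to
    the single highest-scoring expert. Illustrative only -- not FastMoE.
    """

    def __init__(self, d_model: int, d_hidden: int, num_experts: int):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.ReLU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        gate_probs = F.softmax(self.gate(x), dim=-1)  # (tokens, experts)
        weight, expert_idx = gate_probs.max(dim=-1)   # top-1 expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i                    # tokens routed to expert i
            if mask.any():
                # Scale by the gate weight so routing stays differentiable.
                out[mask] = weight[mask, None] * expert(x[mask])
        return out

# Usage: 8 experts' worth of parameters, one expert's compute per token.
moe = Top1MoE(d_model=64, d_hidden=256, num_experts=8)
tokens = torch.randn(10, 64)
print(moe(tokens).shape)  # torch.Size([10, 64])
```

Eight experts here mean eight full feed-forward networks' worth of parameters, but each token runs through only one of them; that asymmetry is what lets mixture-of-experts models reach trillion-parameter scale without trillion-parameter per-token compute.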