{"id":164397,"date":"2023-05-23T07:26:37","date_gmt":"2023-05-23T12:26:37","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2023\/05\/metas-open-source-speech-ai-recognizes-over-4000-spoken-languages"},"modified":"2023-05-23T07:26:37","modified_gmt":"2023-05-23T12:26:37","slug":"metas-open-source-speech-ai-recognizes-over-4000-spoken-languages","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2023\/05\/metas-open-source-speech-ai-recognizes-over-4000-spoken-languages","title":{"rendered":"Meta\u2019s open-source speech AI recognizes over 4,000 spoken languages"},"content":{"rendered":"<p><a class=\"aligncenter blog-photo\" href=\"https:\/\/lifeboat.com\/blog.images\/metas-open-source-speech-ai-recognizes-over-4000-spoken-languages2.jpg\"><\/a><\/p>\n<p>Meta has created an AI language model that (in a refreshing change of pace) isn\u2019t a <a data-i13n=\"cpos:1;pos:1\" href=\"https:\/\/www.engadget.com\/chatgpt-is-suddenly-everywhere-are-we-ready-180031821.html\" data-ylk=\"elm: context_link;cpos:1;pos:1;itc:0\">ChatGPT<\/a> clone. The company\u2019s Massively Multilingual Speech (MMS) project can recognize over 4,000 spoken languages and produce speech (text-to-speech) in over 1,100. Like most of its <a data-i13n=\"cpos:2;pos:1\" href=\"https:\/\/www.engadget.com\/meta-ai-translate-200-language-real-time-nllb-130023464.html\" data-ylk=\"elm: context_link;cpos:2;pos:1;itc:0\">other<\/a> publicly announced <a data-i13n=\"cpos:3;pos:1\" href=\"https:\/\/www.engadget.com\/metas-open-source-imagebind-ai-aims-to-mimic-human-perception-181500560.html\" data-ylk=\"elm: context_link;cpos:3;pos:1;itc:0\">AI projects<\/a>, Meta is open-sourcing MMS today to help preserve language diversity and encourage researchers to build on its foundation. \u201cToday, we are publicly sharing our models and code so that others in the research community can build upon our work,\u201d the company wrote. \u201cThrough this work, we hope to make a small contribution to preserve the incredible language diversity of the world.\u201d<\/p>\n<p>Speech recognition and text-to-speech models typically require training on thousands of hours of audio with accompanying transcription labels. (Labels are crucial to machine learning, allowing the algorithms to correctly categorize and \u201cunderstand\u201d the data.) But for languages that aren\u2019t widely used in industrialized nations \u2014 many of which are in danger of disappearing in the coming decades \u2014 \u201cthis data simply does not exist,\u201d as Meta puts it.<\/p>\n<p>Meta used an unconventional approach to collecting audio data: tapping into audio recordings of translated religious texts. \u201cWe turned to religious texts, such as the Bible, that have been translated in many different languages and whose translations have been widely studied for text-based language translation research,\u201d the company said. \u201cThese translations have publicly available audio recordings of people reading these texts in different languages.\u201d Incorporating the unlabeled recordings of the Bible and similar texts, Meta\u2019s researchers increased the model\u2019s available languages to over 4,000.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Meta has created an AI language model that (in a refreshing change of pace) isn\u2019t a ChatGPT clone. The company\u2019s Massively Multilingual Speech (MMS) project can recognize over 4,000 spoken languages and produce speech (text-to-speech) in over 1,100. 
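The text-to-speech side can be exercised the same way. Here is a hedged synthesis sketch; the per-language “facebook/mms-tts-eng” checkpoint and the transformers VitsModel class are again assumptions about the public release, not details from the article.

```python
# A hedged sketch: synthesize speech with an MMS text-to-speech model.
# Assumes the per-language "facebook/mms-tts-eng" checkpoint on the
# Hugging Face Hub and the transformers VITS implementation.
import torch
import scipy.io.wavfile
from transformers import AutoTokenizer, VitsModel

model = VitsModel.from_pretrained("facebook/mms-tts-eng")  # English voice
tokenizer = AutoTokenizer.from_pretrained("facebook/mms-tts-eng")

inputs = tokenizer("Speech technology for thousands of languages.",
                   return_tensors="pt")
with torch.no_grad():
    waveform = model(**inputs).waveform  # shape: (batch, num_samples)

# Write the result as a mono WAV file at the model's native sample rate.
scipy.io.wavfile.write("mms_tts_demo.wav",
                       rate=model.config.sampling_rate,
                       data=waveform[0].numpy())
```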