{"id":129621,"date":"2021-10-28T08:22:36","date_gmt":"2021-10-28T15:22:36","guid":{"rendered":"https:\/\/lifeboat.com\/blog\/2021\/10\/ginormous-new-index-shares-data-from-100-million-science-papers-for-free"},"modified":"2021-10-28T08:22:36","modified_gmt":"2021-10-28T15:22:36","slug":"ginormous-new-index-shares-data-from-100-million-science-papers-for-free","status":"publish","type":"post","link":"https:\/\/lifeboat.com\/blog\/2021\/10\/ginormous-new-index-shares-data-from-100-million-science-papers-for-free","title":{"rendered":"Ginormous New \u2018Index\u2019 Shares Data From 100 Million Science Papers For Free"},"content":{"rendered":"<p><a class=\"aligncenter blog-photo\" href=\"https:\/\/lifeboat.com\/blog.images\/ginormous-new-index-shares-data-from-100-million-science-papers-for-free3.jpg\"><\/a><\/p>\n<p>The General Index is a collection of data from 100+ million scientific papers, totaling 38 terabytes uncompressed. It is structured and can be searched via code.<\/p>\n<hr>\n<p>There\u2019s a vast amount of research out there, with the volume growing rapidly with each passing day. But there\u2019s a problem.<\/p>\n<p>Not only is a lot of the existing literature hidden behind a paywall, but it can also be difficult to parse and make sense of in a comprehensive, logical way. What\u2019s really needed is a super-smart version of Google just for academic papers.<\/p>\n<p>Enter the <a href=\"https:\/\/archive.org\/details\/GeneralIndex\" target=\"_blank\" rel=\"noopener noreferrer\">General Index<\/a>, a new database of some 107.2 million journal articles, totaling 38 terabytes of data in its uncompressed form. It spans more than 355 billion rows of text, each featuring a key word or phrase plucked from a published paper.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The General Index is a collection of data from 100+ million scientific papers, totaling 38 terabytes uncompressed. It is structured and can be searched via code. 
There\u2019s a vast amount of research out there, with the volume growing rapidly with each passing day. But there\u2019s a problem. Not only is a lot of the [\u2026]<\/p>\n","protected":false},"author":658,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1523,224],"tags":[],"class_list":["post-129621","post","type-post","status-publish","format-standard","hentry","category-computing","category-science"],"_links":{"self":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/129621","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/users\/658"}],"replies":[{"embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/comments?post=129621"}],"version-history":[{"count":0,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/posts\/129621\/revisions"}],"wp:attachment":[{"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/media?parent=129621"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/categories?post=129621"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lifeboat.com\/blog\/wp-json\/wp\/v2\/tags?post=129621"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}