
Why downsizing large language models is the future of generative AI

Smaller language models can be based on a billion parameters or less—still pretty large, but much smaller than the foundational LLMs behind services like ChatGPT and Bard. They are pre-trained to understand vocabulary and human speech, so the incremental cost of customizing them with corporate and industry-specific data is vastly lower. There are several options for these pre-trained LLMs that can be customized internally, including offerings from AI21 and Reka, as well as open-source models like Alpaca and Vicuna.

Smaller language models aren’t just more cost-efficient; they’re often far more accurate, because instead of being trained on all publicly available data—the good and the bad—they are trained and optimized on carefully vetted data that addresses the exact use cases a business cares about.

That doesn’t mean they’re limited to internal corporate data. Smaller language models can incorporate third-party data about the economy, commodity pricing, the weather, or whatever data sets are needed, and combine it with a company’s proprietary data. These data sources are widely available from data service providers who ensure the information is current, accurate, and clean.

Google’s AI search experience adds AI-powered summaries, definitions and coding improvements

Google today is rolling out a few new updates to its nearly three-month-old Search Generative Experience (SGE), the company’s AI-powered conversational mode in Search, with a goal of helping users better learn and make sense of the information they discover on the web. The features include tools to see definitions of unfamiliar terms, tools that improve your understanding of code across programming languages, and an interesting feature that lets you tap into the AI power of SGE while you’re browsing.

The company explains that these improvements aim to help people better understand complicated concepts or complex topics, boost their coding skills and more.

One of the new features will let you hover over certain words to preview their definitions and see images or diagrams related to the topic, which you can then tap to learn more. This feature will become available across Google’s AI-generated responses to topics or questions in certain subjects, like STEM, economics and history, where you may encounter terms you don’t understand or concepts you want to dive deeper into for a better understanding.

Drawing Stuff: AI Can Really Cook! How Far Can It Go?

We’ve seen a lot about large language models in general, and a lot of that has been elucidated at this conference, but many of the speakers have great personal takes on how this type of process works, and what it can do!

For example, here we have Yoon Kim talking about statistical objects, and the use of neural networks (transformer-based neural networks in particular) to use next-word prediction in versatile ways. He uses the example of the location of MIT:

“You might have a sentence like: ‘the Massachusetts Institute of Technology is a private land grant research university’ … and then you train this language model (around it),” he says. “Again, (it takes) a large neural network to predict the next word, which, in this case, is ‘Cambridge.’ And in some sense, to be able to accurately predict the next word, it does require this language model to store knowledge of the world, for example, that must store factoid knowledge, like the fact that MIT is in Cambridge. And it must store … linguistic knowledge. For example, to be able to pick the word ‘Cambridge,’ it must know what the subject, the verb and the object of the preceding or the current sentence is. But these are, in some sense, fancy autocomplete systems.”
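The next-word-prediction objective Kim describes can be illustrated with a toy model. The sketch below uses simple bigram counts rather than a transformer network (a deliberate simplification for brevity, with a made-up mini-corpus), but the training signal is the same: learn, from text, which word tends to follow which context.

```python
from collections import Counter, defaultdict

# Tiny "corpus" containing the fact the model should absorb.
corpus = (
    "the massachusetts institute of technology is a private "
    "land grant research university in cambridge massachusetts"
).split()

# Count bigrams: how often each word follows each preceding word.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation seen in training."""
    return bigrams[word].most_common(1)[0][0]

print(predict_next("in"))  # prints "cambridge"
```

A transformer replaces these raw counts with a neural network conditioned on the full preceding context, which is what lets it store both factoid knowledge (MIT is in Cambridge) and linguistic knowledge (subject, verb, object) at scale—hence Kim's "fancy autocomplete" framing.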

Amazon is making its own chips to offer generative AI on AWS

The company aims for low-cost, high-throughput chips that allow users to work with its web services on the cloud.

Even as the world looks to Microsoft and Google to reveal the next big thing in the generative artificial intelligence (AI) field, Jeff Bezos-founded Amazon has been silently working to let its customers work directly with the technology. In an unmarked building in Austin, Texas, Amazon engineers are busy developing two types of microchips that will be used to train and run AI models, CNBC reported.

The world took notice of generative AI when OpenAI launched ChatGPT last year. Microsoft, which has partnered with OpenAI previously, was quick to use its association with the company and incorporate the features of the AI model into its existing products.

Creating the next wave of computing beyond large language models

Presented by VAST Data

With access to just a sliver of the 2.5 quintillion bytes of data created every day, AI produces what often seem like miracles that human intellect can’t match — identifying cancer on a medical scan, a viable embryo for IVF, new ways of tackling climate change and the opioid crisis and on and on. However, that’s not true intelligence; rather, these AI systems are just designed to link data points and report conclusions, to power increasingly disruptive automation across industries.

While generative AI is trending and GPT models have taken the world by storm with their astonishing capabilities to respond to human prompts, do they truly acquire the ability to perform reasoning tasks that humans find easy to execute? It’s important to understand that the current AI the world is working with has little understanding of the world it exists in, and is unable to build a mental model that goes beyond regurgitating information that is already known.

A Leap in Performance — New Breakthrough Boosts Quantum AI

A groundbreaking theoretical proof reveals that using a technique called overparametrization enhances performance in quantum machine learning.
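The article's result concerns quantum circuits, but the classical analogue of overparametrization is easy to see: give a model more trainable parameters than training examples and plain gradient descent can drive the training error essentially to zero instead of getting stuck. The sketch below (pure Python, made-up data, four weights fitting two examples) is only that classical picture, not the quantum construction.

```python
# Toy overparametrized linear model: 4 weights, 2 training examples.
X = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 1.0]]
y = [1.0, 2.0]
w = [0.0] * 4

lr = 0.1
for _ in range(200):  # plain gradient descent on squared error
    grad = [0.0] * 4
    for xi, yi in zip(X, y):
        err = sum(wi * xij for wi, xij in zip(w, xi)) - yi
        for j in range(4):
            grad[j] += 2 * err * xi[j]
    w = [wi - lr * g for wi, g in zip(w, grad)]

loss = sum((sum(wi * xij for wi, xij in zip(w, xi)) - yi) ** 2
           for xi, yi in zip(X, y))
print(f"training loss: {loss:.2e}")  # decays toward zero
```

With more parameters than data points there are many exact solutions, and gradient descent converges smoothly to one of them; the quantum result establishes an analogous trainability benefit for parametrized quantum circuits.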

Machine learning is a subset of artificial intelligence (AI) that deals with the development of algorithms and statistical models that enable computers to learn from data and make predictions or decisions without being explicitly programmed to do so. Machine learning is used to identify patterns in data, classify data into different categories, or make predictions about future events. It can be categorized into three main types of learning: supervised, unsupervised and reinforcement learning.
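As a minimal illustration of the first of those three categories, supervised learning, the toy classifier below (pure Python, with made-up data) "trains" by averaging labeled example points into per-class centroids, then classifies a new point by its nearest centroid.

```python
# Toy supervised learning: nearest-centroid classification.
# The 2-D points and labels here are fabricated for illustration.
train = [((1.0, 1.0), "a"), ((1.5, 1.2), "a"),
         ((5.0, 5.0), "b"), ((5.5, 4.8), "b")]

# "Training": compute the mean (centroid) of each class.
centroids = {}
for label in {lbl for _, lbl in train}:
    pts = [xy for xy, lbl in train if lbl == label]
    centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))

def classify(point):
    """Predict the label whose centroid is closest to the point."""
    def dist2(c):
        return sum((p - q) ** 2 for p, q in zip(point, c))
    return min(centroids, key=lambda lbl: dist2(centroids[lbl]))

print(classify((1.2, 0.9)))  # prints "a": nearest the class-"a" examples
```

Unsupervised learning would drop the labels and discover the two clusters itself, while reinforcement learning replaces labeled examples with reward signals from interaction.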

This New AI Supercomputer Outperforms NVIDIA! (with CEO Andrew Feldman)

In this video I discuss the new Cerebras supercomputer with Cerebras CEO Andrew Feldman.
Timestamps:
00:00 — Introduction
02:15 — Why such a HUGE Chip?
02:37 — New AI Supercomputer Explained
04:06 — Main Architectural Advantage
05:47 — Software Stack: NVIDIA CUDA vs Cerebras
06:55 — Costs
07:51 — Key Applications & Customers
09:48 — Next Generation — WSE3
10:27 — NVIDIA vs Cerebras Comparison

Mentioned Papers:
Massively scalable stencil algorithm: https://arxiv.org/abs/2204.03775
https://www.cerebras.net/blog/harnessing-the-power-of-sparsi…-ai-models.
https://www.cerebras.net/press-release/cerebras-wafer-scale-…ge-models/
Programming at Scale:
https://8968533.fs1.hubspotusercontent-na1.net/hubfs/8968533…tScale.pdf.
Massively Distributed Finite-Volume Flux Computation: https://arxiv.org/abs/2304.

Mentioned Video:
New CPU Technology: https://youtu.be/OcoZTDevwHc

👉 Support me at Patreon ➜ https://www.patreon.com/AnastasiInTech
📩 Sign up for my Deep In Tech Newsletter for free! ➜ https://anastasiintech.substack.com