
We’ve heard a lot about large language models in general, and much of that has been elucidated at this conference, but many of the speakers have great personal takes on how this type of process works, and what it can do!

For example, here we have Yoon Kim talking about language models as statistical objects, and the use of neural networks (transformer-based neural networks in particular) to perform next-word prediction in versatile ways. He uses the location of MIT as an example:

“You might have a sentence like: ‘the Massachusetts Institute of Technology is a private land grant research university’ … and then you train this language model (around it),” he says. “Again, (it takes) a large neural network to predict the next word, which, in this case, is ‘Cambridge.’ And in some sense, to be able to accurately predict the next word, it does require this language model to store knowledge of the world, for example, that must store factoid knowledge, like the fact that MIT is in Cambridge. And it must store … linguistic knowledge. For example, to be able to pick the word ‘Cambridge,’ it must know what the subject, the verb and the object of the preceding or the current sentence is. But these are, in some sense, fancy autocomplete systems.”
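To make the “fancy autocomplete” framing concrete, here is a minimal sketch of next-word prediction with an off-the-shelf causal language model, using the Hugging Face transformers library. GPT-2 stands in for any transformer-based LM; the model choice and prompt are illustrative assumptions, not details from Kim’s talk.

```python
# Minimal next-word prediction sketch (assumes transformers and torch are installed).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # GPT-2 is a stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The Massachusetts Institute of Technology is located in"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits                 # shape: (1, seq_len, vocab_size)

# The logits at the final position score every candidate next token;
# a well-trained model should rank "Cambridge" highly here.
next_token_logits = logits[0, -1]
top = torch.topk(next_token_logits, k=5)
for token_id, score in zip(top.indices, top.values):
    print(repr(tokenizer.decode(int(token_id))), float(score))
```

The point of the sketch is that “knowledge” like MIT’s location surfaces only as a probability distribution over next tokens, which is exactly the sense in which Kim calls these systems fancy autocomplete.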

The company is aiming for low-cost, high-throughput chips that let customers run AI workloads through its cloud-based web services.

Even as the world looks to Microsoft and Google to reveal the next big thing in the generative artificial intelligence (AI) field, Jeff Bezos-founded Amazon has been quietly working to let its customers work directly with the technology. In an unmarked building in Austin, Texas, Amazon engineers are busy developing two types of microchips that will be used to train and run AI models, CNBC reported.

The world took notice of generative AI when OpenAI launched ChatGPT last year. Microsoft, which had already partnered with OpenAI, was quick to use its association with the company and incorporate the AI model’s features into its existing products.


With access to just a sliver of the 2.5 quintillion bytes of data created every day, AI produces what often seem like miracles that human intellect can’t match: identifying cancer on a medical scan, selecting a viable embryo for IVF, finding new ways of tackling climate change and the opioid crisis, and more. However, that’s not true intelligence; these AI systems are simply designed to link data points and report conclusions, powering increasingly disruptive automation across industries.

While generative AI is trending and GPT models have taken the world by storm with their astonishing ability to respond to human prompts, do they truly acquire the ability to perform reasoning tasks that humans find easy? It’s important to understand that the current AI the world is working with has little understanding of the world it exists in, and is unable to build a mental model that goes beyond regurgitating what it has already seen.

A groundbreaking theoretical proof reveals that using a technique called overparametrization enhances performance in quantum machine learning.

Machine learning is a subset of artificial intelligence (AI) concerned with developing algorithms and statistical models that enable computers to learn from data and make predictions or decisions without being explicitly programmed to do so. It is used to identify patterns in data, classify data into categories, or make predictions about future events, and it falls into three main types: supervised, unsupervised, and reinforcement learning.
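As a quick illustration of the supervised type, the sketch below fits a classifier on labeled examples and checks its accuracy on held-out data. The dataset and model (scikit-learn’s bundled iris data and a random forest) are illustrative assumptions, not tied to anything discussed above.

```python
# A minimal supervised-learning sketch: learn from labeled data, then predict.
# (Illustrative choices: iris dataset, random forest classifier.)
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)                 # features and labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0)
clf.fit(X_train, y_train)                         # learn patterns from labeled examples
print(clf.score(X_test, y_test))                  # accuracy on unseen examples
```

Unsupervised learning would instead find structure in unlabeled data (for example, clustering), while reinforcement learning learns a policy from rewards rather than from labeled examples.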

In this video, I discuss the new Cerebras supercomputer with Cerebras CEO Andrew Feldman.
Timestamps:
00:00 — Introduction.
02:15 — Why such a HUGE Chip?
02:37 — New AI Supercomputer Explained.
04:06 — Main Architectural Advantage.
05:47 — Software Stack NVIDIA CUDA vs Cerebras.
06:55 — Costs.
07:51 — Key Applications & Customers.
09:48 — Next Generation — WSE3.
10:27 — NVIDIA vs Cerebras Comparison.

Mentioned Papers:
Massively scalable stencil algorithm: https://arxiv.org/abs/2204.03775
https://www.cerebras.net/blog/harnessing-the-power-of-sparsi…-ai-models.
https://www.cerebras.net/press-release/cerebras-wafer-scale-…ge-models/
Programming at Scale:
https://8968533.fs1.hubspotusercontent-na1.net/hubfs/8968533…tScale.pdf.
Massively Distributed Finite-Volume Flux Computation: https://arxiv.org/abs/2304.

Mentioned Video:
New CPU Technology: https://youtu.be/OcoZTDevwHc.


M3GAN wasn’t malicious. It followed its programming, but without any care or respect for other beings—ultimately including Cady. In a sense, as it engaged with the physical world, M3GAN became an AI sociopath.

Sociopathic AI isn’t just a topic explored in Hollywood. To Dr. Leonardo Christov-Moore at the University of Southern California and colleagues, it’s high time we build artificial empathy into AI—and nip any antisocial behaviors in the bud.

In an essay published last week in Science Robotics, the team argued for a neuroscience perspective to embed empathy into lines of code. The key is to add “gut instincts” for survival—for example, the need to avoid physical pain. With a sense of how it may be “hurt,” an AI agent could then map that knowledge onto others. It’s similar to the way humans gauge each other’s feelings: I understand and feel your pain because I’ve been there before.