The world of artificial intelligence (AI) has made remarkable strides in recent years, particularly in understanding human language. At the heart of this revolution is the Transformer model, a core innovation that allows large language models (LLMs) to process and understand language with an efficiency that previous models could only dream of. But how do Transformers work? To explain this, let’s take a journey through their inner workings, using stories and analogies to make the complex concepts easier to grasp.

Apple’s latest machine learning research could make creating models for Apple Intelligence faster, thanks to a technique that almost triples the rate of token generation on Nvidia GPUs.

One of the problems in creating large language models (LLMs) for tools and apps that offer AI-based functionality, such as Apple Intelligence, is the inefficiency of producing the LLMs in the first place. Training models for machine learning is a resource-intensive and slow process, which is often countered by buying more hardware and taking on increased energy costs.

Earlier in 2024, Apple published and open-sourced Recurrent Drafter, known as ReDrafter, a speculative decoding method for speeding up LLM token generation. It uses an RNN (Recurrent Neural Network) draft model and combines beam search with dynamic tree attention to predict and verify draft tokens from multiple paths.
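To make the draft-and-verify idea behind speculative decoding more concrete, here is a minimal sketch of the generic greedy variant. It is not ReDrafter itself: there is no RNN, beam search, or tree attention, and the two "models" (`toy_draft`, `toy_target`) are hypothetical stand-ins invented for illustration. A cheap draft model proposes a few tokens, the larger target model checks them, and only the agreeing prefix is kept.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]


def speculative_step(prefix: List[Token], draft_next: NextToken,
                     target_next: NextToken, k: int) -> List[Token]:
    """Propose k draft tokens, then keep the prefix the target agrees with."""
    # Draft phase: the cheap model guesses k tokens autoregressively.
    drafted: List[Token] = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        drafted.append(tok)
        ctx.append(tok)

    # Verify phase: a real system scores all drafts in one batched pass of
    # the target model; this sketch calls it token by token for clarity.
    accepted: List[Token] = []
    ctx = list(prefix)
    for tok in drafted:
        expected = target_next(ctx)
        if expected != tok:
            # First disagreement: take the target's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # Every draft was accepted, so the target contributes one extra token.
    accepted.append(target_next(ctx))
    return accepted


def generate(prefix: List[Token], draft_next: NextToken,
             target_next: NextToken, k: int, max_new: int) -> List[Token]:
    """Repeat speculative steps until max_new tokens have been produced."""
    out = list(prefix)
    while len(out) - len(prefix) < max_new:
        out.extend(speculative_step(out, draft_next, target_next, k))
    return out[:len(prefix) + max_new]


# Hypothetical toy models: the target counts upward modulo 10, and the
# draft agrees with it except right after the token 4.
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


def greedy(prefix: List[Token], target_next: NextToken, max_new: int) -> List[Token]:
    """Plain one-token-at-a-time decoding with the target model only."""
    out = list(prefix)
    for _ in range(max_new):
        out.append(target_next(out))
    return out


if __name__ == "__main__":
    print(generate([0], toy_draft, toy_target, k=3, max_new=8))
```

In this greedy form the output matches what the target model would have produced on its own, so any speedup comes from accepting several drafted tokens per verification step rather than from changing the result.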

What if robots could work together like ants to move objects, clear blockages, and guide living creatures?

Scientists at Hanyang University in Seoul, South Korea, have developed small magnetic robots that work together in swarms to perform complex tasks, such as moving and lifting objects much larger than themselves. These microrobot swarms, controlled by a rotating magnetic field, can be used in challenging environments, offering solutions for tasks like minimally invasive treatments for clogged arteries and guiding small organisms.

The researchers tested how microrobot swarms with different configurations performed various tasks. They discovered that swarms with a high aspect ratio could climb obstacles five times higher than a single robot’s body length and throw themselves over them. In another demonstration, a swarm of 1,000 microrobots formed a raft on water, surrounding a pill 2,000 times heavier than a single robot, allowing the swarm to transport the drug through the liquid. On land, a swarm moved cargo 350 times heavier than each robot, while another swarm unclogged tubes resembling blocked blood vessels. Using spinning and orbital dragging motions, the team also developed a system where robot swarms could guide the movements of small organisms.

Researchers at the Tokyo-based startup Sakana AI have developed a new technique that enables language models to use memory more efficiently, helping enterprises cut the costs of building applications on top of large language models (LLMs) and other Transformer-based models.

The technique, called ‘universal transformer memory,’ uses special neural networks to optimize LLMs to keep bits of information that matter and discard redundant details from their context.
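As a rough illustration of the keep-or-discard idea, the sketch below prunes a context given one importance score per token. The function name, the scores, and the threshold are all hypothetical; in the technique described, those decisions would come from the learned networks, which this sketch does not attempt to reproduce.

```python
from typing import List, Sequence


def prune_context(tokens: Sequence[str], scores: Sequence[float],
                  threshold: float = 0.0) -> List[str]:
    """Keep only tokens whose (hypothetical) importance score exceeds threshold."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    # Order is preserved so the surviving context still reads left to right.
    return [tok for tok, score in zip(tokens, scores) if score > threshold]


if __name__ == "__main__":
    # Made-up scores: the filler word "um" is scored as redundant.
    print(prune_context(["the", "cat", "um", "sat"], [0.9, 0.8, -0.5, 0.7]))
```

A shorter context means fewer stored entries per request, which is where the memory and cost savings would come from.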

From VentureBeat.

A new artificial intelligence (AI) model has just achieved human-level results on a test designed to measure “general intelligence.”

On December 20, OpenAI’s o3 system scored 85% on the ARC-AGI benchmark, well above the previous AI best score of 55% and on par with the average human score. It also scored well on a very difficult mathematics test.

Creating artificial general intelligence, or AGI, is the stated goal of all the major AI research labs. At first glance, OpenAI appears to have at least made a significant step towards this goal.

The company behind Oreo cookies has, by its own admission, been quietly creating new flavors using machine learning.

As the Wall Street Journal reports, Mondelez — the processed food behemoth that manufactures Oreos, Chips Ahoy, Clif Bars, and other popular snacks — has developed a new AI tool to dream up new flavors for its brands.

The machine learning tool has been used in more than 70 of the company’s products. The company says it is different from generative AI tools like ChatGPT and more akin to the drug discovery algorithms that pharmaceutical companies use to find and test new medications rapidly. Thus far the tool, created with the help of the software consultant Fourkind, has created products like the “Gluten Free Golden Oreo” and updated Chips Ahoy’s classic recipe, per the WSJ.