New model can generate audio and music tracks from diverse data inputs

In recent years, computer scientists have created a variety of high-performing machine learning tools that generate text, images, videos, songs and other content. Most of these models are designed to create content from text-based instructions provided by users.

Researchers at the Hong Kong University of Science and Technology recently introduced AudioX, a model that can generate high-quality audio and music tracks from text, video footage, images, music and audio recordings. The model, introduced in a paper published on the arXiv preprint server, relies on a diffusion transformer, an advanced machine learning model that leverages the transformer architecture to generate content by progressively de-noising its input.

“Our research stems from a fundamental question in artificial intelligence: how can intelligent systems achieve unified cross-modal understanding and generation?” Wei Xue, the corresponding author of the paper, told Tech Xplore. “Human creation is a seamlessly integrated process, where information from different sensory channels is naturally fused by the brain. Traditional systems have often relied on specialized models, failing to capture and fuse these intrinsic connections between modalities.”

Reinforcement learning boosts reasoning skills in new diffusion-based language model d1

A team of AI researchers at the University of California, Los Angeles, working with a colleague from Meta AI, has introduced d1, a framework based on diffusion large language models (dLLMs) that has been improved through reinforcement learning. The group describes the framework and its features in a paper posted on the arXiv preprint server.

Over the past couple of years, the use of LLMs has skyrocketed, with millions of people worldwide relying on AI apps for a wide variety of tasks. This growth has driven a corresponding surge in the electricity needed to power the data centers running these compute-intensive applications. Researchers have therefore been looking for other ways to provide AI services to users. One such approach involves dLLMs, used either as a replacement for or a complement to conventional LLMs.

Diffusion-based LLMs (dLLMs) arrive at answers differently than conventional LLMs: instead of generating output autoregressively, token by token, they use a diffusion process. Diffusion models were originally developed to generate images. They are trained by progressively adding noise to an image until nothing recognizable remains, then learning to reverse that process and recover the original image from the noise.
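The forward half of that training process can be illustrated with a few lines of NumPy. This is a generic DDPM-style sketch, not the specific formulation used by d1 or AudioX; the schedule values (`betas`, `T`) are conventional placeholder choices, and a real model would learn a network to reverse each noising step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Standard DDPM-style linear noise schedule: beta_t grows with t, and
# alpha_bar_t (the cumulative product of 1 - beta_t) decays toward zero.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)

def forward_diffuse(x0, t):
    """Sample x_t ~ q(x_t | x_0): x0 after t steps of added Gaussian noise."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise

# A clean "image" (just a vector here) is barely perturbed at small t
# and close to pure noise at large t; the model is trained to undo each step.
x0 = np.ones(8)
x_early = forward_diffuse(x0, 10)
x_late = forward_diffuse(x0, T - 1)
print(alpha_bars[10], alpha_bars[-1])  # remaining signal fraction: near 1, then near 0
```

Training then amounts to teaching a network to predict the injected noise at each step, so that sampling can start from pure noise and walk the chain backwards to a clean output.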

Engineers advance toward a fault-tolerant quantum computer

In the future, quantum computers could rapidly simulate new materials or help scientists develop faster machine-learning models, opening the door to many new possibilities.

But these applications will only be possible if quantum computers can perform operations extremely quickly, so scientists can make measurements and perform corrections before compounding error rates reduce their accuracy and reliability.

The efficiency of this measurement process, known as readout, relies on the strength of the coupling between photons, the particles of light that carry quantum information, and artificial atoms, units of matter that are often used to store information in a quantum computer.

Aging will be cured within 20 years

Lately, there’s been growing pushback against the idea that AI will transform geroscience in the short term.
When Nobel laureate Demis Hassabis told 60 Minutes that AI could help cure every disease within 5–10 years, many in the longevity and biotech communities scoffed. Leading aging biologists called it wishful thinking — or outright fantasy.
They argue that we still lack crucial biological data to train AI models, and that experiments and clinical trials move too slowly to change the timeline.

Our guest in this episode, Professor Derya Unutmaz, knows these objections well. But he’s firmly on Team Hassabis.
In fact, Unutmaz goes even further. He says we won’t just cure diseases — we’ll solve aging itself within the next 20 years.

And best of all, he offers a surprisingly detailed, concrete explanation of how it will happen:
building virtual cells, modeling entire biological systems in silico, and dramatically accelerating drug discovery — powered by next-generation AI reasoning engines.

🧬 In this wide-ranging conversation, we also cover:

✅ Why biological complexity is no longer an unsolvable barrier.
✅ How digital twins could revolutionize diagnosis and treatment.
✅ Why clinical trials as we know them may soon collapse.
✅ The accelerating timeline toward longevity escape velocity.
✅ How reasoning AIs (like GPT-4o, o1, DeepSeek) are changing scientific research.
✅ Whether AI creativity challenges the idea that only biological minds can create.
✅ Why AI will force a new culture of leisure, curiosity, and human flourishing.
✅ The existential stress that will come as AI outperforms human expertise.
✅ Why “Don’t die” is no longer a joke — it’s real advice.

🎙️ Hosted — as always — by Peter Ottsjö (tech journalist and author of Evigt Ung) and Dr. Patrick Linden (philosopher and author of The Case Against Death).