Get the latest international news and events from around the world.

Modern GPU Programming For MLSys

Machine learning systems sit at the heart of modern AI workloads. In these systems, performance often comes down to the quality of a small number of GPU kernels. Attention kernels, LLM prefill and decode kernels, low-precision block-scaled GEMMs, fused MoE layers, and other large fused kernels all directly shape end-to-end speed in both training and serving.

To make these kernels fast, however, we need more than a list of optimization tricks. Modern GPUs are no longer simple variations of the same old design. Recent architectures introduce richer memory spaces, new access patterns, and increasingly specialized execution units. To program them well, we need both a clear mental model of the hardware and a practical understanding of how high-performance kernels are built. This book is about developing both.

The book follows a simple progression: first understand the GPU hardware, then learn the programming model we will use, and finally build state-of-the-art kernels step by step. Our main target is the Blackwell generation, and our main running examples are General Matrix-Matrix Multiplication (GEMM) and FlashAttention. Along the way, we will also study the core ingredients behind GPU optimization: data layout, asynchronous data movement, and asynchronous coordination.

Zebrafish brains reveal alternate route for senses to the forebrain shared with mammals

Line up the brains of a fish, a bird and a mammal, and something unexpected comes up. You do not see three different answers to the problem of making sense of the world. You see one answer, tilted three different ways. “You can really see it’s almost like a continuum,” says Emre Yaksi, a professor at the Kavli Institute for Systems Neuroscience in Trondheim.

Read across decades of anatomy, the same two ancient pathways carry the world into the forebrain of all these animals. What changes from one to the next is mainly which route does more of the work. Evolution built these brains from different parts, in creatures that parted ways hundreds of millions of years ago. It kept arriving at the same answer anyway.

That is the puzzle the Yaksi lab set out to chase. If animals this far apart on the tree of life keep landing on the same arrangement, perhaps the arrangement is no accident. Perhaps there are organizational rules deep enough that a fish and a person, for all the differences between them, are bound by the same ones.

New Insights into HIV Life Cycle, Th1/Th2 Shift during HIV Infection and Preferential Virus Infection of Th2 Cells: Implications of Early HIV Treatment Initiation and Care

The theory of immune regulation involves a homeostatic balance between T-helper 1 (Th1) and T-helper 2 (Th2) responses. The Th1 and Th2 theories were introduced in 1986 as a result of studies in mice, whereby T-helper cell subsets were found to direct different immune response pathways. Subsequently, this hypothesis was extended to human immunity, with Th1 cells mediating cellular immunity to fight intracellular pathogens and Th2 cells mediating humoral immunity to fight extracellular pathogens. Several disease conditions were later found to tilt the balance between Th1 and Th2 immune response pathways, including HIV infection, but the exact mechanism for the shift from Th1 to Th2 cells was poorly understood. This review provides new insights into the molecular biology of HIV, wherein the HIV life cycle is discussed in detail.

Scientists Turned Human Cells into Tiny Biological Computers

The researchers also built in a warning signal. When the cell received a confusing instruction—the biological equivalent of two commands arriving at once—it produced a separate alert instead of continuing as if nothing had happened.

To show how the system might one day be used in medicine, the team programmed cells to secrete IL-15, an immune protein that can help activate cancer-fighting immune cells.

The experiments relied on engineered circuits delivered into cells under controlled lab conditions. The authors note several challenges ahead, including avoiding unwanted RNA interactions, limiting leaky genetic switches, and finding reliable ways to insert larger circuits into cell genomes.

Ultrasound-based approach may reduce harmful inflammation and support joint healing

As an aging population experiences joint pain and inflammation at an all-time high, researchers at The University of Alabama in Huntsville (UAH), a part of The University of Alabama System, have published new findings suggesting continuous low-intensity ultrasound may help shift the body’s immune response from prolonged inflammation toward tissue repair, a discovery that could eventually contribute to novel treatments for joint injuries and post-traumatic osteoarthritis.

The study, published in Scientific Reports, was conducted by a multidisciplinary team of UAH researchers under the leadership of Dr. Anuradha Subramanian, a professor of chemical and materials engineering.

The work brought together biological experimentation conducted by Dr. Shahid Khan as part of his doctoral work with computational and statistical methods developed by Dr. Satyaki Roy, a professor of mathematical sciences, along with additional contributions from graduate student Owen Trippany.

The Singularity Is a Story, Not a Prison: Philosophy Portal Interview

After 300+ interviews on Singularity.FM, I ended up on the other side of the microphone.

Cadell Last invited me to Philosophy Portal and asked the questions that go all the way down. How a Bulgarian army nickname became “Socrates,” and why it started as an insult. How 300 resumes and one failed job interview accidentally started Singularity Weblog. And why, after 17 years of studying the technological singularity, I believe its biggest prophets got the most important thing wrong.

Ray Kurzweil is a genius and a genuinely humble human being. I’ve interviewed him and spent hours in his office. But his six epochs of the singularity converge into a single storyline where the universe literally wakes up. That is creationism in scientific clothing. It promises the same heaven of immortality and abundance, and it treats humanity as the chosen species.

Silicon Valley’s version is no better: the march of technology is inevitable, unstoppable, and there is nothing you can do about it.

That is not a prediction. That is a prison.

I grew up behind the Iron Curtain in Bulgaria. I watched the same technology build socialism in the East, democracy in the West, and fascism before both. The big choices are never technological. They are ethical, which is to say political.

The Rosetta Manifold: How AI Erased the Boundary Between Human Thought and Machine Syntax

The barrier between human thought and machine code is officially gone. 🤯

In my last deep dive, we explored “Vibecoding” and how creators are bypassing traditional development bottlenecks using pure vision. But how does AI actually turn your spoken intent into architecture?

AI doesn’t just use a massive translation dictionary. Instead, it operates in a hidden mathematical geometry known as the Latent Space.

In this invisible architecture, an English phrase and a complex Python script are mapped into the exact same coordinate of pure logic. This triggers a massive paradigm shift called Decision Compression—completely erasing the buggy, high-friction “Telephone Game” of traditional software development by binding your raw idea directly to execution.

If AI completely bypasses the need for manual translation, what happens to traditional coding syntax like Java or C++?

And more importantly, who becomes the ultimate builder in this new paradigm?

Read the full deep dive into the engine of the AI revolution!

How giant tropical trees transport water 70 meters to stay as drought-resilient as smaller trees

The giant trees of tropical forests are important allies in the fight against climate change because of their ability to store carbon, yet they are still poorly understood by science. However, a study published in the journal Science reveals a crucial survival mechanism: These trees, which exceed 70 meters (230 feet) in height, have no difficulty transporting water to their tops and are no more vulnerable than smaller trees.

They have developed internal adaptations that compensate for the challenges of transporting water to the highest branches. Furthermore, tests conducted during severe droughts showed that they did not experience a more pronounced decline in growth than smaller trees. This contradicts the hypothesis that very tall trees would be more susceptible to water stress.