
Via Sebastian Raschka.

For weekend reading:

Chapter 6 (Finetuning LLMs for Classification) of the Build an LLM from Scratch book is now finally available on the Manning website. It covers:

  • Introducing different LLM finetuning approaches
  • Preparing a dataset for text classification
  • Modifying a pretrained LLM for finetuning
  • Finetuning an LLM to identify spam messages
  • Evaluating the accuracy of a finetuned LLM classifier
  • Using a finetuned LLM to classify new data

In previous chapters, we coded the LLM architecture, pretrained it, and learned how to import pretrained weights from an external source, such as OpenAI, into our model. In this chapter, we are reaping the fruits of our labor by finetuning the LLM on a specific target task, such as classifying text, as illustrated in figure 6.1. The concrete example we will examine is classifying text messages as spam or not spam.
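The core move in classification finetuning is to swap the pretrained model's language-modeling output head for a small classification head and train only the new parameters. Here is a minimal sketch of that idea, assuming PyTorch; the tiny backbone, vocabulary size, and embedding dimension are illustrative stand-ins for the GPT model built in the earlier chapters, not the book's actual code.

```python
import torch
import torch.nn as nn

emb_dim, num_classes = 8, 2  # toy sizes; a real GPT uses e.g. emb_dim=768

# Stand-in for the pretrained transformer backbone (hypothetical).
backbone = nn.Sequential(
    nn.Embedding(100, emb_dim),   # token embedding (vocab of 100 for the toy)
    nn.Linear(emb_dim, emb_dim),  # placeholder for the transformer blocks
)

# Freeze the pretrained weights so only the new head is trained.
for p in backbone.parameters():
    p.requires_grad = False

# New classification head: emb_dim -> 2 logits (spam / not spam).
head = nn.Linear(emb_dim, num_classes)

tokens = torch.tensor([[1, 5, 9]])   # one toy "text message" as token IDs
features = backbone(tokens)          # shape (1, seq_len, emb_dim)
logits = head(features[:, -1, :])    # classify from the last token's features
print(logits.shape)                  # torch.Size([1, 2])
```

Using the last token's representation reflects a causal LLM's structure: only the final position has attended to the whole input sequence.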

“The ORFAN study is an expanding global registry which will include long-term clinical and outcome data for 250,000 patients from around the world, and we are very pleased to publish these initial results,” said Keith Channon, MD, Professor of Cardiovascular Medicine at the University of Oxford, Caristo Chief Medical Officer, and co-author of The Lancet publication.

“Coronary inflammation is a crucial piece of the puzzle in predicting heart attack risk. We are excited to discover that CaRi-Heart results performed exceptionally well in predicting patient cardiac events. This tool is well positioned to help clinicians identify high-risk patients with seemingly ‘normal’ CCTA scans.”

Ron Blankstein, MD, Professor of Medicine and Radiology at Harvard Medical School, Director of Cardiac Computed Tomography at Brigham and Women’s Hospital, and co-author of the publication, applauded The Lancet for publishing results from one of the largest studies in the field of CCTA.

NASA is developing a ChatGPT-style interface for future spacecraft, giving astronauts the ability to talk to the systems using natural language — and have the systems talk right back.

Space talk: In June 2018, a massive dust storm on Mars engulfed NASA’s Opportunity rover, cutting off communication with Earth. Eight months later, NASA announced an end to the rover’s 15-year Mars mission.

“The last message [NASA] received was basically, ‘My battery is low and it’s getting dark,’” tweeted science reporter Jacob Margolis after talking to NASA about the rover.

Could autonomous CubeSats someday be used for deep space exploration? That is the question a recent demonstration by NASA’s Starling swarm of four CubeSats hopes to address: NASA announced that the swarm’s 10-month primary mission successfully completed its objectives across multiple experiments. The demonstration could help develop CubeSats capable of conducting deep space exploration missions without constant communication with Earth, since the spacecraft can communicate with each other instead.

“The success of Starling’s initial mission represents a landmark achievement in the development of autonomous networks of small spacecraft,” said Roger Hunter, program manager for NASA’s Small Spacecraft Technology program at NASA’s Ames Research Center. “The team has been very successful in achieving our objectives and adapting in the face of challenges.”

For the demonstration, the four CubeSats successfully studied Earth’s ionosphere while communicating with one another about next steps for continued analysis. This allowed each spacecraft to contribute equally to the mission while decreasing human involvement.

Humans can innately perform a wide range of movements, as this allows them to best tackle various tasks in their day-to-day life. Automatically reproducing these motions in virtual avatars and 3D animated human-like characters could be highly advantageous for many applications, ranging from metaverse spaces to digital entertainment, AI interfaces and robotics.

Researchers at Max Planck Institute for Intelligent Systems and ETH Zurich recently developed WANDR, a new model that can generate natural human motions for avatars. This model, to be introduced in a paper presented at the Conference on Computer Vision and Pattern Recognition (CVPR 2024) in June, unifies different data sources under a single model to attain more realistic motions in 3D humanoid characters. The paper is also posted to the arXiv preprint server.

“At a high-level, our research aims at figuring out what it takes to create [virtual humans] able to behave like us,” Markos Diomataris, first author of the paper, told Tech Xplore. “This essentially means learning to reason about the world, how to move in it, setting goals and trying to achieve them.