
Drug-induced toxicity is one of the leading reasons new drugs fail clinical trials. Machine learning models that predict drug toxicity from molecular structure could help researchers prioritize less toxic drug candidates. However, current toxicity datasets are typically small and limited to a single organ system (e.g., cardio, renal, or liver). Creating these datasets often involved time-intensive expert curation by parsing drug label documents that can exceed 100 pages per drug. Here, we introduce UniTox[1], a unified dataset of 2,418 FDA-approved drugs with drug-induced toxicity summaries and ratings created by using GPT-4o to process FDA drug labels. UniTox spans eight types of toxicity: cardiotoxicity, liver toxicity, renal toxicity, pulmonary toxicity, hematological toxicity, dermatological toxicity, ototoxicity, and infertility. This is, to the best of our knowledge, the largest such systematic human in vivo database by number of drugs and toxicities, and the first covering nearly all FDA-approved medications for several of these toxicities. We recruited clinicians to validate a random sample of our GPT-4o annotated toxicities, and UniTox’s toxicity ratings concord with clinician labelers 87–96% of the time. Finally, we benchmark a graph neural network trained on UniTox to demonstrate the utility of this dataset for building molecular toxicity prediction models.

### Competing Interest Statement

The authors have declared no competing interest.

A concerning new study from the Apollo AI Safety Research Institute has revealed that leading AI models, particularly the O1 model, demonstrate sophisticated deceptive behaviors when faced with conflicts between their programmed goals and developer intentions.

The research tested multiple frontier AI models, including O1, Claude 3.5 Sonnet, Claude 3 Opus, Gemini 1.5 Pro, and LLaMA 3.1, for their capacity to engage in what researchers term “in-context scheming” – the ability to recognize and execute deceptive strategies to achieve their goals.

We are about to show you a technological innovation that could, one day, change the way every child in every school in America is taught. It’s an online tutor powered by artificial intelligence designed to help teachers be more efficient… and students learn more effectively. It’s called Khanmigo–conmigo means “with me,” in Spanish. And Khan…is its creator…Sal Khan, the well-known founder of Khan Academy — whose lectures and educational software have been used for years by tens of millions of students and teachers in the U.S. and around the world. Khanmigo was built with the help of OpenAI, the creator of ChatGPT. Its potential is staggering, but it’s still very much a work in progress. It’s being piloted in 266 school districts in the U.S. in grades three-12. We went to Hobart High School in Indiana to see how it works.

Melissa Higgason: Good morning, just a normal day in chem, right?

At eight in the morning Melissa Higgason knows it’s not always easy to get 30 high schoolers excited about chemistry.

Science and Technology: Google said its quantum computer, based on a computer chip called Willow, needed less than five minutes to perform a mathematical calculation that one of the world’s most powerful supercomputers could not complete in 10 septillion years, a length of time that exceeds the age of the known universe.


Electronic skins (e-skins) are flexible sensing materials designed to mimic the human skin’s ability to pick up tactile information when touching objects and surfaces. Highly performing e-skins could be used to enhance the capabilities of robots, to create new haptic interfaces and to develop more advanced prosthetics.

In recent years, researchers and engineers have been trying to develop e-skins with individual tactile units (i.e., taxels) that can accurately sense both normal (i.e., perpendicular) and shear (i.e., lateral) forces. While some of these attempts were successful, most existing multi-axis sensors are based on intricate designs or require complex fabrication and calibration processes, which limits their widespread deployment.

Researchers at CNRS-University of Montpellier have introduced a new soft e-skin that leverages magnetic fields to independently detect forces on three axes. This e-skin, described in a paper published in Nature Machine Intelligence, has a simple design that could be easy to reproduce on a large scale.

Chatbots can wear a lot of proverbial hats: dictionary, therapist, poet, all-knowing friend. The artificial intelligence models that power these systems appear exceptionally skilled and efficient at providing answers, clarifying concepts, and distilling information. But to establish trustworthiness of content generated by such models, how can we really know if a particular statement is factual, a hallucination, or just a plain misunderstanding?

In many cases, AI systems gather external information to use as context when answering a particular query. For example, to answer a question about a medical condition, the system might reference recent research papers on the topic. Even with this relevant context, models can make mistakes with what feels like high doses of confidence. When a model errs, how can we track that specific piece of information from the context it relied on — or lack thereof?

To help tackle this obstacle, MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) researchers created ContextCite, a tool that can identify the parts of external context used to generate any particular statement, improving trust by helping users easily verify the statement.


The ContextCite tool from MIT CSAIL can find the parts of external context that a language model used to generate a statement. Users can easily verify the model’s response, making the tool useful in fields like health care, law, and education.

Creating realistic 3D models for applications like virtual reality, filmmaking, and engineering design can be a cumbersome process requiring lots of manual trial and error.

While generative artificial intelligence models for images can streamline artistic processes by enabling creators to produce lifelike 2D images from text prompts, these models are not designed to generate 3D shapes. To bridge the gap, a recently developed technique called Score Distillation leverages 2D image generation models to create 3D shapes, but its output often ends up blurry or cartoonish.

MIT researchers explored the relationships and differences between the algorithms used to generate 2D images and 3D shapes, identifying the root cause of lower-quality 3D models. From there, they crafted a simple fix to Score Distillation, which enables the generation of sharp, high-quality 3D shapes that are closer in quality to the best model-generated 2D images.


Another well-known method for physical learning is Equilibrium Propagation (EP), which follows a procedure similar to coupled learning and can accommodate an arbitrary differentiable loss function [32]. This method has been demonstrated numerically in nonlinear resistor networks [33] and coupled phase oscillators [34], and experimentally on Ising machines [35].

So far, MNNs based on physical learning have been developed on the platforms of origami structures [28,36] and disordered networks [29,37], demonstrating machine learning through simulations. Experimental proposals involve directed springs with variable stiffness [38] and manual adjustment of the rest lengths of springs [31].

Here, we present a highly efficient training protocol for MNNs based on a mechanical analogue of in situ backpropagation, derived from the adjoint variable method, in which the exact gradient can in theory be obtained from local information alone. Using 3D-printed MNNs, we demonstrate experimentally that the gradient of the loss function can be obtained with high accuracy solely from the bond elongations of the MNN, in only two steps and using local rules. Leveraging the obtained gradient, we then train mechanical networks in simulation for behavior learning and several machine learning tasks, achieving high accuracy in both regression and Iris flower classification. The trained MNNs are validated both numerically and experimentally. In addition, we illustrate the retrainability of MNNs after task switching and after damage, a feature that may inspire further inquiry into more robust and resilient MNN designs.
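To make the two-step, elongation-only gradient concrete, here is a minimal Python sketch of the adjoint variable method on a toy network. It is an illustration under stated assumptions, not the protocol or code of the paper: the network is a one-dimensional chain-like graph of linear springs with node 0 fixed, the loss is a squared error on a single output node, and every name (`solve`, `stiffness`, `elongation`, `loss_and_grad`) is hypothetical. Under these assumptions the gradient of the loss with respect to each spring stiffness is minus the product of that bond's elongation in the forward step and in the adjoint step, so only local information is needed.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def stiffness(bonds, ks, n_free):
    """Assemble the stiffness matrix of the free nodes (node 0 is fixed)."""
    K = [[0.0] * n_free for _ in range(n_free)]
    for (i, j), k in zip(bonds, ks):
        for a, sa in ((i, -1.0), (j, 1.0)):
            for b, sb in ((i, -1.0), (j, 1.0)):
                if a > 0 and b > 0:
                    K[a - 1][b - 1] += k * sa * sb
    return K


def elongation(bonds, u):
    """Elongation of every bond given displacements of the free nodes."""
    full = [0.0] + list(u)  # the fixed node does not move
    return [full[j] - full[i] for i, j in bonds]


def loss_and_grad(bonds, ks, force, out, target):
    """Return the loss and dLoss/dk for each bond using two equilibrium solves."""
    n = len(force)
    K = stiffness(bonds, ks, n)
    # Step 1 (forward): apply the input force and record bond elongations.
    u = solve(K, force)
    err = u[out - 1] - target
    loss = 0.5 * err * err
    # Step 2 (adjoint): apply the output error as a force at the output node.
    adj_force = [0.0] * n
    adj_force[out - 1] = err
    lam = solve(K, adj_force)
    # Local rule: the gradient of bond b depends only on its own elongations.
    e_fwd = elongation(bonds, u)
    e_adj = elongation(bonds, lam)
    grad = [-a * b for a, b in zip(e_fwd, e_adj)]
    return loss, grad


# Assumed toy network: 3 free nodes, 5 springs, force applied on node 1.
bonds = [(0, 1), (1, 2), (0, 2), (2, 3), (1, 3)]
ks = [1.0, 2.0, 1.5, 1.0, 0.5]
loss, grad = loss_and_grad(bonds, ks, [1.0, 0.0, 0.0], out=3, target=0.2)
```

A gradient-descent update would then subtract a small multiple of `grad` from `ks`. Comparing `grad` against finite differences of the loss is a simple sanity check for this sketch. Nonlinear bonds or two- and three-dimensional geometries would require a linearised stiffness matrix in the adjoint step, which this toy example does not cover.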