
A journal of the UK-based Biochemical Society is retracting 25 papers after finding “systematic manipulation of our peer-review and publication processes by multiple individuals,” according to a statement provided to Retraction Watch.

The batch of retractions for Bioscience Reports is “the first time that we have issued this many retractions in one go for articles that we believe to be connected,” managing editor Zara Manwaring said in an email.

As academic publishing grapples with its paper mill problem, many publishers are retracting articles by the dozens, hundreds, or even thousands after discovering foul play.

IOP Publishing has retracted a total of 350 papers from two different 2021 conference proceedings because an “investigation has uncovered evidence of systematic manipulation of the publication process and considerable citation manipulation.”

The case is just the latest involving the discovery of papers full of gibberish – aka “tortured phrases” – thanks to the work of Guillaume Cabanac, a computer scientist at the University of Toulouse; Cyril Labbé, of Université Grenoble Alpes; and Alexander Magazinov, of Skoltech, in Moscow. Their detection tool flags papers containing phrases that appear to have been translated from English into another language and then back into English, likely with the involvement of paper-generating software.

The papers were in the Journal of Physics: Conference Series (232 articles) and IOP Conference Series: Materials Science and Engineering (118 articles), plus four editorials.

Sometimes leaving well enough alone is the best policy. Ask Teja Santosh Dandibhotla.

Upset that a paper of his had been retracted from the Journal of Physics: Conference Series, Santosh, a computer scientist at the CVR College of Engineering in Hyderabad, India, contacted us to plead his case. (We of course do not make decisions about retractions, we reminded him.)

Santosh’s article, “Intelligent defaulter Prediction using Data Science Process,” had been pulled along with some 350 other papers in two conference proceedings because IOP Publishing had “uncovered evidence of systematic manipulation of the publication process and considerable citation manipulation.”

The Nvidia team has released the width-pruned version of the model on Hugging Face under the Nvidia Open Model License, which allows commercial use. This makes the model accessible to a wider range of users and developers who can benefit from its efficiency and performance.
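Width pruning, in broad strokes, removes entire neurons (or attention heads and channels) rather than individual weights, shrinking layer dimensions directly. The sketch below is a minimal, hypothetical illustration of the idea for a single linear layer, using L2 weight norms as an importance score, a common heuristic that is not necessarily the exact criterion the researchers used:

```python
import torch

def prune_linear_width(layer: torch.nn.Linear, keep_ratio: float) -> torch.nn.Linear:
    """Drop the output neurons of `layer` with the smallest L2 weight norms."""
    importance = layer.weight.detach().norm(dim=1)        # one score per output neuron
    n_keep = max(1, int(layer.out_features * keep_ratio))
    keep = importance.topk(n_keep).indices.sort().values  # preserve original ordering
    pruned = torch.nn.Linear(layer.in_features, n_keep, bias=layer.bias is not None)
    pruned.weight.data = layer.weight.data[keep].clone()
    if layer.bias is not None:
        pruned.bias.data = layer.bias.data[keep].clone()
    return pruned

# Note: any layer that consumes this output must have its input columns
# sliced with the same indices, and the pruned network is then retrained
# (for example via distillation) to recover accuracy.
```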

“Pruning and classical knowledge distillation is a highly cost-effective method to progressively obtain LLMs [large language models] of smaller size, achieving superior accuracy compared to training from scratch across all domains,” the researchers wrote. “It serves as a more effective and data-efficient approach compared to either synthetic-data-style fine-tuning or pretraining from scratch.”
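For readers unfamiliar with the recipe, classical knowledge distillation trains the smaller (here, pruned) student model to match the output distribution of the full-size teacher. The snippet below is a minimal sketch of the standard soft-label loss, not Nvidia’s actual training code:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student outputs."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradients stay comparable across temperature settings.
    return F.kl_div(log_soft_student, soft_teacher,
                    reduction="batchmean") * temperature ** 2

# Typical step (teacher frozen): run both models on the same batch and
# backpropagate only through the student.
# with torch.no_grad():
#     teacher_logits = teacher(batch)
# loss = distillation_loss(student(batch), teacher_logits)
```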

This work is a reminder of the value and importance of the open-source community to the progress of AI. Pruning and distillation are part of a wider body of research that is enabling companies to optimize and customize LLMs at a fraction of the normal cost. Other notable works in the field include Sakana AI’s evolutionary model-merging algorithm, which makes it possible to assemble parts of different models to combine their strengths without the need for expensive training resources.
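As a toy illustration of the underlying weight-space idea (Sakana AI’s evolutionary method searches over far richer per-layer merge recipes, so this is a deliberate simplification), two checkpoints with identical architectures can be blended parameter by parameter:

```python
import torch

def merge_state_dicts(sd_a: dict, sd_b: dict, alpha: float = 0.5) -> dict:
    """Linearly interpolate two compatible state dicts: alpha*A + (1-alpha)*B."""
    return {name: alpha * sd_a[name] + (1 - alpha) * sd_b[name] for name in sd_a}

# Hypothetical usage with two fine-tunes of the same base model:
# merged = merge_state_dicts(model_a.state_dict(), model_b.state_dict(), alpha=0.3)
# model_a.load_state_dict(merged)
```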