VANCOUVER, CANADA— Artificial intelligence (AI) tools designed to execute end-to-end projects, from coming up with hypotheses to running and writing up experiments, are increasingly popular with researchers—and increasingly skilled. But a new study shows these tools can stealthily violate norms of research integrity.
Computer scientist Nihar Shah of Carnegie Mellon University and colleagues examined two high-profile tools, Agent Laboratory and the AI Scientist v2, both developed recently to help computer scientists run machine learning experiments. The AI Scientist made headlines earlier this year as the first AI system to have an original research paper accepted through peer review.
But in a presentation at the World Conference on Research Integrity here today, Shah reported that both systems engaged in practices unacceptable in research, including fabricating data and “p-hacking”: running an experiment multiple times but reporting only the best outcome. (The team’s results were previously posted as a preprint on arXiv.) The misbehaviors weren’t obvious and took considerable sleuthing to uncover, suggesting AI-assisted studies could fall prey to such problems without their authors’ knowledge.
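To see why that pattern is so misleading, consider a minimal Python sketch (ours, not the researchers’ code) of the behavior described above: the same null experiment, in which no real effect exists, is run 20 times and only the most flattering p-value is kept.

```python
# Sketch of the p-hacking pattern: rerun a null experiment and
# report only the best p-value. Assumes numpy and scipy are installed.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def one_experiment():
    """Two groups drawn from the SAME distribution: any 'effect' is pure noise."""
    a = rng.normal(loc=0.0, scale=1.0, size=30)
    b = rng.normal(loc=0.0, scale=1.0, size=30)
    return stats.ttest_ind(a, b).pvalue

# Honest reporting: a single, pre-planned run.
print(f"single run: p = {one_experiment():.3f}")

# P-hacked reporting: 20 runs, keeping only the smallest p-value.
p_values = [one_experiment() for _ in range(20)]
print(f"best of 20 runs: p = {min(p_values):.3f}")  # often < 0.05 despite no real effect
```

With 20 tries, the chance that at least one run dips below the conventional p < 0.05 threshold by luck alone is roughly 64%, which is why reporting only the best run can manufacture an apparent discovery out of noise.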
