AI-generated science is now "good enough"
The "AI Scientist" is mediocre and uninspired. Don't dismiss it.
AI can now write halfway decent research papers about AI.
Last week, Nature published a study on an "AI Scientist" that automates research. The tool can come up with the idea for a machine learning study, run the experiments, review its own work, and draft up a paper that human reviewers thought was okay — (almost) without human help.
Does this mean that AI will replace scientists tomorrow? No. The AI is a mediocre scientist; its best paper was too low quality to be accepted to a normal conference. But it was good enough for a workshop with a 70% acceptance rate.
And good enough is enough to matter.
What happened
The "AI Scientist" is a tool that fully automates research you can do on a computer. Provided with a simple starting template, it can come up with its own ideas and identify which ones are most novel, code and run experiments, review literature, and draft papers based on the notes it takes along the way. At the end, an "automated reviewer" trained to mimic human peer review evaluates the work.
Under the hood, the AI Scientist is roughly an elaborate, programmatic flow-chart and set of prompts that direct multiple AI "agents" — instances of large language models (LLMs) like Claude or GPT-4o — through the research process.
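To make that "flow-chart of prompts" concrete, here is a minimal sketch of what such an agent pipeline might look like. This is purely illustrative: the stage names, prompts, and the `call_llm` stub are my assumptions, not the actual Sakana AI implementation, which is far more elaborate (it also writes and executes code, takes notes, and loops back on failures).

```python
def call_llm(prompt: str) -> str:
    """Placeholder for a real API call to an LLM like Claude or GPT-4o."""
    return f"[model output for: {prompt[:40]}...]"

# A fixed sequence of research stages, each just a prompt template
# filled in with the output of the previous stage. (Hypothetical
# stage names and prompts, for illustration only.)
STAGES = [
    ("ideate", "Propose a novel ML experiment from this template: {ctx}"),
    ("experiment", "Write and run code to test this idea: {ctx}"),
    ("write", "Draft a paper from these experiment notes: {ctx}"),
    ("review", "Act as a peer reviewer and score this draft: {ctx}"),
]

def run_pipeline(template: str) -> dict:
    """Drive the model through each stage of the flow-chart in order."""
    context, outputs = template, {}
    for name, prompt in STAGES:
        context = call_llm(prompt.format(ctx=context))
        outputs[name] = context
    return outputs

results = run_pipeline("2D diffusion model starter code")
```

The key design point is that the intelligence lives in the prompts and the fixed control flow, not in any new model: the same general-purpose LLM plays ideator, experimenter, writer, and reviewer in turn.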
Researchers submitted three papers generated by the AI Scientist to a workshop of the International Conference on Learning Representations (ICLR) with the consent of the organizers. Human reviewers were told that some of the papers submitted were AI, but weren't told which ones. One of the three submitted papers was deemed good enough to be accepted to the workshop, which received 43 total submissions and had an acceptance rate of 70%.
The researchers are pretty clear about the AI-generated papers being uninspired and mediocre. But they also claim that more powerful models produced better papers — which they argue means AI "scientists" should only get better in the future.
Who did it
The study came from a team of machine learning researchers with overlapping ties to industry and academic research groups at Oxford, the University of British Columbia, and the Vector Institute (a Canadian AI nonprofit). Six of the eight authors, including all four lead authors, work for the Japanese AI company Sakana AI.
Look. You and I both know that if I were a barista, you'd 100% be leaving a tip right now. The social pressure to drop a coin in that jar would just be overwhelming. But you're presumably reading this thing alone on a universe-rectangle, so I can't rely on my mere presence to remind you to be nice.
If you made it this far I assume you're enjoying the article. So maybe drop a coin in my digital tip jar. It's just €2 per month (€0.25 per post!) — a cheap and fully automated weekly warm fuzzy feeling for the digital age.
Why it matters
This paper has been in the works for two years now, and finally made it through peer review. That, to me, marks a milestone. Even if the basic idea for the AI Scientist was in place years ago — and there are now plenty of AI research assistants available — we are now at a point where one of the most esteemed scientific journals in the world thinks it is worth devoting precious page-space to a paper describing how (AI) research can be done entirely by machines. And this happened less than 10 years after the first LLM.
Fully automated science is, right now, not very good. But being good is not the point. The point is that it is starting to be good enough. And good enough, paired with way cheaper, tends to dominate, pushing the slower, more expensive, but qualitatively better predecessor into a smaller niche (if it isn't killed outright). Mass-produced fashion exiled tailoring to a handful of luxury and hobby refugia. Now extend that metaphor to science.
To me, this study feels like a peek into a near-term future that I am not very excited to live through — even though I do believe that, in the long term, humans gradually "domesticate" disruptive technologies, maximizing their benefits while minimizing their downsides. But it takes a while. We are still not done doing this for fire or agriculture. We have only just begun to do this for the internet and phones. The transition often isn't pretty.
While the research team acknowledges just how disruptive — and even dangerous — automating science could be, they clearly think it'll be worth the cost. They end with a perspective on the AI Scientist that I assume is supposed to be optimistic: "It signals the dawn of a new era in which the process of discovery is no longer a solely human pursuit and in which the pace at which we are able to reap the harvest of scientific discovery could accelerate dramatically."
For what it's worth, I believe that they're right that human lives could be improved by accelerating certain kinds of science using AI — brute forcing the early stages of drug discovery, for instance, is absolutely something we should do. If you're sick, you don't care if your doctor can pull up 200 research papers about how a life-saving medicine works so long as she knows that it works.
But there's a two-letter word tucked into the last point that deserves unpacking: "we."
Who is this "we" that will reap the harvest of AI-driven scientific discovery? One assumes that the researchers behind this paper, most of whom work at an AI company aiming to monetize this tech, are confident that they'll be among the "we."
I do wonder about the rest of us.
Thanks for reading
There are many ways you can help:
- Subscribe, if you haven't already!
- Share this post on Bluesky, Twitter/X, LinkedIn, Facebook, or wherever else you hang out online.
- Become a patron for the price of 1 cappuccino per month
- Drop a few bucks in my tip jar
- Send recommendations for research to feature in my monthly paper roundups to elise@reviewertoo.com with the subject line "Paper Roundup Recommendation"
- Tell me about your research for a Q&A post (email enquiries to elise@reviewertoo.com)
- Follow me on Bluesky
- Spread the word!
