Sakana AI's AI Scientist-v2 paper passes peer review at ICLR workshop
AI passes the peer review process: a significant step in evaluating AI-generated research at scientific conferences
Isabella V · 12 March 2025

A fully AI-generated paper recently passed peer review at a workshop of one of the world’s leading machine learning conferences. The paper, produced by Sakana AI’s AI Scientist-v2 system, has sparked discussion about the quality of machine-generated research and its future impact on science. The experiment involved the University of British Columbia and the University of Oxford, further fueling the debate about the growing role of AI in science.

Key Points:

  • Peer Review Passed: For the first time, a fully AI-generated paper passed peer review at a workshop at a high-level AI conference.
  • Institutional Collaboration: The project involved a major collaboration with the University of British Columbia and the University of Oxford, supported by the ICLR leadership.
  • Controlled Experiment: The review was double-blind; reviewers did not know whether each manuscript was written by a human or an AI.
  • Future Challenges: Despite this success, none of the generated papers met the requirements for publication in the conference’s main track, and AI still has limitations in high-level scientific research.

At a major international machine learning conference, AI has reached a significant milestone. For the first time, a scientific paper entirely produced by an advanced AI system, called AI Scientist-v2, successfully passed the peer-review process in a dedicated workshop. The submission was part of an experiment designed to evaluate how AI-generated papers fare in academic peer review, sparking debate about the trustworthiness and quality of autonomously machine-generated research. The project received approval from the University of British Columbia’s Institutional Review Board (IRB) and was developed in collaboration with the University of Oxford and the ICLR conference leadership. The goal of the experiment was to understand how AI can be integrated into the traditional peer review process, challenging current conventions in scientific publishing.

The submission process followed a rigorous methodology. Three fully AI-generated papers were submitted to the ICLR workshop, and the reviewers did not know whether the papers in front of them were the work of a human author or an artificial system. The key feature of these papers is that every aspect, from the scientific proposal to the code for the experiments to the analysis and formatting of the manuscript, was generated autonomously by AI, without any direct human intervention. Human researchers supplied only a general topic; the system then developed and completed the project on its own.
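
For readers who want a concrete picture of what "end-to-end" means here, the sketch below outlines such a pipeline in Python. It is a minimal conceptual illustration under our own assumptions, not Sakana AI's actual AI Scientist-v2 code: every name is hypothetical, and the model calls are replaced by a placeholder stub.

```python
# Conceptual sketch of an end-to-end autonomous research pipeline of the kind
# the article describes: humans supply only a broad topic, and the system
# handles ideation, experiment code, analysis, and the write-up itself.
# All names here are hypothetical illustrations, NOT Sakana AI's actual
# AI Scientist-v2 API; the model calls are replaced by a placeholder stub.
from dataclasses import dataclass


def llm(prompt: str) -> str:
    """Placeholder standing in for a call to a large language model."""
    return f"<model output for: {prompt[:50]}...>"


@dataclass
class Paper:
    topic: str
    hypothesis: str = ""
    experiment_code: str = ""
    analysis: str = ""
    manuscript: str = ""


def generate_paper(topic: str) -> Paper:
    """Run every stage without human intervention beyond the topic."""
    paper = Paper(topic=topic)
    # 1. Scientific proposal: the system invents its own hypothesis.
    paper.hypothesis = llm(f"Propose a testable hypothesis about {topic}")
    # 2. Experiments: the system writes the code it will run.
    paper.experiment_code = llm(f"Write experiment code testing: {paper.hypothesis}")
    # 3. Analysis: the system interprets its own results.
    paper.analysis = llm(f"Analyze the results of: {paper.experiment_code}")
    # 4. Manuscript: the system formats a complete paper for submission.
    paper.manuscript = llm(f"Write a workshop paper from: {paper.analysis}")
    return paper


if __name__ == "__main__":
    print(generate_paper("regularization in deep networks").manuscript)
```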

Of the three papers submitted for review, one received ratings that placed it above the acceptance threshold, with reviewer scores between 6 and 7 against an average acceptance bar of roughly 6. Despite the positive score, the paper was withdrawn before official publication, in accordance with the rules established for the experiment. This was a necessary step to ensure the transparency of the process, preventing the publication of AI-generated work from creating confusion or misunderstanding in traditional scientific practice.
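
As a rough illustration of that acceptance arithmetic, the snippet below averages a set of reviewer scores and compares the mean against the threshold. The individual scores are assumed for the example, not taken from the official review record.

```python
# Illustrative acceptance arithmetic only; the individual reviewer scores
# below are assumed for the example, not the official review record.
scores = [6, 7, 6]            # hypothetical reviewer ratings in the 6-7 range
threshold = 6.0               # approximate average acceptance bar
average = sum(scores) / len(scores)
print(f"average = {average:.2f}, above threshold = {average > threshold}")
# average = 6.33, above threshold = True
```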

Assessing the quality of the papers highlighted several challenges. Although the results of the generated papers were considered valid, none of the three passed the internal criteria for acceptance into the main track of the ICLR conference, a milestone reserved for more mature and refined work. Acceptance rates for the main conference are in fact much lower than those for the workshops, where preliminary work is presented. Even so, the quality of the submitted papers has sparked discussion about the future of AI in science, with the expectation that subsequent versions of these systems could reach even higher standards.

Internal analysis of the three papers revealed some weaknesses, such as formatting issues and missing citations, but also highlighted the potential of AI to generate original scientific ideas and tackle complex machine learning problems. For now, research remains guided by human supervision, with AI acting as a support tool rather than a full replacement for researchers.

The peer review process also underscored the importance of transparency in publishing AI-generated papers: by prior agreement, the papers were not made public on the OpenReview forum. This approach ensured that AI was not perceived as a mechanism to “cheat” the scientific process, but rather as a tool to explore the future of research. Going forward, the scientific community will need to decide how to treat AI-generated papers, defining clear rules on how their status and role should be declared.

Overall, this experiment was an important test for the future of AI-generated science, but it also showed that much work remains. The challenges around the quality of the work and the reproducibility of the results highlight the need for further technological development. Still, AI, even at this early stage, has already demonstrated that it can produce valid scientific content, pushing research in new directions.

The future of science, with the increasingly significant contribution of AI, appears to be a field of continuous evolution and untapped potential.