AI Scientists Now Handle Entire Research Workflows

AI scientists now handle entire research workflows, from idea to publication, speeding up discoveries. This shift could transform science by automating complex processes with minimal human help.

AI Research
November 14, 2025

Artificial intelligence systems can now conduct complete scientific research cycles—from literature review to paper publication—with minimal human intervention. This emerging field of 'AI Scientists' represents a fundamental shift from AI as a research tool to AI as an independent researcher, potentially accelerating discovery across multiple scientific domains.

The study's authors identified a clear evolutionary pattern in AI Scientist development by analyzing dozens of systems released between 2022 and late 2025. The field has progressed through three distinct phases: Foundational Modules (2022-2023), where systems focused on automating individual scientific tasks; Closed-Loop Integration (2024), where systems began connecting multiple stages into continuous workflows; and the current Frontier phase (2025-present), characterized by efforts toward scalability, impact, and human-AI collaboration.

The study introduces a unified six-stage framework that breaks the scientific process into Literature Review, Idea Generation, Experimental Preparation, Experimental Execution, Scientific Writing, and Paper Generation. This framework provides a common vocabulary for comparing different AI Scientist architectures and capabilities. According to the study's capability matrix, early systems like DS-1000 (2023) focused primarily on experimental preparation, while recent systems like DeepScientist (2025) demonstrate capabilities across all six stages.
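One way to picture such a capability matrix is as a mapping from each system to the set of stages it covers. The sketch below is illustrative only: the Stage enum and the helper functions are hypothetical names, and the two entries encode only what this article states about DS-1000 and DeepScientist, not the study's full matrix.

```python
from enum import Enum


class Stage(Enum):
    """The six stages of the framework, in workflow order."""
    LITERATURE_REVIEW = 1
    IDEA_GENERATION = 2
    EXPERIMENTAL_PREPARATION = 3
    EXPERIMENTAL_EXECUTION = 4
    SCIENTIFIC_WRITING = 5
    PAPER_GENERATION = 6


# Assumption: only the two data points mentioned in the article are encoded.
CAPABILITIES = {
    "DS-1000": {Stage.EXPERIMENTAL_PREPARATION},
    "DeepScientist": set(Stage),
}


def coverage(system: str) -> float:
    """Return the fraction of the six stages that a system covers."""
    return len(CAPABILITIES[system]) / len(Stage)


def missing_stages(system: str) -> list:
    """Return the names of the stages a system does not cover, in order."""
    return [s.name for s in Stage if s not in CAPABILITIES[system]]
```

Representing coverage as sets makes comparisons between systems a matter of ordinary set operations, which is roughly what a shared vocabulary of stages buys.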

Modern AI Scientists operate through integrated workflows where each stage feeds into the next. The literature review stage transforms research papers into structured, machine-interpretable knowledge. The idea generation stage then uses this knowledge to formulate testable hypotheses. Experimental preparation translates these hypotheses into executable plans, while experimental execution involves running actual or simulated experiments. The final stages handle scientific writing and paper generation, producing publication-ready manuscripts complete with figures, tables, and citations.
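The idea that each stage feeds into the next can be sketched as a simple chain of functions over a shared state. This is a minimal illustration under stated assumptions, not how any particular system is built: ResearchState, make_stage, and run_pipeline are hypothetical names, and each stage is a placeholder that only records which earlier output it consumed.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

STAGE_NAMES = [
    "literature_review",
    "idea_generation",
    "experimental_preparation",
    "experimental_execution",
    "scientific_writing",
    "paper_generation",
]


@dataclass
class ResearchState:
    """Hypothetical container for the artifacts each stage produces."""
    topic: str
    artifacts: dict = field(default_factory=dict)
    log: list = field(default_factory=list)


def make_stage(name: str, needs: Optional[str]) -> Callable[[ResearchState], ResearchState]:
    """Build a placeholder stage that consumes the previous stage's artifact."""
    def run(state: ResearchState) -> ResearchState:
        if needs is not None and needs not in state.artifacts:
            raise ValueError(f"{name} requires the output of {needs}")
        # The first stage has no predecessor and starts from the topic itself.
        source = state.topic if needs is None else state.artifacts[needs]
        state.artifacts[name] = f"{name}({source})"
        state.log.append(name)
        return state
    return run


# Each stage depends on the one before it; the first depends on nothing.
PIPELINE = [
    make_stage(name, prev)
    for prev, name in zip([None] + STAGE_NAMES[:-1], STAGE_NAMES)
]


def run_pipeline(topic: str) -> ResearchState:
    """Run all six placeholder stages in order on a fresh state."""
    state = ResearchState(topic)
    for stage in PIPELINE:
        state = stage(state)
    return state
```

Calling run_pipeline("protein folding") returns a state whose log lists the six stages in order. In a real system each placeholder would wrap a model call, a tool, or a lab instrument, but the dependency check shows why a failure early in the chain blocks everything downstream.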

Key systems demonstrate the field's progression. AI-Scientist v1 (2024) was among the first to demonstrate a fully autonomous research loop, while AI-Scientist v2 (2025) improved on this with tree-search planning that allows dynamic exploration of multiple research paths. Domain-specific systems like Coscientist (2023) in chemistry and BioPlanner (2023) in biology show how these approaches apply to concrete scientific problems. The analysis reveals that current systems are pursuing three parallel thrusts: scalability through reinforcement learning in web environments, impact through goal-oriented research that aims to surpass state-of-the-art results, and collaboration through frameworks that enable continuous human-AI partnership.
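To give a feel for what exploring multiple research paths through tree search can mean, here is a generic best-first search over a toy tree of plans. It is not AI-Scientist v2's actual algorithm, which the article does not describe. The function names, the string-based plans, and the scoring rule are all assumptions made for illustration.

```python
import heapq
from typing import Callable, Iterable


def best_first_search(
    root: str,
    expand: Callable[[str], Iterable[str]],
    score: Callable[[str], float],
    budget: int,
) -> str:
    """Expand up to `budget` nodes, always taking the highest-scoring one next."""
    # heapq is a min-heap, so scores are negated; the counter breaks ties.
    counter = 0
    frontier = [(-score(root), counter, root)]
    best, best_score = root, score(root)
    expanded = 0
    while frontier and expanded < budget:
        neg_score, _, node = heapq.heappop(frontier)
        expanded += 1
        if -neg_score > best_score:
            best, best_score = node, -neg_score
        for child in expand(node):
            counter += 1
            heapq.heappush(frontier, (-score(child), counter, child))
    return best


# Toy problem (hypothetical): a plan is a string of steps "a" or "b",
# at most three steps long, and plans with more "b" steps score higher.
def toy_expand(plan: str) -> list:
    return [] if len(plan) >= 3 else [plan + "a", plan + "b"]


def toy_score(plan: str) -> int:
    return plan.count("b")
```

With a small budget the search commits to the most promising branch first, and with a larger budget it returns to the alternatives it set aside. That ability to revisit deferred branches is the main thing a tree-structured planner adds over a single linear pipeline.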

The practical implications for scientific research are significant. Systems like DeepResearcher use reinforcement learning to handle noisy, real-world information and improve over time. Others, like Freephdlabor, structure the research process around continual human partnership, where researchers guide, customize, and collaborate with AI systems as personalized team members. This suggests a future where AI Scientists amplify the creativity and scale of human research while preserving the essential human role in scientific direction and validation.

However, the study identifies several limitations that current systems face. Reproducibility remains challenging in complex multi-stage workflows where minor variations can lead to divergent outcomes. Systems often produce outputs with confidence that masks underlying uncertainties and alternative hypotheses. Cross-domain generalization is limited, with systems performing well in structured domains like chemistry but struggling in fields with less standardized procedures. These limitations point to the need for reproducibility-by-design architectures, better uncertainty reasoning, and more modular, composable systems that can transfer capabilities across domains.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
