AI Solves Unsolved Reasoning Problems

A new automated code generation system cracked 22 previously unsolved abstract reasoning challenges, demonstrating AI's growing ability to tackle complex problems without human guidance.

AI Research
November 14, 2025

Researchers have built a system that automatically writes code to solve abstract reasoning problems that had previously gone unsolved. The result matters because it suggests AI can discover solutions to complex challenges on its own, moving beyond recognizing patterns to constructing new approaches. The system, called Jazz, solved 22 previously unsolved problems from the Abstraction and Reasoning Corpus (ARC), one of the toughest benchmarks for measuring general reasoning ability in AI.

The key finding is that automated code generation can tackle abstract reasoning problems by searching through possible solutions without human intervention. The researchers found that their system could automatically write code snippets that solved ARC problems requiring pattern recognition and transformation. These weren't just variations of known solutions: the system found completely new approaches to problems that had stumped both humans and other AI systems.
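To make "a code snippet that solves an ARC problem" concrete, here is a minimal sketch in Python. It is not the researchers' code: the grid representation, the two primitives, and the names `run_program` and `solves` are illustrative assumptions. A candidate program is just a sequence of primitives, and it counts as a solution only if it reproduces every example output exactly.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a grid is a list of rows, each cell an integer color code.
Grid = List[List[int]]
Primitive = Callable[[Grid], Grid]


def flip_horizontal(grid: Grid) -> Grid:
    """Hypothetical primitive: mirror each row left to right."""
    return [list(reversed(row)) for row in grid]


def transpose(grid: Grid) -> Grid:
    """Hypothetical primitive: swap rows and columns."""
    return [list(col) for col in zip(*grid)]


def run_program(program: Sequence[Primitive], grid: Grid) -> Grid:
    """Apply each primitive in order to the grid."""
    for step in program:
        grid = step(grid)
    return grid


def solves(program: Sequence[Primitive],
           examples: Sequence[Tuple[Grid, Grid]]) -> bool:
    """A program solves a task if it maps every input to its expected output."""
    return all(run_program(program, inp) == out for inp, out in examples)


# Toy task (invented for illustration): rotate the grid 90 degrees clockwise.
examples = [
    ([[1, 2], [3, 4]], [[3, 1], [4, 2]]),
    ([[5, 6, 7]], [[5], [6], [7]]),
]

candidate = [transpose, flip_horizontal]
print(solves(candidate, examples))          # True
print(solves([flip_horizontal], examples))  # False
```

The all-or-nothing check is the point: there is no partial credit for a grid that is almost right, which is part of what makes searching for such programs hard.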

The methodology used a formal framework that treats code generation as a search problem. Think of it like a chef trying to create a new recipe by systematically combining ingredients rather than following existing cookbooks. The system uses Monte-Carlo Tree Search, an algorithm that explores possible code combinations while balancing between trying new approaches and sticking with what works. It builds code from basic operations called primitives, gradually assembling them into working programs through trial and error.
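The search idea can be sketched in a few dozen lines of Python. This is a simplified illustration under stated assumptions, not the Jazz implementation: the three primitives, the depth limit `MAX_DEPTH`, the fraction-of-examples reward, and the UCB1 selection rule with constant `c=1.4` are all choices made here for the example. The article only says that the system balances exploration and exploitation while assembling primitives.

```python
import math
import random

# Assumption: three toy grid primitives; the real system reportedly uses about 150.
PRIMITIVES = {
    "flip_h": lambda g: [row[::-1] for row in g],
    "flip_v": lambda g: g[::-1],
    "transpose": lambda g: [list(col) for col in zip(*g)],
}
MAX_DEPTH = 3  # hypothetical cap on program length


def run(program, grid):
    """Apply the named primitives in order."""
    for name in program:
        grid = PRIMITIVES[name](grid)
    return grid


def reward(program, examples):
    """Fraction of example pairs the program reproduces exactly."""
    return sum(run(program, i) == o for i, o in examples) / len(examples)


class Node:
    def __init__(self, program, parent=None):
        self.program, self.parent = program, parent
        self.children = {}
        self.visits, self.total = 0, 0.0

    def ucb1(self, c=1.4):
        # Average reward (what works) plus a bonus for rarely tried nodes.
        if self.visits == 0:
            return math.inf
        explore = math.sqrt(math.log(self.parent.visits) / self.visits)
        return self.total / self.visits + c * explore


def search(examples, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(())
    best = ((), reward((), examples))
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes by UCB1.
        while len(node.children) == len(PRIMITIVES) and len(node.program) < MAX_DEPTH:
            node = max(node.children.values(), key=Node.ucb1)
        # 2. Expansion: add one untried primitive as a new child.
        if len(node.program) < MAX_DEPTH:
            name = rng.choice([p for p in PRIMITIVES if p not in node.children])
            node.children[name] = Node(node.program + (name,), node)
            node = node.children[name]
        # 3. Rollout: extend randomly, remembering the best prefix seen.
        program = node.program
        value, best_here = reward(program, examples), program
        while len(program) < MAX_DEPTH:
            program = program + (rng.choice(list(PRIMITIVES)),)
            r = reward(program, examples)
            if r > value:
                value, best_here = r, program
        if value > best[1]:
            best = (best_here, value)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += value
            node = node.parent
        if best[1] == 1.0:
            break  # every example is reproduced; stop searching
    return best


# Toy task (invented for illustration): rotate 90 degrees clockwise.
rotate_examples = [([[1, 2], [3, 4]], [[3, 1], [4, 2]])]
program, score = search(rotate_examples)
print(program, score)
```

With only three primitives the search space is tiny, so one would expect the search to land on a two-step program such as `transpose` followed by `flip_h`. With around 150 primitives the number of candidate programs grows very quickly with program length, which is why a guided search is used instead of brute-force enumeration.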

The results are notable given the limited resources involved. In experiments totaling 70 hours of runtime, the system solved 56 problems across multiple runs, 22 of which were distinct problems that had not been solved before. The researchers note that this was achieved with a relatively small set of 150 basic operations and a single-threaded Python implementation, suggesting room for significant improvement with better hardware and optimization. With its current capabilities, the system was shown to be able to solve at least 131 of 800 test problems (about 16%).

This breakthrough has real-world implications for how we develop AI systems. Instead of requiring programmers to write code for every new problem, this approach could let AI systems automatically generate solutions for data analysis, scientific discovery, and complex decision-making tasks. It's like having an AI research assistant that can not only follow instructions but actually invent new methods. The technology could eventually help automate software development, scientific research, and problem-solving in fields where human expertise is scarce.

However, the system has limitations. It currently requires careful design of the basic operations available to it, and performance depends heavily on the quality of these building blocks. The search process can be computationally expensive, and the system doesn't always find the most elegant or efficient solutions. The researchers acknowledge that extending the system to new domains requires manual work to define appropriate operations and evaluation functions. There's also the challenge of scaling the approach to more complex problems without becoming prohibitively slow.

Original Source

Read the complete research paper

View on arXiv

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.