AIResearch

AI Masters Mathematical Reasoning Without Supervision

A new self-supervised method enables AI to predict and generate mathematical proofs, challenging the need for labeled data in complex reasoning tasks.

AI Research
November 13, 2025
3 min read

Artificial intelligence systems have long struggled with advanced mathematical reasoning, typically requiring extensive labeled datasets to learn. A recent study introduces a self-supervised training approach that allows AI models to develop reasoning capabilities using only unlabeled mathematical expressions, potentially accelerating progress in automated theorem proving and scientific discovery. This breakthrough could reduce reliance on human-curated data, making AI more adaptable to complex logical tasks.

The key finding is that AI models can acquire mathematical reasoning skills through a method called skip-tree training, which involves predicting missing parts of mathematical expressions in a tree-structured format. This approach, as detailed in the paper, enables models to handle tasks like type inference, equality prediction, and generating new conjectures without explicit supervision. For instance, the model achieved a 96.23% success rate in type inference tasks on certain datasets, demonstrating its ability to infer logical structures accurately.

Methodologically, the researchers used a Transformer architecture trained on a corpus of 29,465 proofs from the HOList dataset, which spans areas like topology and real analysis. Instead of relying on labeled examples, the model learned by predicting masked subexpressions in mathematical expression trees that had been converted into token sequences. This process, illustrated in Figure 2 of the paper, involves replacing selected subexpressions with a special placeholder token and training the model to reconstruct them, fostering an understanding of mathematical syntax and semantics without human guidance.
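To make the masking step concrete, here is a minimal sketch of how a skip-tree training example might be constructed. All names, the nested-tuple tree encoding, and the `<MASK>` token are illustrative assumptions, not the paper's actual implementation:

```python
import random

def linearize(tree):
    """Flatten an expression tree (nested tuples) into a token sequence
    via pre-order traversal, with parentheses marking subtree boundaries."""
    if isinstance(tree, str):
        return [tree]
    tokens = ["("]
    for child in tree:
        tokens += linearize(child)
    tokens.append(")")
    return tokens

def subtrees(tree, path=()):
    """Enumerate every subtree together with its position in the tree."""
    yield path, tree
    if not isinstance(tree, str):
        for i, child in enumerate(tree):
            yield from subtrees(child, path + (i,))

def mask_subtree(tree, path):
    """Return a copy of the tree with the subtree at `path` replaced
    by a <MASK> placeholder token."""
    if not path:
        return "<MASK>"
    i, rest = path[0], path[1:]
    return tuple(mask_subtree(c, rest) if j == i else c
                 for j, c in enumerate(tree))

def make_training_example(tree, rng=random):
    """Build one (input, target) pair: the model sees the expression with
    one subtree masked out and must reconstruct the missing subtree."""
    candidates = [(p, t) for p, t in subtrees(tree) if p]  # never mask the root
    path, target = rng.choice(candidates)
    masked = mask_subtree(tree, path)
    return linearize(masked), linearize(target)

# Example: APPEND (REVERSE l) m, written as a nested expression
expr = ("APPEND", ("REVERSE", "l"), "m")
inp, tgt = make_training_example(expr, rng=random.Random(0))
print(inp)
print(tgt)
```

Because the mask always covers one complete subtree, splicing the target tokens back into the input at the mask position reproduces the original expression exactly, which is what lets the model learn well-formed syntax for free.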

Results analysis shows that the skip-tree method outperformed alternatives like skip-sequence training, with higher accuracy in tasks such as hard inference, where models had to derive conclusions from premises. For example, in equality prediction tasks, the model correctly identified relationships like 'APPEND (REVERSE l) m = REVERSE (APPEND m l)' in some cases, though it struggled with more complex expressions. The paper reports that models generated new, provable mathematical statements, with up to 32.41% of outputs being both provable and novel, indicating an ability to create valid mathematical insights beyond memorization.
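One way to see why skip-tree masking might outperform skip-sequence masking is that an arbitrary contiguous token span can cut a subexpression in half, leaving the model to predict a syntax fragment, whereas a subtree mask always yields a well-formed target. The sketch below illustrates this contrast; the token encoding is an illustrative assumption, not the paper's exact format:

```python
def balanced(span):
    """A token span is a syntactically complete fragment iff its
    parentheses balance and the depth never goes negative."""
    depth = 0
    for tok in span:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

# Pre-order tokenization of APPEND (REVERSE l) m
tokens = ["(", "APPEND", "(", "REVERSE", "l", ")", "m", ")"]

# Skip-sequence: mask a fixed-length span, which can split a subtree.
seq_span = tokens[2:4]    # ["(", "REVERSE"]
# Skip-tree: mask exactly the tokens of one complete subtree.
tree_span = tokens[2:6]   # ["(", "REVERSE", "l", ")"]

print(balanced(seq_span))   # False: target is an incomplete fragment
print(balanced(tree_span))  # True: target is a well-formed subexpression
```

Training targets that are always complete subexpressions plausibly make it easier for the model to internalize the grammar of the logic, which is consistent with the accuracy gap the paper reports.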

In a broader context, this advancement matters because it could lead to AI tools that assist in scientific research, engineering, and education by automating parts of logical reasoning. For regular readers, this means faster development of reliable AI systems for tasks like software verification or data analysis, where accuracy is critical. However, the study notes limitations, such as the model's occasional production of incorrect or untypeable expressions and its reliance on specific mathematical domains, leaving open questions about generalization to other fields or real-world applications without further refinement.

Original Source

Read the complete research paper

View on arXiv

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.

Connect on LinkedIn