TL;DR
DeepMind's AlphaProof Nexus pairs LLMs with formal verification in Lean to solve open Erdős problems at a cost of a few hundred dollars per verified proof, signaling a shift in AI-driven mathematical research.
Nine problems from Paul Erdős's open catalog just fell to a machine. Google DeepMind's AlphaProof Nexus, documented in an arXiv preprint published May 21, solved 9 of 353 open Erdős problems and proved 44 of 492 open conjectures from the On-Line Encyclopedia of Integer Sequences (OEIS) - problems that in some cases resisted human mathematicians for longer than most practitioners have been alive.
The cost per formally verified proof: a few hundred dollars.
AlphaProof Nexus pairs large language models with Lean, a formal proof assistant, to address a core failure mode in AI-generated mathematics. When the model proposes a proof, a verification layer checks every logical step against Lean's formal system. Invalid inferences get rejected outright. As Crypto Briefing reported, this architecture directly attacks the hallucination problem: the system cannot slip an invalid inference past the checker, regardless of how plausible it reads in natural language.
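To make "checks every logical step" concrete, here is a minimal Lean 4 sketch. It is not taken from the AlphaProof Nexus release; the theorem name `sum_swap` is made up for illustration, and only Lean's core library is assumed. Each `rw` line is one inference, and Lean accepts the proof only if every rewrite applies and the final goal closes.

```lean
-- Illustrative only: a two-step proof in which Lean checks each rewrite.
theorem sum_swap (a b c : Nat) : a + b + c = c + (b + a) := by
  rw [Nat.add_comm a b]        -- goal becomes: b + a + c = c + (b + a)
  rw [Nat.add_comm (b + a) c]  -- goal becomes: c + (b + a) = c + (b + a), closed by rfl
```

If either rewrite named a lemma that did not match the goal, Lean would report an error at that line instead of accepting the theorem.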
The results
The system uses what DeepMind calls "agentic loops" - the model generates a candidate proof, submits it to the formal checker, receives structured feedback on failure points, and iterates. It's iterative, feedback-driven search through proof space, not one-shot generation. Solved examples include Erdős catalog variants #125, #138, #741, and #12. Alongside the preprint, DeepMind released formal proofs and natural-language versions on GitHub.
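The generate-check-iterate cycle described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not DeepMind's implementation: `agentic_loop`, `CheckResult`, `toy_propose`, and `toy_check` are hypothetical names, and the "model" and "checker" are stand-in functions rather than a real LLM or a real Lean process.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    ok: bool
    feedback: str  # checker's error message; empty when the proof passes


def agentic_loop(
    propose: Callable[[str, str], str],
    check: Callable[[str], CheckResult],
    statement: str,
    max_attempts: int = 8,
) -> Optional[str]:
    """Propose, check, and retry with feedback until a proof passes or attempts run out."""
    feedback = ""
    for _ in range(max_attempts):
        candidate = propose(statement, feedback)
        result = check(candidate)
        if result.ok:
            return candidate
        feedback = result.feedback  # feed the failure back into the next proposal
    return None  # problem abandoned


# Hypothetical stand-ins for the model and the formal checker.
def toy_propose(statement: str, feedback: str) -> str:
    # First guess is wrong; after seeing an error, the "model" tries a different proof.
    return "Nat.add_comm a b" if "type mismatch" in feedback else "rfl"


def toy_check(candidate: str) -> CheckResult:
    if candidate == "Nat.add_comm a b":
        return CheckResult(True, "")
    return CheckResult(False, "type mismatch at step 1")


proof = agentic_loop(toy_propose, toy_check, "a + b = b + a")
print(proof)
```

The key design point is that the only way out of the loop with a result is through `check`; the proposer never gets to declare its own output correct.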
One nuance worth noting: a simpler baseline agent also solved 9 Erdős problems, matching the full system's count but at higher cost. The agentic loops' primary advantage may be economic efficiency rather than raw capability expansion at the frontier.
The 44 OEIS conjecture proofs add meaningful breadth. That catalog tracks several hundred thousand integer sequences, and its open conjectures represent accumulated unsolved problems in combinatorics and number theory spanning decades of community effort.
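For scale, the reported counts convert to solve rates as follows. The short Python sketch below uses only the figures quoted in this article; the helper name `solve_rate` is made up for illustration.

```python
# Counts as reported in the article; no other data is assumed.
results = {
    "Erdős problems": (9, 353),
    "OEIS conjectures": (44, 492),
}


def solve_rate(solved: int, attempted: int) -> float:
    """Return the percentage of attempted problems that were solved."""
    return 100.0 * solved / attempted


rates = {name: solve_rate(s, n) for name, (s, n) in results.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```

That works out to roughly 2.5% of the Erdős problems and 8.9% of the OEIS conjectures, which is why the difficulty of the solved problems matters more than the raw counts.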
What this means for practitioners
The deeper implication isn't about these specific problems. It's about what happens when artificial intelligence research produces outputs that can be formally verified rather than editorially assessed. Debates about powerful AI systems - documented by PBS NewsHour and others - often center on the opacity of model outputs. Formal proof systems solve that problem for mathematics by making every logical step inspectable and machine-checkable.
For ML engineers and applied scientists, the near-term question is whether architectures like this can reach mathematical claims embedded in research papers: convergence proofs, complexity bounds, lemma chains. That verification currently requires expert review time. At a few hundred dollars per proof, the economics could work for serious research groups. The Price Per Token tracker shows inference costs falling broadly across the model landscape; the same trajectory applied to formal verification could make this approach routine within a few years.
The unresolved question is difficulty distribution. Erdős problems range from graduate-student-accessible to decades-resistant. Without a breakdown of where the 9 solved problems fall on that spectrum, the headline number is exciting but incompletely characterized. As BetaKit noted when covering Cohere's open-source release, transparency of method matters as much as headline performance - and the GitHub release of formal proofs at least makes outputs auditable, even if the training process behind them is not.
Whether AlphaProof Nexus reached the hard end of the Erdős catalog will determine how significant this milestone actually is.
Frequently Asked Questions
What is AlphaProof Nexus?
Google DeepMind's system combining large language models with the Lean formal proof assistant. The model generates proof candidates; Lean verifies or rejects each logical step. Agentic loops let the system iterate on failures until a proof passes or the problem is abandoned.
What are Erdős problems?
Open mathematical conjectures posed by Hungarian mathematician Paul Erdős, cataloged in a list of 353 unsolved problems spanning combinatorics and number theory. Many carried cash prizes offered by Erdős himself, some of which remain unclaimed decades after his death.
How does formal verification prevent AI hallucination in mathematics?
Lean checks every logical inference against a formal type-theoretic foundation. The checker rejects any step that doesn't follow from the axioms and previously proved results, regardless of how plausible the surrounding argument reads in natural language - eliminating the gap between "sounds right" and "is right."
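A small Lean 4 sketch of that gap, written for illustration and not drawn from the AlphaProof Nexus proofs: in Lean a proposition is a type and a proof is a term of that type, so a step is accepted only when the types line up.

```lean
-- Valid: modus ponens. From a proof of `p` and a proof of `p → q`, build a proof of `q`.
example (p q : Prop) (hp : p) (hpq : p → q) : q :=
  hpq hp

-- Invalid: "affirming the consequent". It can sound reasonable in prose,
-- but `hq` has type `q`, not `p`, so Lean reports a type mismatch if this is uncommented.
-- example (p q : Prop) (hq : q) (hpq : p → q) : p :=
--   hq
```

The second example is left commented out so the file still compiles; uncommenting it shows the rejection directly.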
What does this mean for AI-assisted research today?
If the cost structure holds, teams could apply similar architectures to verify mathematical claims in ML papers - convergence arguments, complexity bounds, existence proofs - without requiring manual specialist review of every step. A few hundred dollars per formally verified theorem is a new price point for the field.
About the Author
Guilherme A.
Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.