OpenAI Cracks 1946 Erdős Conjecture, Reshapes Math AI Race

OpenAI solved an 80-year-old Erdős conjecture, sending Anthropic's Math AI leadership odds from 77% to 24% in a week and reshaping frontier model competition.

A mathematical conjecture that resisted proof for eighty years fell last week to an AI system. OpenAI's general-purpose reasoning model independently solved the Erdős planar unit distance problem, a combinatorics question posed in 1946 that concerns how often the same distance can appear among points arranged on a plane. The model produced the solution without domain-specific fine-tuning, according to information shared by OpenAI on social media.

The market read the signal quickly. Data from Crypto Briefing shows that prediction market odds for Anthropic holding the title of best Math AI model by May 2026 fell from 77% to 24% within a single week. Simultaneously, the probability of Anthropic finishing second climbed from 82% to 97%, suggesting traders see this as a reordering of rank, not an elimination from the race.

The benchmark shift

For practitioners tracking capability through leaderboard scores, a solved open conjecture is a different category of evidence. Benchmarks like GSM8K and MATH measure performance on problems with known solutions, where a model can score well by retrieving or approximating patterns from training data. The planar unit distance problem was genuinely open: no verified proof existed before last week. Producing a correct argument required constructing something novel, which is much harder to attribute to memorization.

That distinction carries practical weight for applied researchers and engineers. If you are building systems that reason about novel configurations in combinatorial optimization, formal verification, or constraint satisfaction, a model capable of original mathematical inference is qualitatively more valuable than one that tops a fixed test suite. LLM Stats currently indexes over 300 model releases, but evaluations of genuine research-level reasoning remain rare enough that a result like this lands with real weight.

Anthropic's parallel track

While OpenAI captured the math headline, Anthropic has been running a quieter race. In April, PBS NewsHour reported that the company began limited access testing of Mythos, a model it describes as potentially disruptive enough to warrant withholding from public release. More than 40 technology companies, including direct competitors, received access to probe the system for vulnerabilities. The capability drawing the most concern is Mythos's ability to identify exploitable software bugs at a speed and scale that outpaces what a human security researcher can accomplish in a full working day.

These two developments pull in different directions. OpenAI's claim is about mathematical generality emerging from a broad reasoning model. Anthropic's claim is about depth in a narrow, high-stakes domain where general release carries measurable risk. The prediction market data reflects the math side of that split: traders treat Anthropic as near-certain to place second in Math AI, while the top position now looks like OpenAI's to defend.

What it means for practitioners

Neither result is directly deployable today. The Erdős breakthrough has not shipped as a product feature, and Mythos remains in controlled testing with a restricted partner group. What both results do is signal where frontier labs are concentrating their reasoning investments, which matters for any team making API strategy decisions or planning model selection over the next two to four quarters.

Model release velocity makes that planning harder. Price Per Token shows GPT-5.5, DeepSeek V4 Pro, Grok 4.3, and Gemini 3.5 Flash all arriving within weeks of each other in late April and early May 2026. In a market moving at this pace, a single research result can shift competitive positioning sharply. The 53-percentage-point drop in Anthropic's prediction market odds, from 77% to 24% within seven days, is the clearest measure of that. Teams evaluating AI platforms for math-heavy workloads have a cleaner directional signal now than they did a week ago.
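The size of the move quoted above can be read two ways: as a change in percentage points or as a relative change in the odds. The short Python sketch below works through both for the figures cited in this article. The function names are illustrative assumptions and are not taken from any prediction market API.

```python
def point_change(before_pct, after_pct):
    """Change in percentage points between two probabilities given in percent."""
    return after_pct - before_pct


def relative_change(before_pct, after_pct):
    """Change relative to the starting probability, as a fraction."""
    return (after_pct - before_pct) / before_pct


# Figures quoted in the article (Anthropic, Math AI market).
moves = {
    "first place": (77, 24),
    "second place": (82, 97),
}

for label, (before, after) in moves.items():
    points = point_change(before, after)
    relative = relative_change(before, after) * 100
    print(f"{label}: {points:+d} points, {relative:+.1f}% relative")
```

The first-place odds fall by 53 points, which is roughly a two-thirds relative decline, while the second-place odds rise by 15 points.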

Whether solving an Erdős conjecture generalizes to the constrained, noisy, real-world reasoning that practitioners encounter in production is still an open question. That is the proof worth watching for.

FAQ

Q: What is the Erdős planar unit distance problem?
A: A combinatorics problem posed by Paul Erdős in 1946 asking how many times the same distance can occur between pairs of points in a set of n points on a plane. It had remained unresolved until OpenAI's model produced an independent solution last week.
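To make the question concrete, here is a minimal Python sketch that counts repeated distances in a small point set by brute force. It only illustrates the problem statement and says nothing about how OpenAI's model reached its result. The function names and the two example point sets are assumptions chosen for illustration, and the sketch uses integer coordinates with squared distances so that comparisons stay exact.

```python
from collections import Counter
from itertools import combinations


def squared_distance(p, q):
    """Squared Euclidean distance between two points with integer coordinates."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def count_distance_pairs(points, target_sq=1):
    """Count unordered pairs of points whose squared distance equals target_sq."""
    return sum(
        1 for p, q in combinations(points, 2)
        if squared_distance(p, q) == target_sq
    )


def most_repeated_distance(points):
    """Return (squared distance, number of pairs) for the most frequent distance."""
    counts = Counter(squared_distance(p, q) for p, q in combinations(points, 2))
    return counts.most_common(1)[0]


# Hypothetical examples: a unit square and a 3-by-3 integer grid.
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
grid = [(x, y) for x in range(3) for y in range(3)]

print(count_distance_pairs(square))   # four sides of the square
print(count_distance_pairs(grid))     # horizontal and vertical neighbours
print(most_repeated_distance(grid))
```

On the 3-by-3 grid the unit distance occurs 12 times among the 36 pairs of points. The 1946 question is about how quickly the largest possible count can grow as n increases, which enumeration on small examples cannot settle.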

Q: How do prediction markets measure AI model quality?
A: They aggregate crowd estimates into continuously updated probabilities. The Math AI market tracked by Crypto Briefing prices the likelihood that a given lab holds the top position by a set date, adjusting in real time as new evidence arrives.

Q: What is Anthropic's Mythos model?
A: Mythos is a frontier model in limited access testing. Anthropic is sharing it with select partners rather than releasing it publicly, citing concerns about its capacity to accelerate offensive security research at scale.

Q: How does a solved conjecture differ from a benchmark result?
A: Benchmarks test known problems with ground-truth answers, so high scores can reflect memorization or pattern-matching. An open conjecture has no correct answer in any training corpus, so solving one is a stronger signal of generative reasoning than of retrieval.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
