Anthropic's Fable 5 hits 80.3% on SWE-bench Pro coding tasks

TL;DR

Claude Fable 5 achieves 80.3% on SWE-bench Pro, outperforming Opus 4.8 by 11 points and GPT-4 by over 20, with a landmark Stripe codebase migration as proof of concept.

Anthropic's Fable 5 scored 80.3% on SWE-bench Pro, clearing Claude Opus 4.8 by eleven percentage points and GPT-4 by more than twenty. SWE-bench Pro tests models on real GitHub issues that require actual code changes, not synthetic puzzles, which makes that margin harder to dismiss as benchmark overfitting.
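The margins quoted above can be checked against the per-model scores given in the FAQ (80.3%, 69.2%, and 58.6%). The following is a minimal sketch of that arithmetic; the dictionary layout and the function name `margin_points` are illustrative choices, not anything published by Anthropic.

```python
# Scores as reported in this article's FAQ (percent of SWE-bench Pro tasks resolved).
SCORES = {
    "Claude Fable 5": 80.3,
    "Claude Opus 4.8": 69.2,
    "GPT-4": 58.6,
}


def margin_points(scores: dict[str, float], leader: str) -> dict[str, float]:
    """Return the leader's lead over every other model, in percentage points."""
    lead_score = scores[leader]
    return {
        name: round(lead_score - score, 1)
        for name, score in scores.items()
        if name != leader
    }


if __name__ == "__main__":
    for name, gap in margin_points(SCORES, "Claude Fable 5").items():
        print(f"Fable 5 leads {name} by {gap} points")
```

These are differences in percentage points, not relative improvements, which matches how the article states the gap.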

Stripe ran the most visible early proof of concept. The company handed Fable 5 a 50-million-line Ruby codebase and asked for a migration. According to Forbes, the model completed the job in a day, work that would have taken a human team roughly two months. What changed is not raw throughput but something more structural: the model holds an entire project in context, plans across many dependent steps, runs for hours without a human checkpoint, and catches its own errors mid-task.

The architecture of the release

Anthropic is shipping two distinct products from the Mythos tier simultaneously. Claude Fable 5, the publicly accessible version, carries conservative safety guardrails that automatically route sensitive queries about cybersecurity, chemistry, and biology to Claude Opus 4.8 instead. Anthropic expects that fallback to trigger in fewer than five percent of sessions on average, which should be negligible for most enterprise workloads.

Claude Mythos 5 uses the same core model with some of those restrictions lifted. It is currently limited to a small group of vetted cyberdefenders and infrastructure operators enrolled in Project Glasswing. According to MacRumors, Apple is already a Glasswing partner. Anthropic describes Mythos 5 as carrying the strongest cybersecurity capabilities of any publicly disclosed model, a claim consistent with the SWE-bench gap but not yet independently replicated.

Scientific performance claims

Beyond code, Anthropic reports that Mythos 5 delivered a 10x acceleration in portions of a drug-design pipeline. In a separate evaluation, the model ran a week-long autonomous genomics research task and reportedly outperformed a recently published specialist model while using a hundred times fewer parameters. On open-ended hypothesis generation, researchers preferred Mythos 5 output roughly 80% of the time in blind comparisons, according to Yahoo Finance.

All of these figures come from Anthropic's own reporting. No third-party replication has surfaced yet, so treat them as directional benchmarks rather than settled results. Still, the pattern is consistent with the SWE-bench number: performance compounding on long, multi-step tasks rather than single-turn prompts.

Pricing and the June 23 cliff

Fable 5 is priced at $10 per million input tokens and $50 per million output tokens, less than half the cost of the Mythos Preview it effectively replaces. The May 2026 Ramp AI Index, compiled from corporate card and invoice data across more than 50,000 US companies, already shows Anthropic at 34.4% enterprise adoption versus OpenAI at 32.3%, suggesting the commercial momentum predates this launch.
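For budgeting, the published rates translate into a simple linear cost estimate. The sketch below assumes only the two per-million-token prices quoted in this article; the function name `estimate_cost_usd` and the example workload are hypothetical, and real invoices may differ if other billing rules apply.

```python
# Per-million-token rates quoted in this article (USD).
INPUT_USD_PER_MILLION = 10.0
OUTPUT_USD_PER_MILLION = 50.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the charge in USD for a workload at the quoted rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 400k output tokens.
    print(f"${estimate_cost_usd(2_000_000, 400_000):.2f}")
```

Because output tokens cost five times as much as input tokens, long-running agentic jobs that generate a lot of code or text will be dominated by the output side of this estimate.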

Current subscribers on Pro, Max, Team, and Enterprise plans get Fable 5 included through June 22. Starting June 23, access shifts to usage credits while Anthropic scales infrastructure, with no public timeline for when plan bundling resumes. Teams planning production deployments should also note the mandatory 30-day data retention requirement attached to all Mythos-class usage, which exists for safety monitoring and carries compliance weight in regulated sectors.

What practitioners should watch

Fable 5 is designed for the workloads where prior generations hit walls: long-horizon engineering tasks, multi-document contract review, scientific literature synthesis, and financial audits spanning thousands of pages. For most applied AI teams, the capability case is straightforward given the benchmark and case-study evidence on hand.

Safety guardrails are the friction point. Coverage aggregated by Price Per Token notes that cybersecurity researchers are already pushing back against the current restrictions. Queries touching cybersecurity, chemistry, and biology fall back to Opus 4.8, and the less restricted Mythos 5 tier is not openly accessible. Anthropic's position is that broad access with guardrails produces more aggregate benefit than narrow access without them. That argument will face sustained pressure from security practitioners who need full capability as a baseline for doing their jobs.

If the autonomous-research benchmarks hold up to independent review, the conversation shifts from whether this model tier changes knowledge-intensive work to how quickly organizations can restructure workflows around it. The harder problem may not be the model at all.

FAQ

What is SWE-bench Pro and why does it matter?
SWE-bench Pro evaluates models on real GitHub issues that require actual code changes to resolve. Fable 5 scored 80.3%, versus 69.2% for Claude Opus 4.8 and 58.6% for GPT-4, with the gap widening on longer problems.

What is the difference between Claude Fable 5 and Claude Mythos 5?
Both run the same underlying model. Fable 5 is publicly available with guardrails that route sensitive queries to Opus 4.8. Mythos 5 has fewer restrictions but is limited to vetted Project Glasswing partners, currently cyberdefenders and infrastructure operators.

How is Claude Fable 5 priced and who gets access?
Pricing is $10 per million input tokens and $50 per million output tokens. It is included in Pro, Max, Team, and Enterprise plans through June 22, 2026, after which usage credits apply until capacity expands.

What safeguards does Fable 5 have for sensitive topics?
Cybersecurity, biology, and chemistry queries automatically fall back to Opus 4.8. Anthropic says this activates in under 5% of sessions. All Mythos-class enterprise usage also carries a mandatory 30-day data retention requirement for safety monitoring.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
