
Hexo Labs Releases SIA, an Open-Source Self-Improving Agent

Hexo Labs releases SIA, an open-source self-improving AI agent claiming 350x benchmark gains, raising questions about methodology and independent replication.

Hexo Labs, a Palo Alto research startup founded three years ago by serial entrepreneurs Kunal Bhatia and Vignesh Baskaran, released an open-source agent Thursday that it claims can improve its own capabilities without human guidance. The system, called SIA for Self-Improving AI, posted a 350-fold speed gain on a benchmark created by OpenAI, according to the company.

The core claim cuts against how most artificial intelligence development works today. Current systems, however capable, still require domain experts to interpret experimental results, select the next hypothesis to test, and decide when to change direction. SIA is designed to handle that loop autonomously: generate hypotheses, run experiments, evaluate outcomes, update strategy, and repeat. According to Yahoo Finance, Hexo frames this as removing the human bottleneck from the path to superintelligence.
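The generate, run, evaluate, update cycle described above can be illustrated with a toy sketch. This is not Hexo's code and makes no claim about how SIA is built: the function names, the single-number "hypothesis", and the scoring function are all assumptions chosen only to show the shape of an autonomous loop.

```python
import random


def evaluate(candidate: float) -> float:
    """Hypothetical experiment: the score peaks at 0.0 when candidate == 3.0."""
    return -(candidate - 3.0) ** 2


def improvement_loop(cycles: int = 200, seed: int = 0) -> tuple[float, float]:
    """Toy loop: propose a hypothesis, test it, keep it only if it scores better."""
    rng = random.Random(seed)
    best = 0.0                      # starting strategy (assumed)
    best_score = evaluate(best)
    step = 1.0                      # how far each new hypothesis may stray
    for _ in range(cycles):
        hypothesis = best + rng.uniform(-step, step)   # generate a hypothesis
        score = evaluate(hypothesis)                   # run and evaluate the experiment
        if score > best_score:                         # update strategy on success
            best, best_score = hypothesis, score
        else:
            step *= 0.98                               # search more narrowly after a miss
    return best, best_score


if __name__ == "__main__":
    best, best_score = improvement_loop()
    print(f"best candidate: {best:.3f}, score: {best_score:.4f}")
```

No human picks the next hypothesis here, but the loop only works because `evaluate` returns an unambiguous score. Whether a comparable signal exists for open-ended problems is the open question.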

Bhatia states the logic directly: superintelligence will not come from static models. Each cycle SIA completes is meant to compound its capability rather than require fresh human input to move forward.

The mechanics and the limits

Hexo draws its lineage from two well-known game-playing systems. AlphaGo, developed at Google DeepMind, mastered the board game Go in part by generating and evaluating millions of games against itself. IBM's Deep Blue defeated chess grandmaster Garry Kasparov through a related search-and-evaluate process, though it relied on brute-force search and a hand-tuned evaluation function rather than self-play. Both are existence proofs that automated search and evaluation can exceed human-level performance in constrained domains, and SIA intends to extend that principle to open-ended problems in science, engineering, and business, where improvement cycles are possible but success criteria are far less crisp.

That gap is what practitioners should scrutinize first. The 350x figure is self-reported on an OpenAI-designed benchmark and has not been independently reviewed, and Hexo has not released evaluation details that would allow outside researchers to replicate the result. Anyone who regularly reviews artificial intelligence benchmarks through trackers like LLM Stats will recognize the pattern: headline numbers without reproducible methodology are a starting point for investigation, not a conclusion.

Hexo says it is partnered with researchers at Stanford University and the University of Oxford, though the scope of those collaborations was not detailed in the launch announcement.

Open source in a consolidating market

Releasing SIA under an open-source license is a deliberate stance in a market rapidly moving toward concentration. This week, CNBC reported that Anthropic closed a $65 billion Series H at a $965 billion valuation, roughly two and a half times its February figure of $380 billion, with Amazon contributing $5 billion to the round. Crypto Briefing noted the round also drew hyperscaler investments in compute infrastructure. Capital at that scale funds proprietary training runs and safety research that small labs cannot match. An open-source release, by contrast, puts SIA's architecture in researchers' hands for the kind of adversarial testing that closed systems avoid by default.

Practitioners will want external results before updating workflows. AlphaGo and Deep Blue operated in worlds with perfect information and fixed rules. Scientific and engineering problems rarely offer either, and self-improvement loops in ambiguous domains tend to drift toward local optima unless reward signals are carefully constrained. Hexo will need to produce results on third-party tasks with independent evaluation for the "world's first" framing to carry weight with skeptical engineers.
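The local-optimum concern can be shown with a deliberately small example. The reward landscape below is invented for illustration and has nothing to do with SIA's actual reward design: it has a minor peak at position 2 and a higher one at position 8, and a greedy loop that only accepts immediate improvements stops at whichever peak is nearest to its starting point.

```python
# Hypothetical reward landscape: small peak at index 2, larger peak at index 8.
LANDSCAPE = [0, 2, 4, 2, 1, 3, 5, 8, 10, 6, 0]


def reward(x: int) -> int:
    """Look up the (made-up) reward for position x."""
    return LANDSCAPE[x]


def greedy_climb(start: int) -> int:
    """Move to the better neighbor until no neighbor improves the reward."""
    x = start
    while True:
        neighbors = [n for n in (x - 1, x + 1) if 0 <= n < len(LANDSCAPE)]
        best = max(neighbors, key=reward)
        if reward(best) <= reward(x):
            return x            # no improving move left: a (possibly local) peak
        x = best


if __name__ == "__main__":
    print(greedy_climb(0))  # stops at the small peak, position 2
    print(greedy_climb(5))  # reaches the larger peak, position 8
```

Starting from position 0, the loop settles for a reward of 4 and never discovers the reward of 10 further along. This is the kind of drift that independent evaluation on third-party tasks would help expose.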

Larger picture

The timing is meaningful. As frontier artificial intelligence development consolidates around a handful of well-capitalized labs, the question of whether a self-improving agent can survive open-source scrutiny is worth watching closely. If SIA's loop design holds up, it becomes a reference implementation that larger programs will need to respond to. If the benchmark claims fail to replicate, it joins a long record of improvement-speed metrics that dissolved under review.

SIA's code is now public. That review starts now.

FAQ

What is SIA and who built it?
SIA, or Self-Improving AI, is an open-source agent from Hexo Labs, a Palo Alto startup founded by Kunal Bhatia and Vignesh Baskaran. It runs continuous loops of hypothesis generation, experimentation, and self-evaluation, designed to improve task performance without human intervention between cycles.

What does the 350x claim actually measure?
Hexo Labs states that SIA achieved a 350-fold speed gain on an OpenAI-designed benchmark. The specific benchmark, its baseline, and evaluation criteria have not been published, making independent replication impossible at this stage.

How is this different from AlphaGo or standard reinforcement learning?
AlphaGo and Deep Blue are domain-specific systems that operate within fixed-rule environments with clear win conditions. Hexo claims SIA generalizes to open-ended tasks in science and engineering, which is the key empirical claim the open-source release will need to support.

Where can researchers access SIA?
Hexo Labs released SIA as open source, making the code publicly available for inspection, testing, and replication by the research community.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
