OpenAI Safety Fellowship funds AI alignment research

TL;DR

Apply for a six-month fellowship with stipends, model access, and support to advance AI safety and alignment research.

OpenAI is paying outside researchers to study AI safety, starting this September. The company's new Safety Fellowship covers a six-month window, running from September 2026 through February 2027, with the explicit aim of broadening the number of practitioners working on technical alignment problems.

Eligibility is wide. Researchers, engineers, and practitioners with no current affiliation to OpenAI can all apply. Per Campus Technology, selected fellows receive stipends, direct access to OpenAI's production models, and technical support, against a deliverable requirement: papers, benchmarks, or datasets. Research focus areas include robustness, privacy, agent oversight, and misuse prevention.

Agent oversight is worth isolating. As AI systems take on longer-horizon autonomous tasks, the gap between intended behavior and actual execution has grown harder to track without dedicated tooling and systematic empirical work. A fellowship that funds precisely that research, with frontier model access, is structured to address a gap that most academic groups cannot close on their own.

The regulatory backdrop

Pressure on AI companies has accelerated. The EU's Artificial Intelligence Act now imposes compliance requirements on high-risk systems across member states, and scrutiny from regulators, civil society, and the research community has intensified. Funding external reviewers is one response to that pressure: a signal that safety work is not confined to internal red teams.

Anthropic, whose founding thesis centered on safety-first artificial intelligence development, has operated a similar program for some time, supporting independent work in alignment, interpretability, and AI security with public output requirements. That company has also been aggressive on developer tooling, most recently shipping Claude Code updates that include scheduled automations running on cloud infrastructure. The labs competing hardest on safety research are simultaneously competing hard on product velocity.

The pace of new model releases adds urgency to this kind of work. llm-stats tracked multiple major frontier model releases in the first two weeks of April 2026 alone, from Anthropic, Meta, Google, and others. Each new generation raises fresh questions about whether existing safety evaluations still measure the right behaviors, and whether benchmarks designed for one model family transfer to the next.

NVIDIA represents a different approach to related concerns, releasing open Nemotron Safety models that companies including CrowdStrike and Fortinet have deployed to improve application-level trustworthiness. That is the productized downstream layer. OpenAI's fellowship targets the upstream research layer, where findings feed into evaluation methods and, eventually, into policy.

What the structure reveals

For a researcher weighing whether to apply, the six-month deliverable requirement suggests the program rewards people who arrive with concrete hypotheses and methods, not exploratory agendas. API access to OpenAI's models at scale is the practical lever here. Most academic groups working on artificial intelligence review and red-teaming cannot budget production-scale model access, and the fellowship removes that constraint directly.

The program has not yet published a conflict-of-interest policy. Whether fellows can produce and publish findings that reflect poorly on OpenAI's systems, free of editorial interference, remains unaddressed publicly. That detail matters more to serious safety researchers than the stipend amount, and it will determine whether the fellowship attracts the most rigorous candidates or settles for work unlikely to create friction.

Applications are open now for the September cohort. A persistent annual cycle, replicated across several labs, could eventually create a meaningful layer of safety review sitting outside any single company's institutional interest. Whether this fellowship marks the beginning of that structure, or remains a well-funded gesture, depends entirely on how the independence question gets resolved in practice.

FAQ

What is the OpenAI Safety Fellowship?
A six-month funded program running September 2026 to February 2027 that pays external researchers stipends and grants access to OpenAI's production models to conduct safety and alignment research. Fellows are expected to deliver papers, benchmarks, or datasets.

Who is eligible to apply?
Any researcher, engineer, or practitioner not currently affiliated with OpenAI. The program does not appear to restrict by career stage, making it open to academic researchers and independent practitioners alike.

What research areas does the fellowship cover?
The specified focus areas are robustness, privacy, agent oversight, and misuse prevention. These map closely to the practical challenges that arise as AI systems are deployed in autonomous, long-horizon settings.

How does this compare to Anthropic's safety research program?
Anthropic runs a structurally similar fellows program supporting alignment, interpretability, and AI security, with public output requirements. OpenAI's program adds direct access to frontier production models as a core resource, which is the element most academic groups currently lack.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.

Connect on LinkedIn