Z-ai Debuts 1M-Context GLM-5.2 at $1.40 per Million Input Tokens

TL;DR

Z-ai's GLM-5.2 offers a 1M-token context window at $1.40 per million input tokens. It arrives in a busy AI week alongside Google's $75M A24 partnership and Anthropic's extraction accusations against Alibaba.

Z-ai's GLM-5.2 model launched on OpenRouter yesterday, offering a 1-million-token context window at $1.40 per million input tokens and $4.40 per million output tokens. The pricing positions the Chinese AI lab competitively as the field races toward longer context windows.

The model arrives amid intense activity in AI development. Google announced a $75 million investment in film studio A24 as part of an AI research partnership, while Anthropic accused Alibaba of orchestrating the largest known extraction attack against its Claude models.

GLM-5.2's 1M-token context window matches industry leaders like Gemini 3.1 Flash Image and stepfun's step-3.5-flash. At $1.40 per million input tokens, it undercuts some competitors but remains above the cheapest options available.

The pricing reflects broader trends tracked by AI Release Tracker, which documents 160 frontier models from major labs since ChatGPT's November 2022 launch. Context windows have grown steadily from initial 4K limits to today's million-token capabilities.

Google's move into film financing represents unusual territory for the tech giant. The $75 million partnership with boutique studio A24 aims to develop AI filmmaking tools through DeepMind's research division.

A24 will work directly with DeepMind researchers on new workflows and techniques. The deal marks Google's first stake in a movie studio, despite its ownership of YouTube.

Meanwhile, Anthropic revealed what it calls the largest extraction attack ever against its technology. The company alleges Alibaba deployed 25,000 fake accounts to generate 28.8 million fraudulent interactions with Claude over 44 days.

This distillation attack sought to replicate Claude's capabilities for use in Alibaba's proprietary systems. Anthropic contacted US Senators, and the Commerce Department responded by imposing global restrictions on the Mythos and Fable models.

These developments highlight three critical shifts in AI development. First, context window expansion has become the primary differentiator as models mature. Second, the industry is grappling with IP theft through sophisticated extraction techniques. Third, tech companies are moving beyond software into creative industries.

For practitioners, the key takeaway is that context length remains expensive despite rapid improvement. At $1.40 per million input tokens, processing a 500-page document costs roughly $0.70, assuming about 1,000 tokens per page (500,000 tokens in total), so large-scale document analysis remains costly for many applications.
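The arithmetic behind that estimate can be sketched in a few lines of Python. The two prices come from the OpenRouter listing quoted above. The figure of 1,000 tokens per page is an assumption implied by the $0.70 estimate rather than a number published by Z-ai, and `estimate_cost` is a hypothetical helper written for illustration, so treat the result as a rough guide.

```python
# Prices quoted in this article for GLM-5.2 on OpenRouter (USD per 1M tokens).
INPUT_PRICE_PER_M = 1.40
OUTPUT_PRICE_PER_M = 4.40

# Assumption (not from Z-ai): roughly 1,000 tokens per page.
# Real documents vary with language, layout, and tokenizer.
TOKENS_PER_PAGE = 1_000


def estimate_cost(pages, output_tokens=0, tokens_per_page=TOKENS_PER_PAGE):
    """Estimate the USD cost of sending `pages` pages as input,
    plus an optional number of generated output tokens."""
    input_tokens = pages * tokens_per_page
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return round(input_cost + output_cost, 4)


# A 500-page document, input only.
print(estimate_cost(500))

# The same document with a 2,000-token generated summary.
print(estimate_cost(500, output_tokens=2_000))
```

Changing `tokens_per_page` shows how sensitive the estimate is to document density: at 700 tokens per page, the same 500 pages would come to about $0.49 of input.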

As the AI field consolidates around a few major players, expect more partnerships like Google-A24 and more conflicts over intellectual property extraction.

What context length can GLM-5.2 handle?
Z-ai's GLM-5.2 offers a 1-million-token context window, matching top-tier models from competitors.

How much do 1M tokens cost to process?
On OpenRouter, GLM-5.2 costs $1.40 per million input tokens and $4.40 per million output tokens.

What happened with Google and A24?
Google invested $75 million in an AI research partnership with boutique film studio A24 to develop filmmaking tools.

What did Anthropic accuse Alibaba of?
Orchestrating the largest known AI model extraction attack, using 25,000 fake accounts to generate 28.8 million fraudulent Claude interactions.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
