TL;DR
Two Chinese labs ship competing frontier models in late May with opposing strategies: Tencent's ultra-cheap API play versus Alibaba's open-source million-token architecture.
Pricing tells the story before the benchmarks do. Tencent's Hunyuan HY3 Preview was listed on the Price Per Token tracker at $0.06 per million input tokens and $0.21 per million output tokens, served through GMICloud's inference API. Less than a week earlier, Alibaba's Qwen3.7 Max landed as an open-source release with a 1 million token context window, priced at $2.50 per million input tokens and $7.50 per million output tokens through Together AI and also available directly from Alibaba. Same geography, sharply different bets on where the market is heading.
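To make the price gap concrete, here is a minimal sketch that applies the quoted per-million-token rates to a single workload. The rates come from the listings above; the workload size (1 billion input tokens, 100 million output tokens) and the names `Pricing`, `HY3_PREVIEW`, and `QWEN37_MAX_HOSTED` are illustrative assumptions, not figures or identifiers from either lab.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hosted-inference rates in US dollars per million tokens."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the total cost in dollars for the given token counts."""
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


# Rates as quoted in the article.
HY3_PREVIEW = Pricing(input_per_million=0.06, output_per_million=0.21)
QWEN37_MAX_HOSTED = Pricing(input_per_million=2.50, output_per_million=7.50)

# ASSUMPTION: a hypothetical monthly workload, chosen only for illustration.
INPUT_TOKENS = 1_000_000_000
OUTPUT_TOKENS = 100_000_000

hy3_cost = HY3_PREVIEW.cost(INPUT_TOKENS, OUTPUT_TOKENS)
qwen_cost = QWEN37_MAX_HOSTED.cost(INPUT_TOKENS, OUTPUT_TOKENS)

print(f"HY3 Preview:       ${hy3_cost:,.2f}")
print(f"Qwen3.7 Max (API): ${qwen_cost:,.2f}")
print(f"Ratio:             {qwen_cost / hy3_cost:.1f}x")
```

On that assumed workload the bill comes to about $81 on HY3 versus $3,250 on hosted Qwen3.7 Max, a gap of roughly 40x. The ratio shifts with the input/output mix, so it is worth rerunning with your own traffic profile.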
Hunyuan HY3 carries a preview label, signaling it is not yet a production-grade release. Its 262K context window is functional but unremarkable relative to the current frontier. What is remarkable is the cost: at those per-token rates, HY3 undercuts most comparable models on hosted inference providers, suggesting Tencent is optimizing for adoption and volume rather than positioning against premium models. Independent benchmark results have not been published alongside the release, so capability claims are unverified at this point.
Qwen3.7 Max targets the opposite end of that tradeoff. A genuine 1 million token context window opens up workflows that matter for enterprise document processing, codebase-level reasoning, and long-horizon agentic tasks. The model continues the Qwen lineage, which the AI Release Tracker, a tracker of cumulative adoption across 160 frontier releases since 2022, lists among the most consistently deployed open model families in practitioner infrastructure.
The open weights question
Open release is doing real structural work here. By publishing the Qwen3.7 Max weights, Alibaba effectively recruits the community for fine-tuning, red-teaming, and domain adaptation at no cost to the lab. The $2.50/$7.50 pricing through Together AI applies only to hosted inference. Teams running their own GPU infrastructure pay only for compute, which can make the total cost of ownership significantly lower for high-volume deployments.
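Whether self-hosting actually comes out cheaper depends on hardware cost and sustained throughput, neither of which the release specifies. The sketch below shows the arithmetic only. The $20/hour node cost and the 2,000 tokens/second throughput are hypothetical placeholders, and the comparison uses the hosted output rate as a single reference price, ignoring utilization gaps, engineering time, and the input/output split.

```python
SECONDS_PER_HOUR = 3600


def self_hosted_cost_per_million(hourly_cost_usd: float,
                                 tokens_per_second: float) -> float:
    """Dollars per million tokens at a given sustained throughput."""
    tokens_per_hour = tokens_per_second * SECONDS_PER_HOUR
    return hourly_cost_usd / tokens_per_hour * 1_000_000


def breakeven_tokens_per_second(hourly_cost_usd: float,
                                hosted_price_per_million: float) -> float:
    """Sustained throughput at which self-hosting matches the hosted price."""
    return (hourly_cost_usd / SECONDS_PER_HOUR
            / hosted_price_per_million * 1_000_000)


# Hosted output rate quoted in the article.
HOSTED_OUTPUT_PRICE = 7.50

# ASSUMPTIONS: placeholder values, not measurements of any real deployment.
NODE_HOURLY_COST = 20.0
SUSTAINED_TOKENS_PER_SECOND = 2000.0

own_cost = self_hosted_cost_per_million(NODE_HOURLY_COST,
                                        SUSTAINED_TOKENS_PER_SECOND)
breakeven = breakeven_tokens_per_second(NODE_HOURLY_COST, HOSTED_OUTPUT_PRICE)

print(f"Self-hosted: ${own_cost:.2f} per million tokens")
print(f"Break-even throughput: {breakeven:.0f} tokens/second")
```

The useful output is the break-even throughput: below it, the hosted API is the cheaper option under these assumptions, which is why the advantage shows up mainly in high-volume deployments.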
That asymmetry has proved decisive in earlier Qwen cycles. The Humanity Redefined newsletter has documented how prior Qwen releases became default backbones for retrieval-augmented and coding pipelines precisely because the weights were accessible and the context window was already competitive when the models first shipped.
For practitioners deciding between the two right now, the choice is less about raw capability and more about deployment architecture. HY3's sub-ten-cent input pricing makes it a candidate for high-throughput pipelines where cost per token drives infrastructure design. Qwen3.7 Max suits teams that need full model control, want to fine-tune on proprietary data, or are building applications where a genuine million-token window changes what is architecturally possible.
Context windows at this scale change the retrieval calculus. Instead of chunking a codebase into a vector store and querying it through embedding-based retrieval, a model with a verified 1M context can reason over the full repository in a single call. Whether Qwen3.7 Max delivers that in practice under real workload conditions is still an open question; the spec is there, but practitioner evals have not yet accumulated.
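Before dropping a retrieval pipeline, it helps to check whether the repository plausibly fits in the window at all. The following is a rough sketch, assuming about four characters per token. That heuristic is an assumption, not the Qwen tokenizer, and real counts can differ noticeably for code. The function names, the suffix filter, and the 8,000-token output reserve are likewise illustrative.

```python
import math
from pathlib import Path

CONTEXT_WINDOW = 1_000_000  # advertised Qwen3.7 Max window, per the article
CHARS_PER_TOKEN = 4.0       # ASSUMPTION: rough heuristic, not a real tokenizer


def estimate_tokens(text: str,
                    chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Estimate the token count of a string from its character length."""
    return math.ceil(len(text) / chars_per_token)


def repo_token_estimate(root: Path,
                        suffixes: tuple[str, ...] = (".py", ".md")) -> int:
    """Sum the estimated tokens of all matching files under root."""
    total = 0
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            total += estimate_tokens(text)
    return total


def fits_in_context(token_count: int,
                    window: int = CONTEXT_WINDOW,
                    reserved_for_output: int = 8_000) -> bool:
    """True if the prompt plus an output reserve fits inside the window."""
    return token_count + reserved_for_output <= window


if __name__ == "__main__":
    tokens = repo_token_estimate(Path("."))
    print(f"Estimated tokens: {tokens:,}")
    print(f"Fits in a 1M window: {fits_in_context(tokens)}")
```

A pass here only says the prompt is within the advertised limit. It says nothing about how well the model reasons over content that deep into the window, which is the part still awaiting practitioner evals.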
What the releases leave open
Neither model shipped with published, reproducible benchmark suites. The Price Per Token tracker notes GPQA Diamond, SWE-Bench Verified, and MMMU scores for models where labs have released them; both HY3 and Qwen3.7 Max entries carry no such figures as of this writing. Teams doing due diligence will need to run internal evaluations before committing either model to production pipelines.
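An internal evaluation does not need to be elaborate to be useful. As a minimal sketch, the harness below scores any callable that maps a prompt to a completion using case-insensitive exact match. `stub_model` and the three test cases are placeholders. In practice the callable would wrap whichever client a team uses to reach HY3 or Qwen3.7 Max, and no real API is shown or assumed here.

```python
from typing import Callable, Sequence

EvalCase = tuple[str, str]  # (prompt, expected answer)


def exact_match_accuracy(model: Callable[[str], str],
                         cases: Sequence[EvalCase]) -> float:
    """Fraction of cases where the model output matches the expected answer.

    Comparison ignores case and surrounding whitespace.
    """
    if not cases:
        raise ValueError("need at least one eval case")
    correct = sum(
        1 for prompt, expected in cases
        if model(prompt).strip().lower() == expected.strip().lower()
    )
    return correct / len(cases)


def stub_model(prompt: str) -> str:
    """HYPOTHETICAL stand-in for a real model client; returns canned answers."""
    canned = {"2 + 2 =": "4", "Capital of France?": "Paris"}
    return canned.get(prompt, "")


# Placeholder cases; a real suite would draw on the team's own workload.
CASES: list[EvalCase] = [
    ("2 + 2 =", "4"),
    ("Capital of France?", "paris"),
    ("Largest planet?", "Jupiter"),
]

score = exact_match_accuracy(stub_model, CASES)
print(f"Exact-match accuracy: {score:.2%}")
```

Running the same case list against both models gives a like-for-like comparison on tasks that matter to the team, which public leaderboards cannot provide while the scores remain unpublished.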
The structural pressure from Chinese labs releasing competitive artificial intelligence systems at aggressive price points is not new, but it is intensifying. Each release narrows the window where any lab can claim an unchallenged gap on cost or context length alone. Whether Tencent's pricing aggression or Alibaba's open-weights strategy proves more durable as a moat is what these two releases put on the table, without yet answering it.
Community evals over the next two to three weeks will matter more than today's spec sheets.
Frequently asked questions
What is Tencent Hunyuan HY3 and how is it priced?
Hunyuan HY3 is a preview-stage language model from Tencent, available via GMICloud at $0.06 per million input tokens and $0.21 per million output tokens, with a 262K token context window. No benchmark scores have been published alongside the release.
Is Qwen3.7 Max open source?
Yes. Alibaba released Qwen3.7 Max as an open-source model, meaning the weights are publicly available. Hosted inference runs through Together AI at $2.50 input and $7.50 output per million tokens, and also through Alibaba's own platform.
What does a 1 million token context window enable that smaller windows do not?
At that scale, models can ingest entire codebases, lengthy legal corpora, or extended multi-document sets in a single inference call, reducing or eliminating the need for chunking and vector-retrieval pipelines in many applications.
Why is HY3 so much cheaper than Qwen3.7 Max?
Pricing reflects positioning. HY3's low per-token cost targets high-throughput API use cases where volume matters more than capability ceiling. Qwen3.7 Max's hosted price covers managed inference infrastructure; users running the open weights on their own hardware avoid per-token API fees and pay only for their own compute.
About the Author
Guilherme A.
Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.