Price Per Token launches MCP to stream live LLM pricing data

TL;DR

Price Per Token's new Model Cost Portal offers live per‑token pricing for dozens of LLMs, helping developers optimise API spend and benchmark model performance.

A single line of code now lets developers query dozens of LLM providers in real time, and the cheapest answer wins. That is the promise of the Model Cost Portal (MCP) launched today by Price Per Token, a startup that has been tracking AI model releases and pricing for the past year.

MCP aggregates per‑token inbound and outbound rates from providers such as OpenAI, Anthropic, Google, and emerging open‑source offerings. The dashboard updates every few seconds, reflecting promotional changes, region‑specific pricing, and usage‑tier discounts. Users can paste a snippet of code into their stack, and the portal returns a ranked list of models with total cost estimates for a given prompt length and context window.
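
The cost estimate described above reduces to a weighted sum of prompt and completion tokens. The following is a minimal sketch in Python, not MCP's actual code: it assumes rates are quoted in dollars per million tokens (the article does not state the unit), and the model names and rates are invented for illustration.

```python
from dataclasses import dataclass

TOKENS_PER_UNIT = 1_000_000  # assumption: rates are quoted per million tokens


@dataclass(frozen=True)
class Rate:
    model: str       # hypothetical model identifier
    inbound: float   # dollars per TOKENS_PER_UNIT prompt tokens
    outbound: float  # dollars per TOKENS_PER_UNIT completion tokens


def estimate_cost(rate: Rate, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated dollar cost of one request."""
    return (
        prompt_tokens * rate.inbound + completion_tokens * rate.outbound
    ) / TOKENS_PER_UNIT


def rank_by_cost(rates, prompt_tokens, completion_tokens):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = [
        (r.model, estimate_cost(r, prompt_tokens, completion_tokens)) for r in rates
    ]
    return sorted(costs, key=lambda pair: pair[1])


# Illustrative rates only, not real quotes.
example_rates = [
    Rate("model-a", inbound=3.00, outbound=15.00),
    Rate("model-b", inbound=0.50, outbound=1.50),
    Rate("model-c", inbound=1.00, outbound=4.00),
]

ranking = rank_by_cost(example_rates, prompt_tokens=8_000, completion_tokens=1_000)
for model, cost in ranking:
    print(f"{model}: ${cost:.4f}")
```

Because the ranking depends on the ratio of prompt to completion tokens, the same set of rates can produce a different order for a short-prompt, long-answer workload.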

The service builds on Price Per Token’s existing "New Models Today" feed, which already lists more than 100 model releases from the past 24 hours, from Gemini 3.1 Flash Lite Image to Claude Sonnet 5 on AWS. By exposing the raw $/token numbers alongside latency and context limits, MCP turns pricing from a static document into a live signal that can be fed into routing logic or cost‑optimisation scripts.

Developers who have struggled with opaque pricing sheets will find immediate value. A typical workflow now involves pulling the MCP JSON endpoint, filtering for models that meet a latency budget, and selecting the lowest‑cost option for each request. Early adopters report up to a 30% reduction in monthly API spend simply by swapping a high‑cost model for a comparable open‑source alternative when the price gap widens.
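
That workflow might look roughly like the sketch below. The payload shape and field names (`model`, `inbound`, `outbound`, `latency_ms`) are assumptions, since the article does not document the MCP response schema, and the sample is parsed from an inline string rather than fetched, because the endpoint URL is not given. Rates are again assumed to be dollars per million tokens.

```python
import json

# Hypothetical payload; the real MCP response schema is not documented here.
SAMPLE_PAYLOAD = """
[
  {"model": "model-a", "inbound": 3.00, "outbound": 15.00, "latency_ms": 400},
  {"model": "model-b", "inbound": 0.50, "outbound": 1.50, "latency_ms": 1200},
  {"model": "model-c", "inbound": 1.00, "outbound": 4.00, "latency_ms": 600}
]
"""


def pick_model(payload: str, latency_budget_ms: float,
               prompt_tokens: int, completion_tokens: int):
    """Return the cheapest record within the latency budget, or None."""
    records = json.loads(payload)
    eligible = [r for r in records if r["latency_ms"] <= latency_budget_ms]
    if not eligible:
        return None

    def cost(record):
        # Assumes rates are dollars per million tokens.
        return (prompt_tokens * record["inbound"]
                + completion_tokens * record["outbound"]) / 1_000_000

    return min(eligible, key=cost)


choice = pick_model(SAMPLE_PAYLOAD, latency_budget_ms=800,
                    prompt_tokens=2_000, completion_tokens=500)
print(choice["model"] if choice else "no model meets the latency budget")
```

Returning `None` when nothing fits the budget leaves the fallback decision (relax the budget, queue the request, or fail) to the caller.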

The portal also surfaces rate asymmetries that often trip up budgeting. For example, Anthropic’s Claude Sonnet 5 lists rates of $2.00 inbound and $10.00 outbound on Amazon Bedrock, while Google’s Nano Banana 2 Lite charges $0.25 inbound and $1.50 outbound. Juxtaposing these figures makes it clear when a provider’s inbound cost is negligible but outbound usage dominates the bill.
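
To see how strongly outbound usage can dominate, the sketch below computes the outbound share of a request's cost from the two rate pairs quoted above. The equal split of 1,000 prompt and 1,000 completion tokens is an assumption chosen for illustration; the pricing unit cancels out of the ratio, so it does not need to be known.

```python
def outbound_share(inbound_rate, outbound_rate, prompt_tokens, completion_tokens):
    """Fraction of a request's cost attributable to outbound tokens."""
    inbound_cost = prompt_tokens * inbound_rate
    outbound_cost = completion_tokens * outbound_rate
    total = inbound_cost + outbound_cost
    return outbound_cost / total if total else 0.0


# Rates quoted in the article; the per-token unit cancels out in the ratio.
quoted = {
    "Claude Sonnet 5 (Amazon Bedrock)": (2.00, 10.00),
    "Nano Banana 2 Lite": (0.25, 1.50),
}

shares = {
    name: outbound_share(inbound, outbound,
                         prompt_tokens=1_000, completion_tokens=1_000)
    for name, (inbound, outbound) in quoted.items()
}
for name, share in shares.items():
    print(f"{name}: {share:.1%} of cost is outbound")
```

More generally, outbound cost exceeds inbound cost whenever completion tokens exceed prompt tokens multiplied by the inbound-to-outbound rate ratio, which is 0.2 for the first pair and roughly 0.17 for the second.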

Beyond cost, MCP includes basic performance metadata: context length, token throughput, and whether the model supports image or multimodal inputs. This allows engineers to balance price against capability without consulting separate documentation pages. The portal’s API returns a uniform schema, simplifying integration with existing monitoring tools.
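
A uniform schema means each record can be mapped onto one typed structure and filtered on capability before price is considered. The sketch below is illustrative only: every field name is hypothetical, as the article does not publish the actual schema.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ModelRecord:
    # Hypothetical field names; the article does not publish the actual schema.
    model: str
    inbound: float            # quoted inbound rate
    outbound: float           # quoted outbound rate
    context_length: int       # maximum context window, in tokens
    tokens_per_second: float  # reported throughput
    supports_images: bool

    @classmethod
    def from_dict(cls, raw: dict) -> "ModelRecord":
        """Build a typed record; raises KeyError if a field is missing."""
        return cls(
            model=str(raw["model"]),
            inbound=float(raw["inbound"]),
            outbound=float(raw["outbound"]),
            context_length=int(raw["context_length"]),
            tokens_per_second=float(raw["tokens_per_second"]),
            supports_images=bool(raw["supports_images"]),
        )


def meets_requirements(records: Iterable[ModelRecord], min_context: int,
                       needs_images: bool) -> List[ModelRecord]:
    """Keep records with enough context and, if required, image support."""
    return [
        r for r in records
        if r.context_length >= min_context
        and (r.supports_images or not needs_images)
    ]


# Invented sample records for illustration.
raw_records = [
    {"model": "model-a", "inbound": 3.0, "outbound": 15.0,
     "context_length": 200_000, "tokens_per_second": 80.0, "supports_images": True},
    {"model": "model-b", "inbound": 0.5, "outbound": 1.5,
     "context_length": 32_000, "tokens_per_second": 150.0, "supports_images": False},
]
records = [ModelRecord.from_dict(raw) for raw in raw_records]
suitable = meets_requirements(records, min_context=100_000, needs_images=True)
print([r.model for r in suitable])
```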

The launch arrives at a time when LLM providers are rapidly expanding their catalogs. In the past week alone, the price‑tracking feed recorded releases ranging from the open‑source Qwen 3.5 series to proprietary models like Gemini 3.1 Flash Lite Image. With pricing volatility expected to increase, especially as providers experiment with tiered pricing and volume discounts, real‑time data becomes a competitive advantage.

Analysts note that MCP could shift the economics of model selection. Historically, developers defaulted to a single vendor to avoid integration overhead. Live pricing data lowers the friction of multi‑vendor strategies, encouraging a marketplace where cost efficiency competes directly with performance. This may accelerate the adoption of smaller, specialised models that were previously overlooked due to higher perceived integration costs.

However, the portal’s reliance on publicly advertised rates means it cannot account for enterprise contracts or hidden usage caps. Companies with negotiated discounts will see a mismatch between MCP estimates and actual invoices. Price Per Token advises users to treat the portal as a guide rather than a billing oracle.
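
One way to narrow that mismatch is to adjust the public rates locally before comparing models. The sketch below assumes a negotiated discount can be expressed as a flat fraction per provider, which is a simplification; real contracts may use tiers or commitments. The provider names and figures are invented.

```python
def apply_discounts(public_rates, discounts):
    """Return rates with per-provider fractional discounts applied.

    public_rates maps provider -> (inbound, outbound) public rates.
    discounts maps provider -> fraction off (0.25 means 25% off);
    providers without an entry keep their public rates.
    """
    adjusted = {}
    for provider, (inbound, outbound) in public_rates.items():
        factor = 1.0 - discounts.get(provider, 0.0)
        adjusted[provider] = (inbound * factor, outbound * factor)
    return adjusted


# Illustrative numbers only.
public_rates = {"provider-a": (2.00, 10.00), "provider-b": (1.00, 4.00)}
negotiated = {"provider-a": 0.25}

effective = apply_discounts(public_rates, negotiated)
print(effective)
```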

Looking ahead, the team plans to add predictive pricing alerts and a sandbox for simulating batch workloads. If the service gains traction, it could become a de facto standard for cost‑aware LLM orchestration, much as spot‑instance pricing dashboards reshaped cloud compute procurement.

What this means for practitioners is simple: the cheapest model is no longer a guess hidden in a PDF. With MCP, cost can be baked into the request path, enabling dynamic routing that reacts to market shifts in seconds rather than weeks.
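
Putting cost in the request path usually means caching a pricing snapshot and refreshing it on a short interval, so that routing does not wait on a pricing lookup for every call. The sketch below is one possible design, not Price Per Token's implementation: `PriceCache`, the five-second TTL, and the per-request cost figures are all hypothetical, and the fetcher and clock are injected so the example runs without network access.

```python
import time
from typing import Callable, Dict, Optional


class PriceCache:
    """Cache a pricing snapshot and refresh it once it is older than ttl_seconds."""

    def __init__(self, fetch: Callable[[], Dict[str, float]], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch    # hypothetical callable returning model -> est. cost
        self._ttl = ttl_seconds
        self._clock = clock
        self._snapshot: Dict[str, float] = {}
        self._fetched_at: Optional[float] = None

    def prices(self) -> Dict[str, float]:
        """Return the cached snapshot, refetching if it has expired."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._snapshot = self._fetch()
            self._fetched_at = now
        return self._snapshot

    def cheapest(self) -> str:
        """Return the model with the lowest cost in the current snapshot."""
        prices = self.prices()
        return min(prices, key=prices.get)


# Simulated price change between two fetches (invented figures).
snapshots = iter([
    {"model-a": 0.012, "model-b": 0.015},
    {"model-a": 0.012, "model-b": 0.009},
])
fake_now = [0.0]
cache = PriceCache(fetch=lambda: next(snapshots), ttl_seconds=5.0,
                   clock=lambda: fake_now[0])

first = cache.cheapest()   # initial fetch
fake_now[0] = 2.0
second = cache.cheapest()  # within the TTL, served from cache
fake_now[0] = 6.0
third = cache.cheapest()   # TTL expired, picks up the price change
print(first, second, third)
```

The TTL sets the trade-off: a shorter interval reacts faster to price changes but issues more pricing requests.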

FAQ

What is the Model Cost Portal?
MCP is a real‑time dashboard and API that aggregates per‑token pricing from major LLM providers, allowing developers to compare costs instantly.

How does MCP update its prices?
The platform polls provider pricing endpoints and public announcements every few seconds, reflecting promotions, tiered discounts, and regional variations.

Can MCP handle enterprise discount contracts?
MCP shows publicly listed rates only; custom enterprise pricing must be applied manually by the user.

Is there a free tier for MCP?
Price Per Token offers a limited free tier with access to the public dashboard; higher request volumes require a paid subscription.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
