New 4 companies · First observed November 2023 · Updated June 2026

Token prices fall as access tiers rise

Quick answer

AI API token prices fall with almost every model generation — frontier rates dropped roughly 10× from the GPT-4 era to the GPT-5 era — while the price of human-facing access climbs toward a $200/mo ceiling. Compute deflates; access inflates.

~10× cheaper per token, GPT-4 → GPT-5 era

What's happening — and why

What's happening: the cost of calling an AI model through its API keeps dropping. Each new model generation tends to launch cheaper than the one it replaces, and vendors add further discounts — prompt caching, batch processing — in between releases.

Why: raw inference is becoming a commodity. Open-weight models set aggressive price floors, serving efficiency improves, and competition is intense, so vendors pass the savings to developers to win volume. They keep their pricing power for human-facing subscriptions instead, where willingness-to-pay is far higher than for a raw token.

How it works

$/1M tokens time → GPT-4 4o GPT-5 $200 access tier
Per-token API price falls each model generation (blue); human-facing access climbs (amber).

Evidence over time

6 supporting · 2 counter — hover or tap a point for detail, click to jump to the row.

supports ↑ challenges ↓ 2023 2024 2025 2026
supporting evidence counterexample

Evidence

Company Date What happened
OpenAI Nov 2023 GPT-4 Turbo launched ~10× cheaper than the original GPT-4.
OpenAI May 2024 GPT-4o launched ~50% cheaper than GPT-4 Turbo; 4o-mini (2024-07) pushed frontier-ish quality to commodity price.
Google Nov 2024 Gemini 1.5 token pricing cut ~50–65%.
DeepSeek May 2024 V2 at $0.14/1M input — a market-shock price that reset expectations.
DeepSeek Dec 2024 V3 at $0.27/1M — frontier-class performance at commodity rates.
Anthropic Aug 2024 Prompt caching cut input cost by up to 80% without a model change.

Counterexamples

  • You.com · Sep 2025 — Top consumer tier repriced ~6.7× — $30 Team replaced by $200 Max.
  • Character.ai · Feb 2026 — Introduced mid-chat ads for free users — monetisation rising, not falling.

For buyers

Re-baseline token-cost assumptions at least twice a year — a model you priced six months ago is usually cheaper now, and a newer model may be cheaper and better. Don't extrapolate the deflation to seats: model the workload as raw metered tokens (falling) separately from per-seat access (rising), and pick the cheaper envelope deliberately.

For vendors

To ride deflation you need per-model rate cards that can change without breaking contracts, plus structural discounts (caching, batch) to cut effective price without touching headline rates. To monetise access, a premium seat tier with priority or uncapped usage captures the value that falling tokens leave on the table.

Outlook — what to watch

Expect another leg down: open-weight models (DeepSeek, Llama-class) keep resetting the floor, and caching/batch discounts are spreading. The deflation only stalls if frontier capability stops commoditising — watch whether any vendor can hold a premium token price on a genuinely differentiated model. Access tiers still have room to climb.

Bottom line

Compute is deflating relentlessly while access inflates. Every frontier vendor in the corpus has cut per-token prices at least once; the value is migrating from the token to the seat.

FAQ

Are AI API prices going up or down?

Down. Every frontier vendor in the corpus has cut per-token API prices at least once — usually at each model release — and layered caching and batch discounts on top. Consumer subscriptions are the exception; those are rising.

Why are ChatGPT and Claude subscriptions getting more expensive if tokens are cheaper?

The token (raw compute) and the seat (human access) sit on different curves. Vendors pass compute deflation to API developers but capture value from power users through premium ~$200/mo access tiers.

How often should I re-check my model costs?

At least twice a year. A model you priced six months ago is usually cheaper now, and a newer one is often cheaper and better — Claude 3.5 Sonnet beat Opus at a fraction of the price.

All trends