Why and how is it even logical that o1 is 10x+ more expensive than o3, when o3 is supposed to be better?

Hi everyone,
I’m genuinely confused by OpenAI’s API pricing here and I’m hoping someone (ideally OpenAI) can explain the rationale.

From the pricing tables:

  • o1: $15 / 1M input, $60 / 1M output

  • o3: $2 / 1M input, $8 / 1M output

So o1 is exactly 7.5x the price of o3 on both input and output — and in practice it can feel like 10x+ depending on usage patterns.
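Just to show the math I'm doing (prices taken straight from the tables above; the function name is my own):

```python
# Per-1M-token API prices (USD) from OpenAI's pricing tables
PRICES = {
    "o1": {"input": 15.00, "output": 60.00},
    "o3": {"input": 2.00, "output": 8.00},
}

def price_ratio(model_a: str, model_b: str, kind: str) -> float:
    """How many times model_b's price model_a charges for one token kind."""
    return PRICES[model_a][kind] / PRICES[model_b][kind]

print(price_ratio("o1", "o3", "input"))   # 7.5
print(price_ratio("o1", "o3", "output"))  # 7.5
```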

But at the same time, o3 is positioned as the newer / stronger reasoning model, so:

Why would the older model cost dramatically more than the newer model that’s supposed to perform better?
How is that pricing logical?

You’re assuming something that has never really been true for OpenAI’s API pricing:

“Newer + stronger model ⇒ must be more expensive.”

Historically that’s just wrong.

Take gpt-4-32k as a concrete example. Its official pricing was:

  • $60 / 1M input tokens
  • $120 / 1M output tokens

That’s dramatically more expensive than both o1 ($15 / $60) and o3 ($2 / $8). And yet gpt-4-32k is strictly weaker than the current frontier models.

So if you apply your logic consistently, the old, weaker gpt-4-32k costing 2–4x as much as o1 and 15–30x as much as o3 would make even less sense. But that’s exactly how pricing has looked in the past: newer models tend to become both better and cheaper over time.
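You can see how stark this is on a concrete workload. A quick sketch using the prices quoted in this thread (the 10M-in / 2M-out token counts are just a made-up example workload):

```python
# (input, output) prices per 1M tokens in USD, from the figures quoted above
PRICES = {
    "gpt-4-32k": (60.00, 120.00),
    "o1": (15.00, 60.00),
    "o3": (2.00, 8.00),
}

def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total USD cost of a workload at the per-1M-token list prices."""
    in_price, out_price = PRICES[model]
    return in_price * input_tokens / 1e6 + out_price * output_tokens / 1e6

# Hypothetical workload: 10M input tokens, 2M output tokens
for model in PRICES:
    print(f"{model}: ${workload_cost(model, 10_000_000, 2_000_000):.2f}")
# gpt-4-32k: $840.00, o1: $270.00, o3: $36.00
```

The older, weaker model is the most expensive of the three by a wide margin — which is exactly the point.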

The reason is simple:

  • Pricing is about product tier and compute cost, not “publishing order.”
  • o1 is a specialized, slower, more compute-heavy deliberate reasoning product.
  • o3 is optimized to be the default workhorse: fast, cheap, good-enough reasoning for most use cases.

If you don’t need the deliberate behavior of o1, just use o3 and enjoy the fact that in 2025 you’re paying a fraction of what people paid for gpt-4-32k, for far better performance. The “why isn’t the newer model more expensive?” complaint ignores the entire history of API pricing.

If you really think “newer + stronger ⇒ must be more expensive,” then you’re going to have a hard time explaining Anthropic’s pricing too.

Compare just these two:

  • Claude 3 Opus (older flagship “big brain” model):

    • $15 / 1M input tokens
    • $75 / 1M output tokens
  • Claude Haiku 4.5 (newer “small” model with near-frontier code + reasoning):

    • $1 / 1M input tokens
    • $5 / 1M output tokens

So Anthropic’s newer, extremely capable Haiku 4.5 is ~15x cheaper than their older Claude 3 Opus, and it’s good enough to replace Opus in a ton of real workloads.

Exactly the same pattern you see with OpenAI:

  • high-end “deliberate / frontier” models are priced as a premium tier,
  • smaller, highly optimized newer models are positioned as fast/cheap defaults.

In other words, across vendors the actual pattern is:
“model role + compute profile ⇒ price,”
not
“release date or version number ⇒ price.”

If you want to argue the pricing is bad, fine — but the “why is a newer model cheaper than an older one?” line just ignores how everyone in the industry prices their model families.
