o3 model compute cost seems high

I've read in multiple places that the new o3 model is significantly more expensive during inference, using about 55k tokens per reasoning step and costing about $20 per prompt. I also read that it cost nearly $10k for o3-mini to get 76% on the ARC-AGI benchmark, and that o3 cost 172 times more. I'm not sure how many questions are in the ARC-AGI test, but that seems very high.
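For reference, here's the back-of-the-envelope math I'm working from, taking the numbers above at face value. The task count is just my guess, not an official figure:

```python
# Rough cost math using the numbers reported above.
# NUM_TASKS is my own assumption about the ARC-AGI eval size, not an official number.

TOKENS_PER_STEP = 55_000       # reported tokens per reasoning step
COST_PER_PROMPT = 20           # reported ~$20 per prompt
LOW_RUN_COST = 10_000          # reported ~$10k for the 76% run (assuming dollars)
HIGH_COMPUTE_MULTIPLIER = 172  # reported compute ratio between the two configs
NUM_TASKS = 100                # my guess for the number of eval tasks

high_run_cost = LOW_RUN_COST * HIGH_COMPUTE_MULTIPLIER

print(f"Cheaper run, per task:        ${LOW_RUN_COST / NUM_TASKS:,.0f}")
print(f"High-compute run, total:      ${high_run_cost:,.0f}")
print(f"High-compute run, per task:   ${high_run_cost / NUM_TASKS:,.0f}")
```

If those inputs are right, the high-compute run lands well into seven figures total, which is why the numbers jumped out at me.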

Is this actually true? In the recent o3 demo it didn't take very long to reason, so it's hard to imagine it churning through enough tokens to cost that much.

Why would OpenAI provide false information about their models? o3 is obviously using many times the computing resources of any previous public model, which means it burns through tokens that much faster. The impressive thing is that OpenAI is able to fully utilize that much computing power and produce close to AGI-level responses. Of course the price will decrease as optimizations are found and hardware improves.

I can't find any official account from OpenAI saying these models use that much compute, so I'm wondering where they said this, or whether it was published by the ARC team?

Not saying it isn't amazing… my guess is that it runs many reasoning branches in parallel and somehow picks the best path, which would explain why it's so quick while still consuming far more tokens.
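If it's something like best-of-N sampling, I'd picture it roughly like this. This is only a toy sketch of the general idea, not OpenAI's actual method; `sample_reasoning_branch` and `score_branch` are placeholders I made up:

```python
import concurrent.futures

# Toy sketch of parallel branch sampling with best-path selection.
# Nothing here reflects OpenAI's real implementation; both functions below
# are hypothetical stand-ins.

def sample_reasoning_branch(prompt: str, seed: int) -> str:
    """Stand-in for one independent chain-of-thought sample from the model."""
    return f"branch {seed}: reasoning about {prompt!r}"

def score_branch(branch: str) -> float:
    """Stand-in for a verifier/reward model that rates a finished branch."""
    return float(len(branch))  # dummy heuristic for the sketch

def best_of_n(prompt: str, n: int = 16) -> str:
    # Launch N branches in parallel: wall-clock time stays close to a single
    # branch, but total token usage scales with N. That would square a fast
    # demo with a huge token bill.
    with concurrent.futures.ThreadPoolExecutor(max_workers=n) as pool:
        branches = list(pool.map(lambda s: sample_reasoning_branch(prompt, s), range(n)))
    return max(branches, key=score_branch)

print(best_of_n("solve this ARC puzzle", n=8))
```

Curious whether anyone has seen details confirming it works anything like this.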