O3 is 80% cheaper and introducing o3-pro

One apparent benefit of o3-pro: at least you aren’t the one paying for hundreds of tokens of unseen decision and moderation, as in other reasoning models.

O3-Pro

36 output tokens billed against only 26 tokens actually received.

It still apparently bills 85 tokens for the base vision tile instead of the expected 75.
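A simplified sketch of the published tile-counting formula (scale to fit 2048x2048, shortest side to 768, count 512-px tiles), treating cost as tiles × per-tile tokens as above; the real formula may also add a fixed base charge. The per-tile figure is the disputed number: 75 per the pricing guide, 85 per the observed billing.

```python
import math

def tile_count(width: int, height: int) -> int:
    # Published tiling rule: fit within 2048x2048, then scale the
    # shortest side to 768, then count 512-px tiles.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    return math.ceil(w / 512) * math.ceil(h / 512)

def image_tokens(width: int, height: int, tokens_per_tile: int = 85) -> int:
    # tokens_per_tile is the contested value: 75 (guide) vs 85 (billing).
    return tile_count(width, height) * tokens_per_tile

print(image_tokens(512, 512, 75))  # 75: what the pricing guide implies
print(image_tokens(512, 512, 85))  # 85: what the billing suggests
```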

O1-Pro

193 more output tokens billed than received:
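A hypothetical helper (the names and the $80-per-million output rate are my assumptions, not confirmed figures) to quantify the gap between billed and delivered output tokens:

```python
# Hypothetical helper: the gap between billed output tokens and tokens
# actually delivered, and its cost at an ASSUMED $80/M output rate.
def billing_overhead(billed: int, received: int) -> int:
    return billed - received

def overhead_cost_usd(billed: int, received: int,
                      usd_per_million: float = 80.0) -> float:
    return billing_overhead(billed, received) * usd_per_million / 1_000_000

print(billing_overhead(36, 26))             # o3-pro: 10 unseen tokens
print(round(overhead_cost_usd(36, 26), 6))  # 0.0008 at the assumed rate
```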

Note o1-pro’s peculiar vision input billing, also seen in o1. A 512x512 image should be 1 tile (75 or 85 tokens), says the pricing guide. Here, however, a detail:low image is always a flat 22 tokens including container overhead, and detail:high, as shown, is 41 tokens with its text. 512x513 jumps to 63 tokens, an increase of 22 input tokens. Perhaps a price break because of the stratospheric cost otherwise? At the very least, o1’s vision pricing formula is undisclosed and unpublished.
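The observed numbers line up in a telling way: the jump at the tile boundary equals the detail:low overhead, not the 75-85 tokens a published tile should cost.

```python
# Observed o1-pro image input billing from the measurements above.
low_detail = 22       # any detail:low image, min and max
high_512x512 = 41     # detail:high, one tile by the published formula
high_512x513 = 63     # detail:high, two tiles by the published formula

tile_boundary_jump = high_512x513 - high_512x512
print(tile_boundary_jump)                # 22
print(tile_boundary_jump == low_detail)  # True: the increment matches the
# detail:low overhead, nowhere near a published tile's 75-85 tokens.
```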


Adding images adds around 1-3 seconds of latency across all other models. So with these 15-second response times, either there’s a queue, there are unseen moderations or decisions before your billed task…or OpenAI figured out how to publish a model with a 3-token-per-second generation rate.
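A back-of-envelope check using the 36 billed output tokens and the ~15-second response from above, assuming (implausibly) that all of the elapsed time was spent generating:

```python
# If all elapsed time were generation, throughput would be absurdly low;
# queueing or hidden pre-processing would explain the gap instead.
def tokens_per_second(output_tokens: int, elapsed_s: float) -> float:
    return output_tokens / elapsed_s

print(tokens_per_second(36, 15))  # 2.4 - roughly the ~3 tok/s figure
```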