Hello, does anyone have any up-to-date insight or clarification on the pricing for image input/Vision?
I’ve read older posts about how 4o-mini cost more, and that the token count per image was increased, etc. I’ve also seen, and played around with, the calculator that used to be on the Pricing page, but it is now gone.
I’ve been doing my own testing with some cURL scripts, the Chat Playground, and some other app-based API calls. The API server returns the same “prompt_tokens” values every time, but they do not match what’s in the Documentation, or what I’d previously understood.
For example, I’ve not once been able to get “prompt_tokens” to return as anything close to 85, for a single image. I’ve also done the math for tiled images, and uploaded larger images (with high detail instead of low) and they don’t add up to 85 + 170 + 170, etc.
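For reference, this is the calculation I’ve been checking my numbers against. It’s a minimal Python sketch of the formula as I understand it from the Documentation: a flat 85 tokens for low detail, and for high detail the image is scaled to fit within 2048 x 2048, then scaled down so the shortest side is at most 768 px, then charged 170 tokens per 512 px tile plus the 85 base. The function name and the exact resize steps are my own reading, not an official implementation, so treat the output as an estimate.

```python
import math


def documented_image_tokens(width, height, detail="high",
                            base_tokens=85, tile_tokens=170):
    """Estimate image tokens from the formula as I read it in the docs.

    Hypothetical helper: base_tokens and tile_tokens default to the 85 / 170
    figures quoted in the documentation, and may not match current billing.
    """
    if detail == "low":
        # Low detail is documented as a flat cost regardless of size.
        return base_tokens

    # Step 1: scale down (never up) to fit within a 2048 x 2048 square.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale

    # Step 2: scale down (never up) so the shortest side is at most 768 px.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale

    # Step 3: count the 512 px tiles needed to cover the resized image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return base_tokens + tile_tokens * tiles


if __name__ == "__main__":
    for size in [(512, 512), (1024, 1024), (2048, 4096)]:
        print(size, documented_image_tokens(*size))
```

Comparing these estimates against the “prompt_tokens” value the API returns (minus the text part of the prompt) is how I’m judging whether the numbers add up.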
Basically, it seems like the Documentation is wrong/outdated, but I don’t know what to think. I need some confirmation on things, in terms of pricing, as using Vision is a big expense for my application at the moment.
Also, until recently, 4o-mini seemed to adjust the token count to something like 2800 (or thereabouts), which looked like an adjustment to make the cost match whatever 4o charged. However, during my recent testing, I found no difference in “input_tokens” when comparing the same requests against 4o and 4o-mini. They always came back with about the same number of tokens.
Has the pricing changed? Are images processed by 4o-mini actually cheaper now? What’s the cost per image (512x512 or otherwise)? What about the high resolution, or tile costs? What are those set at now?
Any feedback from anyone would be nice. My instinct is to trust the documentation and assume that my live/current “prompt_tokens” numbers are somehow wrong. It’s not like I can see accurate real-time updates on the Pricing/Usage page, so I don’t know what to think.
TL;DR: Is the documentation about Vision pricing outdated? I’ve been getting MUCH different results from what I’m reading.