I couldn’t find any information about what kind of image to provide to ChatGPT4o.
What’s the best DPI? What’s the best image size?
I assume there is a good balance between the size of the image and the latency of the response.
And how does the size of the image convert to tokens?
The OpenAI pricing page has a calculator for image input tokens:
- 85 tokens at detail:low
- 255 - 1445 tokens at detail:high depending on resolution.
The detail:low option uses a single image of 512x512 or smaller.
The detail:high option first samples that 512x512 version, then also gives the AI 512x512 tiles of a larger version of the image, up to 768x2048.
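To show where the 85 and 255 - 1445 figures come from, here is a rough sketch of the calculation. It assumes the rules OpenAI has published for this model family: 85 base tokens, plus 170 tokens per 512x512 tile after the image is scaled to fit within 2048x2048 and then shrunk so its shortest side is at most 768. The function name and the rounding are my own, so treat the result as an estimate and check it against the pricing calculator.

```python
import math


def image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate input tokens for one image (hypothetical helper, not an OpenAI API)."""
    if detail == "low":
        # detail:low is a flat cost regardless of the image size.
        return 85

    # Assumed step 1: scale down (never up) to fit within 2048x2048.
    scale = min(1.0, 2048 / max(width, height))
    w, h = round(width * scale), round(height * scale)

    # Assumed step 2: scale down (never up) so the shortest side is at most 768.
    scale = min(1.0, 768 / min(w, h))
    w, h = round(w * scale), round(h * scale)

    # Assumed step 3: count the 512x512 tiles needed to cover the result.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles


print(image_tokens(512, 512))      # one tile
print(image_tokens(1024, 1024))    # resized to 768x768, four tiles
print(image_tokens(768, 2048))     # eight tiles, the maximum
```

Under these assumptions a 512x512 image costs 255 tokens and a 768x2048 image costs 1445, which matches the range quoted above.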
The quality you get depends on how well the input image matches the training data. A single word in the middle of a 1024x1024 image may perform poorly.
If you can still see what needs to be described when the image's largest dimension is 512px, go with that size for the lowest cost and the fastest, most straightforward vision processing.
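If you want to pre-shrink images yourself to that 512px limit, a minimal sketch of the size calculation could look like this. The helper name and the `max_side` parameter are made up for illustration; the function only computes the target dimensions and leaves the actual resizing to whatever image library you use.

```python
def fit_largest_side(width: int, height: int, max_side: int = 512) -> tuple[int, int]:
    """Return (width, height) scaled down so the largest side is at most max_side.

    Hypothetical helper: keeps the aspect ratio and never upscales.
    """
    scale = min(1.0, max_side / max(width, height))
    # Guard against a dimension rounding down to zero on extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_largest_side(4000, 3000))  # large photo, shrunk to fit
print(fit_largest_side(400, 300))    # already small, left unchanged
```

You can pass the resulting size to a resize call such as Pillow's `Image.resize`, then open the result and check that the detail you care about is still legible before sending it with detail:low.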