Gpt-4-vision-preview 'low' detail resizing

tmonk · November 13, 2023, 3:31pm

The gpt-4-vision documentation states the following:

low will disable the “high res” model. The model will receive a low-res 512 x 512 version of the image, and represent the image with a budget of 65 tokens. This allows the API to return faster responses and consume fewer input tokens for use cases that do not require high detail.

Does that mean the image is resized proportionally to fit inside a 512 x 512 square, or is it stretched to exactly that shape?

That is, when proportion matters (perhaps it does generally), should I resize my images to fit inside a 512 x 512 square ahead of time? Or does the resizing backend take this into account?

Foxalabs · November 13, 2023, 3:39pm

Hi and welcome to the Developer Forum!

If you wish to resize and crop images yourself, then that’s great. If not, the system will automatically manipulate the image to deal with that.

tmonk · November 13, 2023, 3:46pm

Thanks for the reply. I understand that it will do it automatically. My question is more about what is done automatically, i.e. how the resizing/cropping is performed. Is proportion taken into account?

Foxalabs · November 13, 2023, 3:51pm

I don’t know the exact internals, but I imagine it would work like any picture viewer: if you have a very wide image being viewed on a 512x512 screen, it would have lots of blank space at the top and bottom. A simple shrink to fit would seem to be appropriate.
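To make the picture-viewer analogy concrete, here is a minimal sketch of the arithmetic behind it. This is only an illustration of letterboxing, not a confirmed description of what the API backend does; the function name `letterbox` and the choice never to enlarge small images are assumptions made for the example.

```python
def letterbox(width, height, box=512):
    """Fit (width, height) inside a box x box square, keeping proportions.

    Returns (new_w, new_h, pad_x, pad_y), where pad_x / pad_y are the blank
    margins on the left and top when the result is centred in the square.
    Hypothetical helper: the real backend behaviour is not documented.
    """
    # One scale factor for both axes preserves the aspect ratio.
    # Capping at 1.0 means small images are never enlarged (an assumption).
    scale = min(box / width, box / height, 1.0)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    pad_x = (box - new_w) // 2
    pad_y = (box - new_h) // 2
    return new_w, new_h, pad_x, pad_y


# A very wide 2048 x 512 image ends up 512 x 128,
# with 192 blank rows above and below it.
print(letterbox(2048, 512))  # (512, 128, 0, 192)
```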

tmonk · November 13, 2023, 3:53pm

I agree. But the documentation states explicitly that it is resized to 512 x 512, so I was looking for some confirmation here regarding the backend.

Foxalabs · November 13, 2023, 3:56pm

Right, so the only algorithm that could be applied to any generic image in order to comply with that would be a proportional shrink to fit. Could it be a stretched image? Possibly, but the model does seem to be aware of proportionality and aspect ratio, so that rules it out in my mind, leaving only shrink to fit.

My guess? It’s an OpenCV image shrink to fit call.
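For anyone who wants to pre-resize and not rely on a guess, the difference between the two candidates can be sketched in a few lines. The helper names below (`shrink_to_fit`, `stretch`, `aspect_ratio`) are made up for illustration, and nothing here is confirmed to match the backend.

```python
def shrink_to_fit(width, height, box=512):
    """Proportional resize: the longest side becomes at most `box` pixels."""
    scale = min(box / width, box / height, 1.0)  # never enlarge (assumption)
    return max(1, round(width * scale)), max(1, round(height * scale))


def stretch(width, height, box=512):
    """Non-proportional resize: force the image to exactly box x box."""
    return box, box


def aspect_ratio(size):
    """Width divided by height for a (width, height) pair."""
    return size[0] / size[1]


original = (1920, 1080)
fitted = shrink_to_fit(*original)
stretched = stretch(*original)

print(fitted)     # (512, 288)
print(stretched)  # (512, 512)
# Shrink to fit keeps the 16:9 proportion; stretching flattens it to 1:1.
print(round(aspect_ratio(original), 3), round(aspect_ratio(fitted), 3), aspect_ratio(stretched))
```

If you resize ahead of time, the computed size can be passed to OpenCV's `cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_AREA)`, or you can call Pillow's `Image.thumbnail((512, 512))`, which shrinks in place while keeping the aspect ratio. Either way the image already fits the square, so any stretching step on the server side would be limited to whatever padding or scaling it applies to an image that is at most 512 pixels on each side.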
