As per this doc:

"By controlling the `detail` parameter, which has three options, `low`, `high`, or `auto`, you have control over how the model processes the image and generates its textual understanding. By default, the model will use the `auto` setting which will look at the image input size and decide if it should use the `low` or `high` setting."
What exact logic does `auto` use to choose between the `low` and `high` settings? The doc only says the choice is based on the image input size. Can someone give a more precise definition, e.g. if the input size is less than 512*512 it chooses `low`, otherwise it chooses `high`?

So far, in my testing without the `detail` property (so it defaults to `auto`), it always uses `high` as the value. I tried an image of size 360*220 with `auto` and it chose `high`. It mostly chooses `high`, not `low`.
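
For anyone who wants to reproduce this kind of test, here is a minimal sketch of how the setting that `auto` picked could be inferred from token usage. It assumes the token accounting I understand the vision guide to describe (a flat 85 tokens for `low`; for `high`, 85 tokens plus 170 per 512px tile after fitting the image within 2048x2048 and scaling the shortest side down to 768px). It also assumes small images are not upscaled, which I have not confirmed, and that the figures apply to the model you are calling. `estimate_image_tokens` and `infer_detail` are hypothetical helper names, not part of any SDK.

```python
import math

BASE_TOKENS = 85    # assumed flat cost of the low-res overview
TILE_TOKENS = 170   # assumed cost per 512px tile in high detail


def estimate_image_tokens(width: int, height: int, detail: str) -> int:
    """Estimate the image token cost for an explicit detail setting."""
    if detail == "low":
        return BASE_TOKENS
    if detail != "high":
        raise ValueError("detail must be 'low' or 'high'")

    # Step 1: fit the image inside a 2048x2048 square (downscale only).
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale

    # Step 2: scale so the shortest side is at most 768px.
    # Assumption: images smaller than that are left as they are.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale

    # Step 3: count the 512px tiles needed to cover the image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return BASE_TOKENS + TILE_TOKENS * tiles


def infer_detail(observed_image_tokens: int, width: int, height: int) -> str:
    """Guess which setting was used from the observed image token count."""
    if observed_image_tokens == estimate_image_tokens(width, height, "low"):
        return "low"
    if observed_image_tokens == estimate_image_tokens(width, height, "high"):
        return "high"
    return "unknown"


print(estimate_image_tokens(360, 220, "low"))   # 85
print(estimate_image_tokens(360, 220, "high"))  # 255
```

The observed image token count would have to come from the prompt token usage reported in the response, minus the tokens used by the text part of the prompt. Under these assumptions, a 360*220 image costs 85 tokens at `low` and 255 at `high`, so the two cases should be easy to tell apart.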