Strange Responses from Vision API

Getting random responses on blank image with just lines and box with space to write something but it empty no hand written text on it. I m trying to use this API to extract hand written text form the images.

Hi and welcome to the Community!

Can I just clarify: your input in this particular example was a blank image? If yes, then what you experience is a normal hallucination akin to what would happen if you provided the model with an empty text string as input.

Is my understanding correct that you have situations when there is handwritten text on the image AND situations where the image remains blank? If so, then you should clarify this in your prompt and additionally ask the model to simply return a default response such as “no handwritten text” detected in situations where the image is blank.

Feel free to provide your existing prompt for reference.

I cant share the original image, but the image has a box with single space for writing English answer to some questions. So its not totally white / blank image, it is an image with space to write answer in hand writing, but nothing written on it.

So its a unexpected random response from the API, on one occasion i received this.

This may be a case where you don’t have enough system prompt to guide the AI in its task, where by not setting an identity and a job for the AI to do as an entity, it is more likely to produce the likely language response to the input than to actually examine the image - if the image was sent.

A grainy or shaded image also allows some embeddings to be activated to power a hallucination.

Without pasting into some scripts, adjusting temperature, or using the detail option, sending to ChatGPT and its system message, no hallucination on clear input:

Untitled

1 Like