I'm using the vision model as an OCR, sending ID images to extract user information as part of a verification process. The problem is that 80% of the time GPT-4 responds with "I'm sorry, but I cannot provide the requested information about this image as it contains sensitive personal data".
The prompt that I'm using is: "Act as an OCR and describe the elements and information that can be observed in this image, focusing on details that would be relevant to verification issues. If possible, do it in key:value format without omitting any data or responding with Redacted, please."
How can I prevent "I'm sorry, but I cannot provide the requested information about this image as it contains sensitive personal data" when trying to use the vision model as an OCR?
Don’t try to use the model to OCR sensitive personal data.
May I ask why? Isn't the whole purpose of vision models to provide image-to-text translation and a reasoning layer? Since API requests aren't used for training, what's wrong with using vision models for OCR'ing personal data?
I would try writing a more specific prompt that tells the AI where to look in the image. It worked for me: I had people in the image, but I directed the AI to look at specific things other than the people.
So let's say your image has a person holding a sign. You can tell the AI to only look at the sign and tell you what it says.
I got plenty of OCR working this way with GPT. It takes a lot of prompt development, though.
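The region-targeting approach above can be sketched as a Chat Completions request payload. This is just a sketch under assumptions: the model name and image URL are placeholders, and the helper and its prompt wording are my own; only the message/content structure follows the OpenAI vision API format.

```python
import json


def build_targeted_ocr_request(image_url: str, region_hint: str) -> dict:
    """Build a Chat Completions payload that points the model at one
    region of the image instead of asking for a blanket transcription."""
    # Hypothetical prompt wording -- the key idea is naming the region,
    # not asking the model to read the whole image.
    prompt = (
        f"Look only at {region_hint} in this image and transcribe the text "
        "you see there, in key:value format where applicable."
    )
    return {
        "model": "gpt-4o",  # assumption: any vision-capable model
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": 300,
    }


# Example: direct the model at the sign, not the person holding it.
payload = build_targeted_ocr_request(
    "https://example.com/photo.jpg",  # placeholder URL
    "the sign the person is holding",
)
print(json.dumps(payload, indent=2))
```

You would send this payload to the chat completions endpoint as usual; the point is that narrowing the prompt to a non-sensitive region tends to reduce refusals, at the cost of iterating on the wording per image type.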