My GPT is not reading images well enough

olivierB · July 25, 2024, 9:37am

Hi there,

Disclaimer before I start: I’m not an engineer!

I’m trying to create a GPT that could read an image and count the number of adhesive tape pieces there’s on the back of a bath shelf.

I have input him a data base of 20 images with the number of tape pieces for the GPT to learn.

It works ok when the tape pieces are clearly apart.

But not so well when they are closer (exemple below)

In this example the GPT counted 4 tape pieces probably thinking the upper and lower tape pieces count for one although there are two pieces.

Any idea on how I could improve the GPT performance? Some guidances on an efficient prompt maybe?

thanks!

Bolded · July 25, 2024, 1:08pm

Did you tried to teach it with that image above that has 2 one next to another?

olivierB · July 25, 2024, 1:58pm

For your information this is a section of the database I fed it with (second column teaches the GPT the number of pieces of tape it should identify) :

So to your question, yes, in row 4 this is a section of the whole picture, and it was still not able to identify it as two pieces of tape.

chieffy99 · August 13, 2024, 6:30am

Even though the model is capable of distinguishing objects well, it doesn’t actually see them physically as humans do. Even humans, who physically see, might perceive something as four pieces if they haven’t read what you wrote or don’t have sharp vision. Generally, the model operates based on its fundamental capabilities unless given specific instructions.

The term ‘observation’ for humans involves a single method of seeing combined with cognitive processing of what is seen. However, the model uses different methods of ‘seeing,’ such as edge detection, and then processes what it perceives.

You might add instructions to guide responses in various situations, compensating for the lack of specific training data input.

Current models like ChatGPT are intelligent enough to interpret words as actions that compensate for necessary information, but they are not smart enough to independently determine what to do to correct or solve an issue.

Topic		Replies	Views
GPT4-vision model counting problem API	1	659	March 20, 2024
Improve image processing with number "1" and "7" API gpt-4-vision	6	199	November 19, 2024
Challenges with GPTs Image Classification: Seeking Solutions Prompting chatgpt , gpts	2	1104	December 16, 2023
Image recognition: looking for advice Prompting gpt-4 , chatgpt , image-reading , chat-with-images	0	823	March 1, 2024
How to patch up GPT-4V for image interpretation/reasoning applications GPT builders gpt-4	3	907	February 2, 2024

My GPT is not reading images well enough

Related topics