gpt-4-vision-preview vs ChatGPT web: results and configuration

I’m trying to use gpt-4-vision-preview for make-up coaching, given an image of the user as input. It all starts with an analysis of some (safe and identification-free) facial features.

I’m getting excellent results in the ChatGPT Pro web interface, but for the exact same prompt and the same images, gpt-4-vision-preview performs dramatically worse in its analysis.

We are getting nearly 90% accuracy on the web version and roughly 50% with the API, using the same image dataset and prompt.

I couldn’t find much information on what configuration and model the web version uses when a prompt is submitted with an image attached. Is it the same gpt-4-vision-preview that’s available in the platform API?
Are the parameters it uses known / documented (temperature, etc.), or is there any reliable estimate from the community?
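
For reference, here is roughly how I’m calling the API (Python, openai v1 SDK). This is only a simplified sketch; the `detail`, `max_tokens`, and `temperature` values are just things I’ve been experimenting with, not settings I know the web version uses:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Analyse the facial features relevant for make-up coaching."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/face.jpg",  # placeholder image URL
                        "detail": "high",  # "auto" is the default; "low" sends a downscaled image
                    },
                },
            ],
        }
    ],
    max_tokens=500,   # set explicitly; I've read vision-preview cuts answers very short otherwise
    temperature=0.2,  # arbitrary value I've been testing with
)

print(response.choices[0].message.content)
```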

Thanks in advance for any hints.


Hi! I’m having the exact same issue. For the same image and the same prompt, the accuracy of the descriptions from the API (vision-preview) and from the ChatGPT web version is dramatically different. Please help!


Same problem: random answers from gpt-4-vision-preview for insight recognition, while ChatGPT handles it perfectly!


Any luck on this? I’m having similar issues. In ChatGPT, when I ask for a description, it’s spot on, including location details etc. When I use the API with vision-preview, it says something like: “this looks like a square somewhere in Europe”. I have experimented with temperature and top_p, but nothing comes close.

When I ask a more specific question like “Where is this?”, the API does give me a proper answer, but a question like that is too specific for my use case.
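
In case it helps to compare notes, this is the kind of sweep I’ve been running (sketch only; the prompts and sampling values are just examples of what I tried, not recommendations):

```python
from openai import OpenAI

client = OpenAI()
image_url = "https://example.com/square.jpg"  # placeholder

prompts = [
    "Describe this image in detail, including where it might be.",  # generic: vague answers
    "Where is this?",                                               # specific: works, but too narrow
]

for prompt in prompts:
    for temperature in (0.0, 0.5, 1.0):
        response = client.chat.completions.create(
            model="gpt-4-vision-preview",
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }],
            max_tokens=300,
            temperature=temperature,
            top_p=1.0,
        )
        print(prompt, temperature, "->", response.choices[0].message.content[:120])
```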

Hi! Did you figure out a solution for this?