Assistant struggling reading image, whilst chat completion is right 99% of the time

aaron.murphy · May 30, 2024, 10:29am

Hi all!

A bit of back story…

I’m currently coding up a solution that will allow me to upload an image of a paper document with the data I am expecting to see on it. The idea is that GPT will be able to tell me whether the data matches or not. I’m having success so far with gpt-4o and chat completions. I’m defining the system topic and the function it uses in my code and as I said it’s working quite well. I then came across Assistants which I’d never seen before. I did some reading up on it and created one in the assistants playground. I quickly realised I could define my system topic and functions here, meaning I wouldn’t need to do it in the code. Unfortunately, upon testing the assistant I created, it is totally unable to read the most basic information from the documents I upload.

So for comparison, if I start a new chat in the playground (not assistant), upload a document and send a text prompt saying “What is the name on the document?”. I get the right answer every time. Whereas if I start a new chat with the assistant, upload the document and then ask the same question. It gives me different names every time that aren’t remotely close. I also tried creating a new assistant with no real context to see if that made a difference - it didn’t.

So, I guess my question is, why is the default chat using gpt-4o able to read the name right nearly every single time. Yet a default assistant using gpt-4o can’t even get close? It seems to read the rest of the information okay, but it just totally fumbles the name every time. I’d even argue it’s clearer than most other text on the image.

Thanks!

sps · May 30, 2024, 11:30am

Hi @aaron.murphy,

Welcome to the dev forum.

I wasn’t able to reproduce this.

Image functionality is working on the assistants with gpt-4o in my recent test.

aaron.murphy · May 30, 2024, 12:04pm

Hi, thanks for responding.

It seems to read the rest of the information okay, but it just totally fumbles the name every time. I’d even argue it’s clearer than most other text on the image. Unfortunately, I can’t share the image. But when I give it to a chat playground it will read it with ease every single time. Yet the assistant doesn’t even come close. It almost looks like it’s guessing as the names are that far off.

Topic		Replies	Views
Using GPT-4o via assistants API vs ChatGPT Issue API	1	436	July 1, 2024
Information Vision Assistant API API gpt-4 , chat-with-images	2	327	May 22, 2024
Assistant with vision is giving back fake data from uploaded form images API assistants-api	0	26	March 3, 2025
Is GPT4-o dumber in Assistans API than in normal chat? API gpt-4o	3	791	September 7, 2024
Can an assistant help me with OCR? API gpt-4	7	2954	June 6, 2024

Assistant struggling reading image, whilst chat completion is right 99% of the time

Related topics