Why doesn't the model describe all images when using the API?

When using the API, the model almost always describes fewer images than I provide.
Some examples:

  • Out of 29, it described 22.
  • With multiple tries using the same 15-image input, it described only 10, 11, or 13 (I never got all 15).

My Prompt:

list all images with their descriptions

And in the API call, I’m providing a bunch of images.
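For reference, the call looks roughly like this (a minimal sketch, assuming the OpenAI Python SDK and local JPEGs sent as base64 data URLs; the model name and file paths are placeholders):

```python
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def to_data_url(path: str) -> str:
    """Base64-encode a local image as a data URL the API accepts."""
    with open(path, "rb") as f:
        return "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()

image_paths = [f"images/img_{i}.jpg" for i in range(29)]  # placeholder paths

# One text part with the prompt, followed by one image part per image.
content = [{"type": "text", "text": "list all images with their descriptions"}]
content += [
    {"type": "image_url", "image_url": {"url": to_data_url(p)}}
    for p in image_paths
]

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # placeholder; whichever vision model you call
    messages=[{"role": "user", "content": content}],
    max_tokens=4096,
)
print(response.choices[0].message.content)
```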

However, using the UI, it seems to always describe all provided images.
How can I get the same results using the API?

For the record, I always get "finish_reason":"stop".

  1. Look at how many tokens the model generated. You may be asking it to produce more than around 750, at which point it tends to give up prematurely (a sketch for checking this follows the list);
  2. this new model's ability to do more than chat about children's math problems or follow a large context is as yet unproven.
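For point 1, the response object already carries what you need (a sketch against the same hypothetical `response` object as in the call above):

```python
# Inspect how much the model actually generated and why it stopped.
usage = response.usage
print("completion_tokens:", usage.completion_tokens)
print("finish_reason:", response.choices[0].finish_reason)

# "length" would mean max_tokens cut the output off; "stop" means the model
# chose to end on its own, even if images were left undescribed.
if usage.completion_tokens > 750:
    print("Already past the ~750-token mark where output quality tends to drop.")
```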

Always the same images? Any patterns?

For the 29/22 images case, here are the usage stats:

"usage":{"prompt_tokens":32058,"completion_tokens":1604,"total_tokens":33662}

Does this mean it can't reliably provide answers longer than roughly 750 tokens?
Is there a way to overcome that 750-token limit?

I thought of another approach: gradually reduce the number of images processed per call until the model returns all the descriptions, and note how many images it can handle at that point. If that number is acceptable, process the images in batches of that size (a batching sketch follows the list below). If not, I suggest:

  1. Checking the token limit.

  2. Switching to a model that can handle more context.
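A batching sketch along those lines (the `describe_images` helper is hypothetical and would wrap the API call shown earlier; `batch_size` is whatever your reduction experiment settles on):

```python
def describe_in_batches(image_paths: list[str], batch_size: int = 8) -> list[str]:
    """Send images in small batches so each call stays well under the
    point where the model starts summarizing or skipping images."""
    descriptions = []
    for start in range(0, len(image_paths), batch_size):
        batch = image_paths[start:start + batch_size]
        descriptions.append(describe_images(batch))  # hypothetical helper
    return descriptions
```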

AI models have been trained, supervised, and re-educated to wrap up their output beyond a certain point. You can see a task quickly curtailed in quality once the generation reaches a certain length, and the AI (at least previous models) even has the foresight to write shorter and shorter descriptions the more images you tell it to process.

You can make grandiose statements in the system message that the AI is a new model capable of producing a million words, that the user is a premium customer who has paid for the service, and whatever else, but it will barely make a dent in this behavior.

You can see an investigation of mine pushing the AI to the limit (it simply starts screwing up above 10-15 images) with a model that costs twice as much.

Thanks for all your replies, but I’m baffled here.

Using the same 29-image case and this prompt:

Enumerate all images with short titles

It enumerated only 19 images.

Here are the usage stats:

"finish_reason":"stop"}],"usage":{"prompt_tokens":32059,"completion_tokens":160,"total_tokens":32219}

That's only 160 completion tokens for 19 titles, nowhere near any supposed ~750-token wall, yet it still skipped 10 images.

Interesting find: if I ask in the prompt to include duplicates, it consistently enumerates/describes more images, though still not all of them.
Even though those images are not 100% duplicates, that prompt produces more results.

Is there anything about duplicate images or image similarity?

Good question. It seems that most of the time, it skips duplicates or very similar images. However, there are cases where it skips images without any apparent pattern; I'm still trying to figure that out.
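One way to test the duplicate hypothesis client-side is to flag near-duplicates before sending (a sketch, assuming Pillow and the third-party imagehash library; the distance threshold of 5 is an arbitrary guess to tune):

```python
from itertools import combinations

import imagehash
from PIL import Image

def near_duplicates(image_paths: list[str], max_distance: int = 5):
    """Return pairs of images whose perceptual hashes are close,
    i.e. likely near-duplicates the model might be collapsing into one."""
    hashes = {p: imagehash.phash(Image.open(p)) for p in image_paths}
    return [
        (a, b)
        for a, b in combinations(image_paths, 2)
        if hashes[a] - hashes[b] <= max_distance  # Hamming distance
    ]
```

If the pairs this reports line up with the images the model skips, that would support the similarity explanation.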