I thought of another approach: gradually reduce the number of images processed per call until the model returns descriptions for all of them, then note how many images it can handle at that point. If this number is acceptable, process the images in batches of that size. If not, I suggest:
Checking the token limit.
Switching to a model that can handle more context.
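The batch-size reduction described above can be sketched roughly like this. `describe_images` is a hypothetical callable wrapping your vision-model request (it takes a list of images and returns the descriptions the model actually produced); halving is just one simple search strategy:

```python
def find_workable_batch_size(images, describe_images):
    """Halve the batch size until the model describes every image in a batch."""
    size = len(images)
    while size > 1:
        descriptions = describe_images(images[:size])
        if len(descriptions) == size:  # model covered the whole batch
            return size
        size //= 2
    return 1

def describe_in_batches(images, describe_images):
    """Find a batch size the model can handle, then process all images in chunks."""
    size = find_workable_batch_size(images, describe_images)
    results = []
    for i in range(0, len(images), size):
        results.extend(describe_images(images[i:i + size]))
    return results
```

In practice you would cache the discovered batch size rather than re-probing on every run, since each probe costs real API calls.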
The AI models have been trained, supervised, and re-educated to wrap up their output beyond a certain point. You can see a task's quality drop sharply once the generation reaches a certain length, and the models (at least earlier ones) even have the foresight to write shorter and shorter descriptions the more images you ask them to process.
You can make grandiose claims in the system message: that the AI is a new model capable of producing a million words, that the user is a premium customer who has paid for the service, and so on. It will barely make a dent in the behavior.
You can see an investigation where I pushed the AI to its limit: it simply starts screwing up above 10-15 images, even with a model that costs twice as much.
Interesting find: if I ask in the prompt to include duplicates, it consistently enumerates/describes more images. Still not all of them, though.
Even though those images are not 100% duplicates, that prompt produces more results.
Is there anything about duplicate images or image similarity?
Good question. It seems that most of the time it skips duplicates or very similar images. However, there are cases where it skips images with no apparent pattern; I'm still trying to figure it out.
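One way to test the duplicate hypothesis is to measure how similar the images in a batch actually are before sending them, and check whether the skipped ones cluster together. This is a rough sketch under two assumptions: images are already decoded into small grayscale grids (decoding a JPEG/PNG would need a library such as Pillow), and the tiny difference-signature below is a stand-in for a real perceptual hash like dHash, which works on 64 bits with a correspondingly larger distance threshold:

```python
import hashlib

def exact_hash(data: bytes) -> str:
    """Content hash: catches byte-for-byte duplicate files."""
    return hashlib.sha256(data).hexdigest()

def dhash_bits(pixels):
    """Difference signature over a grayscale grid (rows of ints 0-255):
    each bit records whether a pixel is brighter than its right neighbour."""
    bits = []
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return bits

def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def near_duplicates(grids, threshold=2):
    """Return index pairs whose signatures differ in at most `threshold` bits."""
    hashes = [dhash_bits(g) for g in grids]
    return [
        (i, j)
        for i in range(len(hashes))
        for j in range(i + 1, len(hashes))
        if hamming(hashes[i], hashes[j]) <= threshold
    ]
```

If the indices the model skips match the pairs this flags, that would support the similarity theory; if skips happen on clearly dissimilar images too, something else is going on.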