Image recognition: looking for advice

I’m trying to process pictures using visual recognition and create titles, descriptions, and keywords according to my particular needs, generating a CSV as the result. I have to process multiple pictures, so I created a custom GPT.

Here is the problem.

I managed to fine-tune my prompt and supplied the GPT with additional info, and it works great … as long as I send one picture and ask it to process it. The results are already very satisfying.

But when I try to process a batch of pictures, everything goes south.

First of all, after the first picture, each subsequent result gets progressively worse, especially regarding keywords: the GPT creates fewer and fewer keywords for each next picture.

Then the process often ends with an error, and the GPT asks me to “regenerate” the response.

I tried different approaches, like:

  • Uploading pictures one by one (which is painful). It works for a couple of pictures, then it starts slowing down, the results get worse, and it dies with an error.
  • Uploading a batch of 10 (the maximum allowed) or fewer pictures, with specific instructions to process the first one, then, after corrections, go on to the rest; when they are done, I upload the next batch, etc. Oops, no “etc.” - on rare occasions I was able to process the first batch, but I never finished the second. The GPT starts skipping pictures, the quality of the responses drops, and eventually it crashes.
  • Uploading all the pictures (around 20-30) in multiple batches, then starting to process everything uploaded, either all at once or one by one - the result is the same, an eventual crash.
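To make the goal concrete, here is a minimal sketch of the per-picture workflow I am after: every picture is handled independently and contributes exactly one CSV row. The column names and the `describe` callback are assumptions for illustration only (the actual recognition step is a placeholder, not a real API), so treat this as a description of the desired output rather than a working solution.

```python
import csv
import io
from typing import Callable, Dict, Iterable

# Hypothetical column layout; the real CSV columns are not specified in this post.
FIELDS = ["filename", "title", "description", "keywords"]


def build_csv(paths: Iterable[str], describe: Callable[[str], Dict[str, object]]) -> str:
    """Process each picture independently and return the rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for path in paths:
        # One call per picture, so nothing carries over from earlier pictures.
        result = describe(path)
        writer.writerow({
            "filename": path,
            "title": result["title"],
            "description": result["description"],
            "keywords": "; ".join(result["keywords"]),
        })
    return buffer.getvalue()


def fake_describe(path: str) -> Dict[str, object]:
    # Stand-in for the recognition step; it only returns canned values.
    return {
        "title": f"Title for {path}",
        "description": "Placeholder description",
        "keywords": ["alpha", "beta"],
    }


if __name__ == "__main__":
    print(build_csv(["a.jpg", "b.jpg"], fake_describe))
```

The point of the sketch is that the tenth picture should get the same treatment as the first, with the same number of keywords, which is exactly what I cannot get from the GPT in one long conversation.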

Any advice on how to set up the GPT to process multiple pictures following the instructions, without crashes and without the quality of its work dropping?