Image analysis takes too long for lot of promts

Hi, i have been working with CHAT-GPT API a while. I’m working in a project that analize images based on a request of that images.
I have a few images that i send and a prompt about this images.

Something like this (All the images correspond to the same person):
IMAGE OF APERSON
IMAGE OF A PERSON
IMAGE OF A PERSON
IMAGE OF A PERSON
Bases on this images tell me if in the person has a helmet on it.

The problem is that based on the same images i have a few request so for the moment i’m doing something like this:

IMAGE
IMAGE
IMAGE
Promp1
CHAT RESPONSE

IMAGE
IMAGE
IMAGE
Promt2
CHAT RESPONSE

IMAGE
IMAGE
IMAGE
Prompt3
CHAT RESPONSE

The problem with this is that it takes a lot of time because it evalas each image once for each prompt. I have a lot of images and a lot of prompts for each imagesand it takes too long.

I though in making something like this:
IMAGE
IMAGE
IMAGE
Base prompt: Based on this images answer this questions
Prompt 1
CHAT RESPONSE

Promt 2
CHAT RESPONSE

Promp 3
CHAT RESPONSE

This will save a lot of time, I need to manage the responses individually so i expect a chat response after each prompt is given.

Is there any way that the AI remembers the previews images without having to send them again?

If you have any other solution to my problem i will be appreciate.

Thanks, let me known

1 Like

Welcome to the forum!

No, I do not believe so.

What you might do is translate the image to text then save that for the series? Still tokens, but it might be less tokens… and speed it up a bit?

What do you mean by “traslate the image to text”.?
Like making a description of the image?

Yeah, because that’s what you’re doing every time you’re sending them… it “looks” and then describes… so… I dunno. Might be worth a try?

It wont work, the program that i’m working is a maintenance module.
So there are a lot of details in the images that have to be analyzed, for example,
If the images are of a pool These are the details,
Check if the water is clean,
Check if there is water in the pool,
Check if there are people in the pool,
And any other questions based on that images.

The “equipment” in the images could be anything so is very difficult to describe the image completly. I will lose a lot of deatils

1 Like

Hrm. Good point. I see what you mean now.

I dunno. Might have to limit it to a single image at a time or something?

The price will be coming down eventually too… and context will increase too… so maybe get the MVP working then scale it up as the tech advances?

The idea is that the images are from diferent angles of the thing that you want to ask so limit to only ine image could lead to error in the analisys.
Maybe i can wait but the most important is to reduce the time execution, maybe i can search for a way to make like request in different threads or something

1 Like

If your only considerstion is time of execution and the time taken to upload the images is where the bottleneck is, then something like below will take the images just once and ask the questions of the images in a sequential way.

The key idea is storing the responses in seperate thread.

Create Thread1
Create Thread2

Thread1:

Image1
Image2
Image3

Prompt1: is there water in pool?
Response1: Yes

THEN take the last two messages and store them in Thread 2. Delete the last two messages from Thread1.

Prompt1: is the water in pool clean?
Response1: Yes

THEN take the last two messages and store them in Thread 2. Delete the last two messages from Thread1.

Rinse and Repeat

1 Like

Im not sure how that can help,
How the ia can response a questions withou havien to send the images list again?
If i send a list of images and a prompt CHATGPT will give me the response, but if in the next prompt i ask something aboout the images the chat wont remember
the images

The idea we’re trying to get across is that you would build up a “knowledgebase” of sorts about the images by running them multiple times in advance and storing all that “knowledge” about them in text form.

Not optimal, of course, but I’m kinda curious now as to how well it would work!

Thanks, this is usefull. BUt this isn’t the problem that im having.

Let me explain better and change the question.
The main problem is the time that takes to eval the images after each prompt is asked.
No the upload time.

At the beginning i though that if GPT remeber the image will take less time to eval the given prompt but the IA doest work that way.

This was my solution:

Make a complex prompt with all the question and the spliting the gpt answer to get the responses.

Example:

Image
image
IMage
IMage

Given this pool images answer this questions:

  1. The water is cleam?
    2.The pool is full of water?
  2. Is anyone swiming?

GPT Answer.

  1. No
  2. YEs
  3. No

So this solution was the best to my case becaouse the IA evals the image once, (im sending only one prompt with all the questions)

1 Like