Image analysis takes too long for lot of promts

sebasviollaz · June 19, 2024, 3:19pm

Hi, i have been working with CHAT-GPT API a while. I’m working in a project that analize images based on a request of that images.
I have a few images that i send and a prompt about this images.

Something like this (All the images correspond to the same person):
IMAGE OF APERSON
IMAGE OF A PERSON
IMAGE OF A PERSON
IMAGE OF A PERSON
Bases on this images tell me if in the person has a helmet on it.

The problem is that based on the same images i have a few request so for the moment i’m doing something like this:

IMAGE
IMAGE
IMAGE
Promp1
CHAT RESPONSE

IMAGE
IMAGE
IMAGE
Promt2
CHAT RESPONSE

IMAGE
IMAGE
IMAGE
Prompt3
CHAT RESPONSE

The problem with this is that it takes a lot of time because it evalas each image once for each prompt. I have a lot of images and a lot of prompts for each imagesand it takes too long.

I though in making something like this:
IMAGE
IMAGE
IMAGE
Base prompt: Based on this images answer this questions
Prompt 1
CHAT RESPONSE

Promt 2
CHAT RESPONSE

Promp 3
CHAT RESPONSE

This will save a lot of time, I need to manage the responses individually so i expect a chat response after each prompt is given.

Is there any way that the AI remembers the previews images without having to send them again?

If you have any other solution to my problem i will be appreciate.

Thanks, let me known

PaulBellow · June 19, 2024, 3:29pm

Welcome to the forum!

No, I do not believe so.

What you might do is translate the image to text then save that for the series? Still tokens, but it might be less tokens… and speed it up a bit?

sebasviollaz · June 19, 2024, 3:39pm

What do you mean by “traslate the image to text”.?
Like making a description of the image?

PaulBellow · June 19, 2024, 3:41pm

Yeah, because that’s what you’re doing every time you’re sending them… it “looks” and then describes… so… I dunno. Might be worth a try?

sebasviollaz · June 19, 2024, 3:48pm

It wont work, the program that i’m working is a maintenance module.
So there are a lot of details in the images that have to be analyzed, for example,
If the images are of a pool These are the details,
Check if the water is clean,
Check if there is water in the pool,
Check if there are people in the pool,
And any other questions based on that images.

The “equipment” in the images could be anything so is very difficult to describe the image completly. I will lose a lot of deatils

PaulBellow · June 19, 2024, 3:52pm

Hrm. Good point. I see what you mean now.

I dunno. Might have to limit it to a single image at a time or something?

The price will be coming down eventually too… and context will increase too… so maybe get the MVP working then scale it up as the tech advances?

sebasviollaz · June 19, 2024, 4:00pm

The idea is that the images are from diferent angles of the thing that you want to ask so limit to only ine image could lead to error in the analisys.
Maybe i can wait but the most important is to reduce the time execution, maybe i can search for a way to make like request in different threads or something

icdev2dev · June 19, 2024, 4:15pm

If your only considerstion is time of execution and the time taken to upload the images is where the bottleneck is, then something like below will take the images just once and ask the questions of the images in a sequential way.

The key idea is storing the responses in seperate thread.

Create Thread1
Create Thread2

Thread1:

Image1
Image2
Image3

Prompt1: is there water in pool?
Response1: Yes

THEN take the last two messages and store them in Thread 2. Delete the last two messages from Thread1.

Prompt1: is the water in pool clean?
Response1: Yes

THEN take the last two messages and store them in Thread 2. Delete the last two messages from Thread1.

Rinse and Repeat

sebasviollaz · June 19, 2024, 6:28pm

Im not sure how that can help,
How the ia can response a questions withou havien to send the images list again?
If i send a list of images and a prompt CHATGPT will give me the response, but if in the next prompt i ask something aboout the images the chat wont remember
the images

PaulBellow · June 19, 2024, 6:47pm

The idea we’re trying to get across is that you would build up a “knowledgebase” of sorts about the images by running them multiple times in advance and storing all that “knowledge” about them in text form.

Not optimal, of course, but I’m kinda curious now as to how well it would work!

sebasviollaz · June 26, 2024, 2:02pm

Thanks, this is usefull. BUt this isn’t the problem that im having.

Let me explain better and change the question.
The main problem is the time that takes to eval the images after each prompt is asked.
No the upload time.

At the beginning i though that if GPT remeber the image will take less time to eval the given prompt but the IA doest work that way.

This was my solution:

Make a complex prompt with all the question and the spliting the gpt answer to get the responses.

Example:

Image
image
IMage
IMage

Given this pool images answer this questions:

The water is cleam?
2.The pool is full of water?
Is anyone swiming?

GPT Answer.

No
YEs
No

So this solution was the best to my case becaouse the IA evals the image once, (im sending only one prompt with all the questions)

Topic		Replies	Views
How to compare 2 image simialrity using OPenAI api API gpt-4 , api	17	26280	October 8, 2024
Image generation take longer than before.... why? API dalle3	8	11226	February 26, 2024
Integrate both gpt-4 and gpt-4 vision in same chat API gpt-4 , api , gpt-4-vision	4	1453	February 26, 2024
Question About Speed Of GPT4-Turbo W/Images API gpt-4 , api	5	1371	March 8, 2024
Get consistency in responses across different API calls to ChatGPT API	10	1431	July 20, 2024

Image analysis takes too long for lot of promts

Related topics