Feeding multiple videos in GPT-4o

anuar.yeraliyev · August 14, 2024, 3:00pm

Hi,

I want to feed 2-4 videos into the same prompt / request to GPT-4o. The videos are short 3-10 seconds each containing 5-20 frames, so it does fit in the context window.
My questions are:

Can I feed multiple videos such that the model understand they are different? (one hack is to overlay text and add skip frames between videos as one contiguous sequence of frames but that’s a little hacky)
How to align the transcriptions text with each video by frame or by second?

There is example in the cookbook just takes a single video as a sequence of frames named “introduction_to_gpt4o”, but I want to feed multiple.

Topic		Replies	Views
Does GPT-4o API Natively Support Video Input like Gemini 1.5? API api , gpt-4o	0	1042	May 29, 2024
Can GPT 4o mini model understand multiple images? API	2	358	September 18, 2024
Images input order with gpt-4 vision/omni API gpt-4-vision , gpt-4o	0	883	May 20, 2024
GPT-4 Vision Pixel Limitations API gpt-4	4	3346	March 26, 2024
Unable to pass multiple images through openai api to GPT-4o API pdf , image-reading , gpt-4o	1	1176	August 2, 2024

Feeding multiple videos in GPT-4o

Related topics