Multimodal gpt-4o-mini Fine tuning

thornecanyon · July 23, 2024, 11:38pm

Are we able to do this?

for this post:
/t/multimodal-image-fine-tuning-with-gpt-4/723327/3

the answer was no for gpt-4. any change of conditions?

anon22939549 · July 24, 2024, 1:14am

Not yet. OpenAI is always looking at bringing new features to developers and my understanding is that they would like to enable multimodal fine-tuning and will when it’s ready.

iarbel · July 24, 2024, 6:22am

This is a much needed feature

shure.alpha · July 25, 2024, 3:09am

We do need this feature
Maybe this will end the era of Open-VLM
upvote!

MattFerguson · July 25, 2024, 6:51pm

Damn I was really hoping the answer would be yes! I want to try and train image to screen coords for mouse clicking

anon22939549 · July 25, 2024, 6:57pm

It sounds like you’ve got a WFH job that has implemented some kind of activity tracking and you’re looking to build an even more advanced “mouse jiggler.”

shure.alpha · July 26, 2024, 7:26am

Can we convert image into text?

hendro.wibowo · October 18, 2024, 3:11am

@shure.alpha do you mean OCR? The answer is yes

Topic		Replies	Views
Multimodal (image) fine tuning with GPT-4 API gpt-4 , fine-tuning	17	7341	October 3, 2024
Is there plans to allow for training GPT4-Vision API gpt-4-vision	1	853	November 15, 2023
Fine-tuning the gpt-4-vision-preview-model API gpt-4 , fine-tuning , gpt-4-vision	8	6509	August 26, 2024
GPT-4 API multimodal access (images) API	8	13806	July 2, 2024
Why is native image markup still a hurdle for GPT models? And is Open AI working on such capabilities? Community chatgpt	0	73	February 18, 2025

Multimodal gpt-4o-mini Fine tuning

Related topics