GPT4O finetuning with vision capabilities

dejia · July 23, 2024, 8:59pm

Hi community,

Does anyone know how to finetune the vision capabilities of gpt-4o? It seems the training data only accepts .jsonl files. Where shoud we put image data?

Thanks!

grandell1234 · July 23, 2024, 9:00pm

You can not currently fine-tune the image properties.

hassaan.ahmed · July 24, 2024, 2:04pm

not even if you encode the image into a base_64 string and use the in the content along with the file type like: “role”: “user”,
“content”: [
{
“type”: “image_url”,
“image_url”: {“url”: img_dat},
},
{
“type”: “text”,
“text”: “can you turn this invoice into a csv file?”,
},
], where img_dat is the base_64 str

Topic		Replies	Views
Fine-tuning gpt-4o-2024-08-06 with images? API fine-tuning	2	1312	October 3, 2024
Fine-tuning gpt-4o on image data API fine-tuning , fine-tune	9	968	November 29, 2024
Multimodal (image) fine tuning with GPT-4 API gpt-4 , fine-tuning	17	7076	October 3, 2024
Fine Tuning for Vision models API gpt-4 , fine-tuning , fine-tuning-problems	2	119	March 3, 2025
Fine-tuning GPT-4o Mini with image data API gpt-4o-mini	3	1578	August 20, 2024

GPT4O finetuning with vision capabilities

Related topics