Fine-tuning gpt-4o on image data

The general release of gpt-4o-2024-08-06 model for fine-tuning begs a question whether we can fine-tune the multi-modal model using both text and images in the training/validation data.

I couldn’t find any references of being able to use images for fine-tuning in the official documentation or in the release notes, so I’m hoping someone on the forum might know a bit more on this topic.

Hi! Currently, fine-tuning with images is not supported. OpenAI has not yet communicated any timeline as to when this will become available.

1 Like

Now it’s available.

But I found a problem. Just did the fine-tuning of “gpt-4o-2024-08-06” with images (only images) and created an assistant with the fine-tuned model.

First test on playgorund returned an error:

Error streaming run: Invalid model: ft:gpt-4o-2024-08-06:******:AECFKe0O does not support image message content types.

What?

1 Like

Tested this today and it’s still failing

Welcome @stefania1

Can you describe how it’s failing ?

ETA: Can confirm this is happening.

Hi there! I can describe. I fine tuned today a model with images. Then chose it in my assistant and when I’m trying to send the image to this assistant - I get this error. It’s the same with the Playground or with the API:
Error streaming run: Invalid model: ft:gpt-4o-2024-08-06:*********:AG5ApoYs does not support image message content types

This appears to be an issue only with Assistant API. I fine-tuned a model with image data today, and successfully managed to call the fine-tuned model with image.

How did you manage to add the images in JSONL for the finetuning?

Did you add them as base64 or image url?

Here’s the official guidance for vision fine-tuning, which addresses this point:
https://platform.openai.com/docs/guides/fine-tuning/vision

1 Like