Impact of Fine-Tuning GPT-4o on Multi-Modal Capabilities

If I fine-tune a GPT-4o model on specific text examples, will I still be able to pass images to the model for inference? Also, will the fine-tuning on text examples impact the model’s performance with images?

Welcome to the community!

I believe so if the model is multi modal.

I do not believe so. I’ve heard fine-tuning for images might be coming eventually, though.

Hey @bangabanga have you ever successfully used a fine-tuned model for image prompts? I’m currently unable to get a response from a fine-tuned model when I input an image.