I am trying to extract complex data from images. Normally, I include examples in my API request, but that takes a lot of tokens.
I am now trying to fine-tune the gpt-4o-2024-08-06, giving examples of how to answer those images:
jsonl file:
{“messages”: [{“role”: “system”, “content”: “My explenation”}, {“role”: “user”, “content”: [{“type”: “image_url”, “image_url”: {“url”: “data:image/jpg;base64, base64_string”}}, {“type”: “text”, “text”: “Do that with this picture”}]}, {“role”: “assistant”, “content”: “Correct answer”}]}
{“messages”: [{“role”: “system”, “content”: “My explenation”}, {“role”: “user”, “content”: [{“type”: “image_url”, “image_url”: {“url”: “data:image/jpg;base64, base64_string”}}, {“type”: “text”, “text”: “Do that with this picture”}]}, {“role”: “assistant”, “content”: “Correct answer”}]}
But I get this error: “The job failed due to an invalid training file. Invalid file format. Please remove all mentions of ‘image_url’ from your file and try again.”
Is it even possible to fine-tune a model with images? What am I doing wrong here?