Fine-tuning fails due to zero examples

I am trying to fine-tune GPT-4o-2024-08-06 for image recognition. I have training jsonl file with 20 examples and validation jsonl file with 10 examples. Each example is in the required compact format according to the vision fine-tuning documentation. I am using image_url instead of uploading the images encoded in base64. Every time I create the fine-tuning job, I get the following error message: “Training file has 0 example(s), but must have at least 10 examples”.

Here is an example:
{“messages”: [{“role”: “system”, “content”: “You are an assistant that identifies objects.”}, {“role”: “user”, “content”: “What is this object?”}, {“role”: “user”, “content”: [{“type”: “image_url”, “image_url”: {“url”: “url/image01.jpg”}}]}, {“role”: “assistant”, “content”: “It’s a book”}]}

Thank you for your help.

1 Like

Welcome @dr-torres

Do the images in your dataset have any of these?

  • People
  • Faces
  • Children
  • CAPTCHAs
2 Likes

No, I manually checked. In fact, I am able to fine-tune with these images in base64, but not when I use the URL. I am using a public GitHub repo to store the dataset.

In that case it is likely that the image fetch utility is being blocked or unable to fetch images from the URLs.

2 Likes

Thank you; it will be helpful if the API returned an error code. I will try reading the dataset from a different public storage to double check.

3 Likes

I think I figured it out. I need to use raw content URL:

https://raw.githubusercontent.com/<user>/<repo>/<branch>/<filename>

otherwise OpenAI API will not be able to read the file.

I hope this is helpful to other people. :grinning: