Vision on fine-tuned models

Hi!
I’ve created a fine-tuned model based on gpt-4o-mini, and when I try to send it an image, I get one of the following errors, depending on how the image is provided (upload or URL):

Error: this model does not support vision.
Or
Invalid image URL: ‘messages[6].content[0].image_url.url’. Expected a base64-encoded data URL with an image MIME type (e.g. ‘data:image/png;base64,aW1nIGJ5dGVzIGhlcmU=’), but got a value without the ‘data:’ prefix.

Fine-tuned models do not have vision capabilities.

Welcome @mazur.stas

Can you share the code for the API call that leads to this error?

@sps I got these errors in the Playground UI.

Can you test with the following code:

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
  model="your gpt-4o-mini fine-tuned model here",
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

print(response.choices[0])

I got an error with the following message:

Invalid image URL: ‘messages[0].content[1].image_url.url’. Expected a base64-encoded data URL with an image MIME type (e.g. ‘data:image/png;base64,aW1nIGJ5dGVzIGhlcmU=’), but got a value without the ‘data:’ prefix.
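
For reference, the error message asks for a base64-encoded data URL instead of a plain https link. Below is a minimal sketch of how such a URL can be built using only the Python standard library. The helper names `bytes_to_data_url` and `image_file_to_data_url` are hypothetical and not part of the OpenAI SDK. Note that, as reported at the top of this thread, sending an uploaded image to the fine-tuned model produced "this model does not support vision", so changing the URL format alone may not make the request succeed.

```python
import base64
import mimetypes
from pathlib import Path


def bytes_to_data_url(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data URL: data:<mime>;base64,<payload>."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_file_to_data_url(path: str) -> str:
    """Read a local image file and return it as a base64 data URL.

    The MIME type is guessed from the file extension (hypothetical helper).
    """
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {path}")
    return bytes_to_data_url(Path(path).read_bytes(), mime_type)


# The sample payload from the error message is the string "img bytes here".
print(bytes_to_data_url(b"img bytes here", "image/png"))
```

The returned string would go in the "url" field of the "image_url" content part in place of the https link.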

@anon22939549 is correct.

It looks like a fine-tuned gpt-4o-mini model does not support vision.

I need a fine-tuned model that can understand images and text together. Are there any models that support both vision and fine-tuning? Or any workarounds? Thanks for any suggestions.