For fine-tuning for object detection, what's the expected/ideal format for preparing the annotations?
There isn't really one; it just depends on what your image encoder was trained on. In our case it appears to be RGB pixels, and whatever interpolation or resizing you do is up to you. We do perform cropping on our end (the tiles), but it's just image patches at the end of the day. Generally, higher resolution at a high fovea setting will yield better performance.
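For what it's worth, here's a minimal sketch of that kind of preprocessing, assuming Pillow; the 1024-pixel cap, JPEG quality, and the `encode_image_for_training` helper are all illustrative choices, not recommended values:

```python
import base64
import io

from PIL import Image  # pip install pillow


def encode_image_for_training(path: str, max_side: int = 1024) -> str:
    """Downscale an image so its longest side is at most max_side,
    then return a base64 data URL usable in a chat-format example."""
    img = Image.open(path).convert("RGB")  # encoder appears to expect RGB
    img.thumbnail((max_side, max_side))    # interpolation/resizing is up to you
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=90)  # quality is an arbitrary choice
    b64 = base64.b64encode(buf.getvalue()).decode("utf-8")
    return f"data:image/jpeg;base64,{b64}"
```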
We’ve seen a variety of formats work! Some folks use JSON, others make the model describe what’s in the scene. Are you looking for just object detection/counting or bounding box prediction as well?
I'm looking for the expected/ideal format for bounding box predictions in multi-object, multi-class scenarios.
Got it, generally people just use the regular JSON format, like a list of

```json
{
  "x": 123,
  "y": 456,
  "class": "horse"
}
```

or tuples like `(123, 456, horse)`. It's up to you!
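If it helps, here's a minimal sketch of how one training line in the chat-format JSONL could look under that convention. The prompt wording, the `detail: "high"` setting, and the `make_example` helper are all illustrative assumptions, not a prescribed format:

```python
import json


def make_example(image_url: str, boxes: list[dict]) -> str:
    """Build one JSONL line for vision fine-tuning: the user sends an
    image, the assistant answers with the annotation list as JSON."""
    example = {
        "messages": [
            {"role": "system", "content": "Detect objects and reply with JSON."},
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "List every object in this image."},
                    {"type": "image_url", "image_url": {"url": image_url, "detail": "high"}},
                ],
            },
            {"role": "assistant", "content": json.dumps(boxes)},
        ]
    }
    return json.dumps(example)


# e.g. make_example(url, [{"x": 123, "y": 456, "class": "horse"}])
```

Keeping the assistant reply as strict JSON also makes it easy to parse predictions back out at inference time.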
To be honest, our models aren't great at spatial reasoning, although vision fine-tuning has yielded dramatic improvements. Hopefully yours will be one of those cases!
Hey everyone, I've been fine-tuning 4o with images, but I ran into the issue that a lot of the images I use get filtered, even though, when checking them manually, they don't go against any of the policies. I'm now using the moderation API, but it's strange that the moderation categories don't mention any of the ones in the docs: faces, people, children, or CAPTCHAs. This is all I get from the output:
```
CategoryScores(
    harassment=0.0, harassment_threatening=0.0,
    hate=0.0, hate_threatening=0.0,
    illicit=0.0, illicit_violent=0.0,
    self_harm=1.1235328063870752e-05,
    self_harm_instructions=5.093705003229987e-07,
    self_harm_intent=1.8925148246037342e-06,
    sexual=8.481104172358076e-06, sexual_minors=0.0,
    violence=0.010131805403428543,
    violence_graphic=1.3552078562406772e-05,
    harassment/threatening=0.0, hate/threatening=0.0,
    illicit/violent=0.0,
    self-harm/intent=1.8925148246037342e-06,
    self-harm/instructions=5.093705003229987e-07,
    self-harm=1.1235328063870752e-05,
    sexual/minors=0.0,
    violence/graphic=1.3552078562406772e-05
)
```
I wanted to know whether the moderation API needs to be updated, or whether the answers to those checks are already embedded within some of these fields. Thanks!
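For reference, here's a sketch of how I'm sending an image to the moderation endpoint with the current Python SDK (the example URL is a placeholder); it only returns the text-style categories shown above, with nothing about faces, people, children, or CAPTCHAs:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

result = client.moderations.create(
    model="omni-moderation-latest",  # moderation model that accepts image inputs
    input=[
        {"type": "image_url", "image_url": {"url": "https://example.com/img.jpg"}},
    ],
)
print(result.results[0].category_scores)  # same fields as the output above
```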
Hi Everyone,
I attempted vision fine-tuning using some fictional manga illustrations (featuring characters that do not exist in reality), but I encountered the following error:
“Training file *** contains 13 examples with images that were skipped for the following reasons: contains faces, contains people. These examples will not be used for training. Please visit our docs to learn how to resolve these issues.”
How can I address this issue?
- Review the content moderation policy for vision fine-tuning: https://platform.openai.com/docs/guides/fine-tuning#content-moderation-policy
- Comply with the usage policy (if you want to catch flagged images before submitting a job, see the sketch below).
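One workaround is to pre-screen your images locally before building the training file. A rough sketch, assuming opencv-python and its bundled frontal-face Haar cascade; this is only a weak proxy for whatever classifier the training pipeline actually uses, and it may well miss stylized manga faces that the pipeline still flags:

```python
import cv2  # pip install opencv-python

# Haar cascade shipped with OpenCV; a rough stand-in for the training-side check
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)


def likely_contains_face(path: str) -> bool:
    """Heuristic pre-screen: True if OpenCV detects at least one face."""
    img = cv2.imread(path)
    if img is None:
        raise FileNotFoundError(path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces) > 0
```

At minimum this lets you count how many examples are at risk of being skipped before you pay for the job.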
The "vision" moderation (faces, people, CAPTCHAs) is not provided on the API for testing individual images, so you can't preview whether the classifier's quality is so low that it flags cartoons as containing people.
It is also not correct behavior for a fine-tune job to proceed with unwanted alterations to your training set, at your expense and without a high-quality report of what was dropped.