I just tried sending an image to a -mini fine-tune and received: Error: this model does not support vision.
This may be because sending images to a fine-tune that was trained without any images gives no opportunity for filtering or predicting. If you make a model that only says “hot” or “not hot”, it’s not going to hit a fine-tune moderation error until you include the pictures that precede those answers.
Also, the default learning rate, with the hyperparameters left unmodified, quickly produces an over-specialized AI.
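
For what it's worth, here is a rough sketch of what overriding those defaults could look like. It assumes an OpenAI-style fine-tuning job request that accepts a `hyperparameters` object with `learning_rate_multiplier` and `n_epochs`. The file ID, the model name, and the values themselves are hypothetical placeholders, not recommendations. It only builds the request body and does not call any API.

```python
import json


def build_job_request(training_file_id, base_model, lr_multiplier=0.5, n_epochs=2):
    """Build a fine-tuning job payload with explicit hyperparameters.

    Assumption: the endpoint accepts a "hyperparameters" object with
    "learning_rate_multiplier" and "n_epochs" keys (OpenAI-style).
    The default values here are illustrative only.
    """
    if lr_multiplier <= 0:
        raise ValueError("lr_multiplier must be positive")
    if n_epochs < 1:
        raise ValueError("n_epochs must be at least 1")
    return {
        "training_file": training_file_id,
        "model": base_model,
        "hyperparameters": {
            # A smaller multiplier scales the learning rate down,
            # which may slow the drift toward over-specialization.
            "learning_rate_multiplier": lr_multiplier,
            "n_epochs": n_epochs,
        },
    }


# Hypothetical placeholders: substitute your own uploaded file ID and base model.
request = build_job_request("file-PLACEHOLDER", "BASE-MODEL-PLACEHOLDER")
print(json.dumps(request, indent=2))
```

Whether a lower multiplier or fewer epochs actually helps depends on the dataset, so treat the numbers as something to experiment with.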