Where I am coming from is that I am trying to rule out common issues that can normally impede the functioning of a fine-tuned model. While we should consider the impact of the learning rate multiplier, I am trying to first eliminate other potential issues.
If you have not set a temperature when using the fine-tuned model and the default value is used, then likely this is not a factor that contributes to the problem. However, to be on the safe side, you might want to consider explicitly setting the temperature with a low value (zero or close to zero).
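To illustrate, here is a minimal sketch of pinning the temperature explicitly when calling the fine-tuned model. The model ID and messages are placeholders, not your actual values:

```python
# Sketch: explicitly set a low temperature instead of relying on the default.
# The model ID below is a placeholder; substitute your own fine-tuned model ID.
request_params = {
    "model": "ft:gpt-4o-mini:your-org::abc123",  # placeholder
    "messages": [
        {"role": "system", "content": "Same instructions as in your training data."},
        {"role": "user", "content": "Your input here."},
    ],
    "temperature": 0,  # explicit, (near-)deterministic sampling
}

# With the OpenAI Python SDK you would then pass these parameters along:
# client = OpenAI()
# response = client.chat.completions.create(**request_params)
```

With `temperature` set to 0 you can at least rule sampling randomness out as the cause of the varying output.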
The question regarding the user / system message was to ensure that you are including the same instructions (whether placed in a system or user message) that you used in your training data when using the fine-tuned model. Naturally, the code would be different but the instructions should be consistent. Stated differently, is there anything you do differently when you use the model compared to what was included in the training data?
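One quick way to verify this is to compare the system message you send at inference time against the system messages in your training JSONL. A minimal sketch, using a hypothetical in-memory training example in the chat fine-tuning format:

```python
import json

# Hypothetical training example in the chat fine-tuning JSONL format.
training_lines = [
    json.dumps({"messages": [
        {"role": "system", "content": "Extract fields as JSON."},
        {"role": "user", "content": "example input"},
        {"role": "assistant", "content": "{\"field\": \"value\"}"},
    ]}),
]

def system_prompts(jsonl_lines):
    """Collect the distinct system messages across training examples."""
    return {m["content"]
            for line in jsonl_lines
            for m in json.loads(line)["messages"]
            if m["role"] == "system"}

# At inference time, your system message should match one used in training.
inference_system_prompt = "Extract fields as JSON."
assert inference_system_prompt in system_prompts(training_lines)
```

In practice you would read the lines from your actual training file instead of the in-memory list shown here.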
What surprises me is that the output format varies when you use your model. That should normally not be the case, and I am trying to understand what could be causing it.
Finally, what was your training loss?