Fine-tuned model produces responses that make it seem like it hasn't been fine-tuned at all

Recently I fine-tuned a model (gpt-3.5-turbo) with exactly 300 training samples. The data has very specific and repetitive patterns which, if trained on properly, should produce the results I want 100% of the time. In fact, I already trained a different custom model before with fewer samples from the same dataset (~200), and that one works just how I want it to (aside from some weird occurrences, hence why I'm training a new model with more samples). The remaining samples were added to the previous dataset later on to bring it up to 300. But the new custom model just produces very basic responses, ones that the regular gpt-3.5-turbo would clearly produce as well.

It's not returning any errors, and yes, I'm using the correct model ID in my API requests. It even does the exact same thing in the Playground. Another weird thing is that the new model is also noticeably slower (I'm talking 15 to 30 seconds slower) than the old model. I really can't wrap my head around why this is happening.

One theory I have is that I replaced every system message in the training data with an empty string, since I thought it would save tokens and the system prompt wasn't doing much beyond "you are an AI assistant called …". But that still wouldn't explain the speed decrease.
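For reference, each training example ended up shaped roughly like this after the edit (the user/assistant content below is just a placeholder, not my actual data):

```python
import json

# Illustrative only -- placeholder content, not the real dataset.
# Shows the shape of a training example after the system message
# was replaced with an empty string.
example = {
    "messages": [
        {"role": "system", "content": ""},  # system prompt blanked out to save tokens
        {"role": "user", "content": "Example user prompt ..."},
        {"role": "assistant", "content": "Example desired response ..."},
    ]
}

with open("training_data.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(example) + "\n")
```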

That will do it.

The only way to use your fine-tuned model now is to replicate how you trained it: send the same empty system message in your requests, in the hope that matching the training format exactly activates the fine-tuned behavior.
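Something like this, for example. This is a minimal sketch assuming the current openai Python SDK (v1+); the fine-tuned model ID shown is a placeholder, so substitute your own:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Mirror the training format: send the same empty system message so the
# request matches the examples the model was fine-tuned on.
# The model ID below is a placeholder -- use your own fine-tuned model ID.
response = client.chat.completions.create(
    model="ft:gpt-3.5-turbo:my-org::abc123",
    messages=[
        {"role": "system", "content": ""},
        {"role": "user", "content": "Your actual prompt here"},
    ],
)
print(response.choices[0].message.content)
```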