I fine-tuned gpt-3.5-turbo-0613 yesterday and saw a great improvement when I used the fine-tuned model in the playground and called it from my local environment. But when I deployed it to our live production environment, it didn't perform at all. The reason I fine-tuned was to give it a different tone, making it speak more concisely and conversationally, which I couldn't achieve with prompt engineering alone. So back to the question: has anyone run into the same issue? How did you resolve it? Many thanks in advance!
Are you giving the AI the same system prompt in your API calls as you used during fine-tuning?
Have you built chatbot software that feeds the previous conversation back as user/assistant role messages, so the AI knows what you were previously talking about?
Thanks for your reply! Yes, I give the API the same system prompt as in fine-tuning, and I feed back the previous six exchanges so the AI knows what we were talking about.
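For reference, a minimal sketch of the rolling-window approach described above: the system message always comes first, followed by the most recent user/assistant exchanges. All names here are illustrative, not the poster's actual code:

```python
def build_messages(system_prompt, history, max_exchanges=6):
    """Build a chat message list: the system prompt first, then the most
    recent user/assistant exchanges (each exchange is two messages)."""
    recent = history[-(max_exchanges * 2):]  # keep the last N exchange pairs
    return [{"role": "system", "content": system_prompt}] + recent

# Illustrative conversation history: ten question/answer pairs.
history = []
for i in range(10):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

messages = build_messages("Speak concisely.", history)
print(len(messages))  # 13: one system message plus six exchanges
```

The key point is that trimming must never drop the system message, only the oldest turns.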
Are you getting any evidence that your fine-tuned model is actually being used? Forgetting to specify the model, or specifying it incorrectly, could leave you with a base model that only follows the system prompt. The full API response includes the model that was actually used.
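One quick sanity check along these lines: fine-tuned model IDs begin with `ft:`, so you can inspect the `model` field of the response before trusting the output. The response below is a trimmed, hypothetical example of the chat completion response shape, not real production data:

```python
# Trimmed, hypothetical API response; the real one carries more fields.
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "model": "gpt-3.5-turbo-0613",  # the model the API actually served
    "choices": [{"message": {"role": "assistant", "content": "Hi!"}}],
}

served = response["model"]
is_fine_tuned = served.startswith("ft:")
print(served, is_fine_tuned)  # here the base model was served, not the fine-tune
```

Logging this field in production makes a silent fallback to the base model immediately visible.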
Replicating all playground parameters at temperature 0.1 and top_p 0.1 should get you nearly identical responses. "Generate code" in the playground produces a Python example with those parameters.
And in case it wasn't clear: to replicate the "system" box of the playground, you must insert a system role message permanently as the first message; that is what invokes the behaviour you have tuned.
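Putting the two points above together, a sketch of a request that replicates the playground setup: the system message first, conversation history after it, and temperature and top_p pinned at 0.1. The fine-tune ID and message text are made-up placeholders; in the real call you would pass this payload to the chat completions endpoint (e.g. `openai.ChatCompletion.create(**payload)` in the Python library of that era):

```python
payload = {
    # Placeholder fine-tune ID; use the exact "ft:..." ID from your account.
    "model": "ft:gpt-3.5-turbo-0613:my-org::abc12345",
    "messages": [
        # System message always inserted first, same text as in your training data.
        {"role": "system", "content": "You are a concise, conversational assistant."},
        # ...previous user/assistant turns go here...
        {"role": "user", "content": "How do I reset my password?"},
    ],
    "temperature": 0.1,  # low randomness, near-deterministic replies
    "top_p": 0.1,
}
# The actual network call would be something like:
# response = openai.ChatCompletion.create(**payload)
```

With both samplers this low, a tone difference between environments almost certainly means the payloads differ, not the sampling.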
Yes, the difference in tone between replies to the same question in the playground/local environment and in production is very obvious, so I think that's evidence enough.
I can also confirm that all the parameters are the same, and the system message is identical.
Feel free to post some code, and we can take a look.
Examples are helpful to root out possible problems…
Thank you everyone! By looking at the logs I realized there was an extra message attached to the system prompt.
Sorry to have taken your time. I'll be more careful next time!