So, I have been using the Chat Completions endpoint (3.5-Turbo) for a while, and I have built a nice product around it. I have now reached a point where I need to start fine-tuning to further improve the performance of my system (yes, I did try prompt engineering and single-/multi-shot examples; they were not sufficient).
To get hands-on, I ran a fine-tuning job (3.5-Turbo) with the OpenAI API. I will admit I have no idea how this works under the hood. I just read and followed the guidelines in the official OpenAI documentation on how much data to collect and how to prepare it, and then made the API calls.
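For context, my calls were basically the two standard steps from the fine-tuning guide, roughly like the sketch below (the file name and data are placeholders, and I am assuming the current `openai` Python SDK here):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Each line of the JSONL file is one training example in chat format, e.g.:
# {"messages": [{"role": "system", "content": "..."},
#               {"role": "user", "content": "..."},
#               {"role": "assistant", "content": "..."}]}
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),  # placeholder file name
    purpose="fine-tune",
)

# Start the fine-tuning job against the uploaded file
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```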
Now that I have fine-tuned it, I do see that, empirically, the model is significantly better. But when I looked into the fine-tuning job out of curiosity, I found this training loss graph (image attached).
Can someone explain to me in layman's terms what this means? Should the training loss decrease monotonically? What are the implications of this for my model? Would the performance improve further if I somehow “cleaned up” the data and had “better”/more data?
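(In case it helps anyone reproduce what I am looking at: the same per-step loss numbers behind the graph can also be pulled from the job's event stream, something like the sketch below; the job id is a placeholder.)

```python
from openai import OpenAI

client = OpenAI()

# "ftjob-abc123" is a placeholder; substitute your actual job id
events = client.fine_tuning.jobs.list_events(
    fine_tuning_job_id="ftjob-abc123",
    limit=100,
)
for event in events.data:
    # Metric events carry messages like "Step 42/314: training loss=0.87"
    print(event.created_at, event.message)
```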
Also, I want to learn more about fine-tuning LLMs, with more emphasis on practical guidelines and best practices, especially for building out my product. Can anyone suggest useful resources?
Thanks folks!