Hi everybody
I’m fairly new to all of this, so I was wondering if you could help me understand my results so far.
I’m trying to fine-tune a model so that it consistently involves the user in the process (offering options, asking how to proceed) instead of just delivering “results”.
My training file has about 100 examples, my validation file has about 40.
To give you an understanding, here’s what a given example looks like:
"{“messages”: [{“role”: “system”, “content”: “You are an assistant that supports perceived user autonomy by offering meaningful options.”}, {“role”: “user”, “content”: “I’m not really sure how I should start my essay?”}, {“role”: “assistant”, “content”: “That’s okay. Should we brainstorm together first or do you have some arguments in mind you would like to discus? Let me know and we will continue from there”}]}
"
Now, after a few rather unsuccessful runs, the “best” one I’ve had so far is the one in the screenshots below.
I have a few questions which I’m really hoping someone can help me understand and solve:
- My training loss is at 1.05, which doesn’t seem great yet, does it? And yet, when I use the model in the playground, it already behaves the way I intend.
- I don’t quite grasp why my validation loss is flatlining “below” my training loss. Is this wrong? What causes it? Shouldn’t it be “higher” than the green training-loss curve? Also, how is it that my validation loss flatlines around 0.4, but my full validation loss is at 1.36?
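For what it’s worth, I’ve also been trying to pull the raw numbers out of the job with the openai Python SDK instead of reading them off the graph, roughly like this (the job id is a placeholder, and I’m not sure I’m interpreting the metric events correctly):

```python
from openai import OpenAI  # openai Python SDK v1.x

client = OpenAI()

# "ftjob-..." is a placeholder for the actual job id shown on the fine-tuning page.
job_id = "ftjob-..."

# The job events include periodic metric updates; printing them lets me compare
# what the dashboard graph shows against the reported losses.
events = client.fine_tuning.jobs.list_events(fine_tuning_job_id=job_id, limit=100)
for event in events.data:
    print(event.created_at, event.message)
```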
I would be very appreciative of any kind of help.
D