Fine-tuning success stories - new 2023 models, what are your results?

I can report that fine-tuning the new models appears to work with no sync markers at the end of the completion.

```
'choices': [{'text': '0', 'index': 0, 'logprobs': {'tokens': ['0'], 'token_logprobs': [0.0], 'top_logprobs': [{'0': 0.0, '1': -19.259766}]
```
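For context, that response came from a plain completions call with logprobs enabled. Here is a minimal sketch of that kind of call, assuming the pre-1.0 `openai` Python SDK; the model ID and prompt are placeholders, not the actual fine-tune:

```python
import openai  # pre-1.0 SDK (pip install "openai<1.0")

openai.api_key = "YOUR_API_KEY"

# Model ID and prompt are hypothetical -- substitute your own fine-tune.
resp = openai.Completion.create(
    model="ft:babbage-002:personal::example",  # placeholder fine-tuned model
    prompt="some input text to classify\n",    # placeholder prompt
    max_tokens=1,     # single-token label, so no stop sequence is needed
    temperature=0,    # always take the top token
    logprobs=2,       # return the top-2 token logprobs, as in the output above
)
print(resp["choices"][0]["logprobs"]["top_logprobs"])
```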

So there's no need for sync markers (to denote a stop sequence) or a space prepended to your desired output.
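In other words, the training file can apparently be plain prompt/completion pairs. A hedged sketch of that JSONL format with made-up examples, leaving the completions as bare labels:

```python
import json

# Hypothetical two-class training examples: completions have no trailing
# sync marker and no leading space before the label.
examples = [
    {"prompt": "first labeled text\n", "completion": "0"},
    {"prompt": "second labeled text\n", "completion": "1"},
]
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```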

However, the training loss curve I got was weird:

[training loss plot]

Also, I learned that the service chooses the number of epochs based on an analysis of your training file. It gave me 3 epochs, as opposed to the old fixed default of 4.
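If you'd rather pin the epoch count than let the service pick, the fine-tuning jobs endpoint accepts it as a hyperparameter. A sketch assuming the same pre-1.0 SDK; the training file ID is a placeholder:

```python
import openai  # pre-1.0 SDK

# Let the service pick n_epochs from its analysis of the training file:
auto_job = openai.FineTuningJob.create(
    training_file="file-abc123",  # placeholder file ID
    model="babbage-002",
)

# Or force the old fixed default of 4 epochs explicitly:
fixed_job = openai.FineTuningJob.create(
    training_file="file-abc123",  # placeholder file ID
    model="babbage-002",
    hyperparameters={"n_epochs": 4},
)
```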

I will run these models side by side to see if they disagree (new with/without sync markers vs. old); a rough sketch of that comparison loop is below.
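All model IDs and prompts here are placeholders for the actual fine-tunes and evaluation set:

```python
import openai  # pre-1.0 SDK

# Placeholder model IDs -- substitute the real fine-tuned model names.
MODELS = {
    "new_no_sync": "ft:babbage-002:personal::no-sync",
    "new_with_sync": "ft:babbage-002:personal::with-sync",
    "old_default": "babbage:ft-personal-old",
}

prompts = ["example input 1\n", "example input 2\n"]  # hypothetical eval set

for prompt in prompts:
    answers = {}
    for name, model in MODELS.items():
        resp = openai.Completion.create(
            model=model, prompt=prompt, max_tokens=1, temperature=0
        )
        answers[name] = resp["choices"][0]["text"]
    # Flag any prompt where the three models don't all agree.
    if len(set(answers.values())) > 1:
        print(prompt.strip(), answers)
```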
