Excessive talkativeness of the fine-tuned neural network

Hi all!
I need some help!
I'm experimenting with fine-tuning and can't solve a problem with my fine-tuned models' responses.
I have 300 question/answer pairs, and I trained all the available models on them: davinci, curie, babbage, ada. In the dataset I used "->" to mark the end of each "prompt" and the symbol "\n" to mark the end of each "completion".
I played with hyperparameters but couldn't solve the problem: when I check the results in the Playground, all my models generate very long answers. The first sentence is a normal answer, and then nonsense follows: the model writes a question from the dataset and answers it, often duplicating questions and answers three times. This happens with all four fine-tuned models.
I tried using stop sequences, but they have no effect at all.
I would be very grateful if you could tell me what I'm doing wrong.
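
For reference, here is roughly what the training data looks like (the question/answer pairs below are made up for illustration; my real dataset has 300 of them):

```python
# Illustrative sketch of the JSONL format described above: "->" ends
# each prompt and "\n" ends each completion. The Q/A pairs are
# hypothetical stand-ins for the real dataset.
import json

examples = [
    {"prompt": "What color is a quince? ->",
     "completion": " A ripe quince is golden yellow.\n"},
    {"prompt": "What color is a grape? ->",
     "completion": " Grapes range from green to deep purple.\n"},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```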

Thank you J!
Let me clarify.
So I should change the dataset so that at the end of each "prompt", instead of "->", I insert "\n\n\n\nAI:", and at the end of each "completion" I replace "\n" with "\n\nUser:"?
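
Something like this sketch, perhaps (the file names and exact rewrite logic are my own illustration, continuing from the train.jsonl example above):

```python
# Hypothetical reformatting pass: swap the "->" prompt terminator for
# "\n\n\n\nAI:" and the trailing "\n" completion terminator for
# "\n\nUser:".
import json

def reformat(example: dict) -> dict:
    prompt = example["prompt"].rstrip()
    if prompt.endswith("->"):
        prompt = prompt[:-2].rstrip()
    completion = example["completion"].rstrip("\n")
    return {
        "prompt": prompt + "\n\n\n\nAI:",
        "completion": completion + "\n\nUser:",
    }

with open("train.jsonl") as src, open("train_fixed.jsonl", "w") as dst:
    for line in src:
        dst.write(json.dumps(reformat(json.loads(line))) + "\n")
```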

Yes, that looks good. The only caveat is that the separator should be unique: you shouldn't train on a multi-turn conversation string full of "User:" turns (to show the model how to converse at length using context) and still use the same separator, because the separator needs to be unique across all the training text. OpenAI doesn't really detail how they do this themselves.

The AI will then try to keep on simulating a conversation between a user and an AI, but now you have a very detectable stop phrase to specify in the API call: a stopping point where you'd need to provide the next question yourself.
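
For example, something along these lines (a minimal sketch assuming the legacy openai Python package and a placeholder fine-tune name):

```python
# Completion call against a fine-tuned model, with a stop sequence
# matching the new completion terminator so generation halts before
# the model invents the next "User:" turn.
import openai

response = openai.Completion.create(
    model="davinci:ft-your-org-2023-01-01",  # hypothetical fine-tune id
    prompt="What color is a quince?\n\n\n\nAI:",
    max_tokens=150,
    temperature=0.2,
    stop=["\n\nUser:"],  # cut generation off at the simulated next question
)
print(response["choices"][0]["text"].strip())
```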

Yes, you're right, OpenAI doesn't describe this point in much detail. I re-read all the documentation and didn't come to a solution :frowning:
Can you elaborate on your caveat: "The only caveat is that the separator should be unique: you shouldn't train on a multi-turn conversation string full of 'User:' turns... the separator needs to be unique across all the training text."
What does "multi-turn conversation string full of 'User:' turns" mean?

I knew that would be hard language to parse. Imagine a typical chat (which may or may not be why you decided to fine-tune your own model):

Human: Who’s the best basketball player?
AI: Michael Jordan: Widely regarded as the greatest basketball player of all time, Michael Jordan’s list of accomplishments is unparalleled. With six NBA championships, five MVP awards, and numerous scoring titles, he dominated the game during his era and left an indelible impact on the sport.
Human: Who would be #2?
AI: That is harder to determine, but I would say Kareem Abdul-Jabbar: Holding the record for the most points scored in NBA history, Kareem Abdul-Jabbar’s skyhook shot and dominance in the paint made him a legend of the game. With six NBA championships and six MVP awards, he remains one of the most decorated players in basketball history.
Human:

Now we see that the second question is one that requires contextual clues. Maybe we should train the AI how to answer questions like this, by also providing this type of full conversation as part of our fine-tune?

However, that conversation would have multiple instances of "AI:" in it, meaning that, used in the typical way, it can't also serve as the separator for training. We would have to write things a bit differently, either including more characters in the separator or making the prior "AI:" entries distinct somehow, while not confusing the model into unexpected output.
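
To make the collision concrete, here is a hypothetical multi-turn training prompt in which the intended separator already appears in an earlier turn:

```python
# Hypothetical multi-turn prompt: "AI:" appears in a prior turn, so it
# is no longer unique and cannot reliably mark where the completion
# should begin.
multi_turn_prompt = (
    "Human: Who's the best basketball player?\n"
    "AI: Michael Jordan, widely regarded as the greatest of all time.\n"
    "Human: Who would be #2?"
    "\n\n\n\nAI:"  # intended separator, but "AI:" already occurred above
)
assert multi_turn_prompt.count("AI:") > 1  # the separator is not unique
```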

I agree, it sounds complicated :slight_smile:
But maybe we just stop the model after the user's first question and its answer (this is exactly what I'm trying to do), and when the user asks a follow-up question, send that first question/answer pair along as context?
Then, with the user's third question, the two previous question/answer pairs would be sent as context together with it?
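
Roughly like this sketch, I imagine (my own illustrative code, again assuming the legacy openai package and a placeholder fine-tune id):

```python
# Rolling-context chat: each new question is prefixed with the prior
# question/answer pairs so the model can resolve follow-ups. The model
# id and separators follow the hypothetical format discussed above.
import openai

MODEL = "davinci:ft-your-org-2023-01-01"  # placeholder fine-tune id
history = []  # list of (question, answer) pairs

def ask(question):
    context = "".join(
        f"User: {q}\n\n\n\nAI: {a}\n\n" for q, a in history
    )
    prompt = context + f"User: {question}\n\n\n\nAI:"
    response = openai.Completion.create(
        model=MODEL,
        prompt=prompt,
        max_tokens=150,
        stop=["\n\nUser:"],
    )
    answer = response["choices"][0]["text"].strip()
    history.append((question, answer))
    return answer
```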

Yes. Without the massive training of a model like text-davinci-003, the completion models will start imitating their own prior output very quickly.

This can be good or bad.

You ask the color of a quince.
You ask the color of a grape.
You ask for a guide in employing the tenets of Rogerian Reflection in a client-centered therapeutic setting and its merits over alternatives.
