Fine-tuned a model, but it replies like it's insane

I've really got a lot of specific problems… :face_holding_back_tears: I fine-tuned a model (based on GPT-3). When I test it in the playground, it responds with long, long sentences plus some format-like explanations, even though I set a sentence limit (no more than 3 sentences) in the prompt. And it ends with an incomplete sentence:


The green part is all generated by the AI.

I thought maybe I should add a } at the end of the prompt, since the system provided a guide:

But when I put a } there, I just got this:

:rofl:

Does anyone know why this happened? Is it a problem with the prompt, or with the training data? Or should I adjust the parameters on the right side of the page?

Thanks a lot! :heartbeat: :heartbeat:


It takes an extreme amount of instruction training to make a model follow prompts like that or act as a chatbot. Like thousands of examples. Even having a system prompt with “user” and “assistant” colors the output in a predictably unpredictable way.

You can increase the frequency and presence penalties to cut down on repetitious phrases. Use a middle temperature, enough to break the monotony but not enough to make it output crazy things. Then use a prompt written for a completion engine. Babbage:

[screenshot “camilechat1”: an example completion-engine prompt]
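
A completion engine like babbage isn't a chatbot; it just continues whatever text you give it, so the prompt itself has to read like a transcript. Roughly like this (the persona line and dialogue are invented for illustration; only the “Camile” name comes from the example above):

```python
# Hypothetical completion-style prompt: a plain transcript for the model to
# continue. The persona text and turns are made up, not the actual screenshot.
prompt = (
    "Camile is a friendly chatbot who answers in three sentences or fewer.\n\n"
    "Human: Hi, who are you?\n"
    "Camile: Hi! I'm Camile. What would you like to talk about?\n"
    "Human: What do you do for fun?\n"
    "Camile:"
)
```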

You are likely doing fine-tuning that has no purpose if you can simply lead the model into the one-shot output you want.

If you don’t want it running a whole conversation, put a stop phrase in your API call, so that when it starts the next “Camile” turn, that prefix is what halts the output and is not displayed.
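
As a minimal sketch of that call with the old `openai` 0.x completions interface (the parameter values are illustrative, not the exact screenshot settings):

```python
import openai

# Illustrative transcript-style prompt ending on the bot's turn, as above.
prompt = "Human: Hi, who are you?\nCamile:"

response = openai.Completion.create(
    model="babbage",        # a completion engine, per the suggestion above
    prompt=prompt,
    temperature=0.5,        # middle temperature: varied but not crazy
    max_tokens=150,
    frequency_penalty=0.5,  # raise these to reduce repetitious phrases
    presence_penalty=0.5,
    stop=["\nCamile:"],     # halts output at the next "Camile:" turn;
                            # the stop text itself is not displayed
)
print(response["choices"][0]["text"])
```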


Thanks for the reply and detailed example. But how do I “put a stop phrase” in my API call? I mean, if I test the model in the playground, is there any way to control the fine-tuned model so it generates only its own turn of the conversation and doesn’t take the “user” part?


Does your training data have the same stop word? Can you show an example of your training dataset? How many epochs did you use? How many sample lines in the JSON?


Yeah, the stop word is automatically added by the OpenAI CLI tool. There are about 300 samples in the training set and 90 samples in the validation set. Each sample is composed of the dialogues this person conducted with different people, plus the situation in which the dialogue takes place, like:
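
Roughly in this shape, as a made-up sample in the JSONL prompt/completion format the CLI tool prepares (the dialogue text is invented, and ` END` stands in for whatever stop word the tool appended):

```jsonl
{"prompt": "Situation: catching up with a classmate after school.\nUser: Are you coming to the study group tonight?\nSophia:", "completion": " Sure, I'll be there around seven. Want me to bring my notes? END"}
```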


I’ll just paste some of my own bot code. The typical exchanges are prefixed with “AI:” or “Human:”, which is how most few-shot examples show the model how to respond before the actual question. You just put the word Human: as the stop phrase, so you get only one answer from the chatbot instead of it continuing to simulate more conversation:


```python
try:
    response = openai.Completion.create(
        model=ai_model,
        prompt=self.generate_prompt(),
        temperature=softmax_temperature,
        max_tokens=response_tokens,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0,
        stop=["\nHuman:"]  # cut the output before a simulated "Human:" turn
    )
except openai.error.Timeout as e:
    # Handle timeout error, e.g. retry or log
    print(f"OpenAI API request timed out: {e}")
except openai.error.APIError as e:
    # Handle API error, e.g. retry or log
    print(f"OpenAI API returned an API Error: {e}")
# ...
```

It seems your fine-tuning used `User:` and `Sophia:`, and the model continues chatting with itself. It also looks like it emits an extra carriage return, which can also serve as a signal. You can stop it at `\n\nUser:`
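
Concretely, that just means swapping the stop sequence in a call like the one above, for example (the model name here is a placeholder, not your fine-tune's real id):

```python
import openai

# Same pattern as the code above, with the stop matched to this fine-tune's
# own turn markers. "ft:your-model-name" and prompt are placeholders.
prompt = "User: Hi, how was your day?\nSophia:"

response = openai.Completion.create(
    model="ft:your-model-name",
    prompt=prompt,
    max_tokens=150,
    stop=["\n\nUser:"],  # blank line + "User:" = the model starting the next user turn
)
```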


Fine-tuning is not easy. It needs training, evaluation, and keeping track of performance. It could take a year to fine-tune a model to be as good as ChatGPT; ChatGPT itself is the byproduct of fine-tuning base models.
