Fine-tuning models: Accurate info vs. creative latitude

wmodes · November 12, 2023, 6:12pm

I’ve been hesitating to ask this because I’m still learning the vocabulary, limitations, and possibilities of fine-tuning the models. But if you’ll bear with me, I’ll fumble through my question.

I am creating an informational chatbot by fine-tuning gpt-3.5-turbo. On one hand, it has important information to impart, and on the other hand, it is also chatty and entertaining.

My training data was intended to serve two purposes: to give the model a flavor of the conversational style I am looking for (mined emails, chats, etc), and to train it to speak accurately about important facts it should know about the space.

Example of flavor text chat dataset example:

{
  "messages": [
    {"role": "system", "content": "<general info about classes>"},
    {"role": "user", "content": "I'm dealing with relationship issues which are putting a bit of a damper on school"},
    {"role": "assistant", "content": "Good luck! That can be hard."}
  ]
}

Example of informational dataset example:

{
  "messages": [
    {"role": "system", "content": "<general info about project>"},
    {"role": "user", "content": "Are there pontoons under the shantyboat?"},
    {"role": "assistant", "content": "No. Dotty is a barge-bottomed flat-hulled boat made of fiberglass over plywood. It has two skegs (like very short long keel) that run the length of the boat. It is powered by a pretty fancy 30hp Mercury 4-stroke outboard steered from the pilot's cabin. It floats like a champ."}
  ]
}

I have nearly 500 examples in my dataset, just about balanced between informational and stylistic, though that’s a fuzzy line.

It excels at the former, but gets uh, really creative about the latter, making up facts from whole cloth. If I dial up the temperature, it is more creative and entertaining (as expected), but if I dial down the temperature it gets robotic even repeating responses word for work. This is unaffected by a frequency_penalty param.

So my fumbly question is this: How can I tune the model to offer accurate information while still allowing creative latitude for non-factual issues?

anon10827405 · November 12, 2023, 6:15pm

You would still need to incorporate RAG into your pipeline even if you have fine-tuned the model with information.

wmodes · November 17, 2023, 4:21am

How were you thinking of triggering the retreival? I tried functions, but they are still a bit buggy with fine-tuned models.

I’m open to other suggestions as well.

Topic		Replies	Views
Candidate for fine-tuning: mapping complex user input to tightly-bound JSON Prompting fine-tuning	5	141	August 5, 2024
Why do some problems after fine-tuning the large model not match the answer API fine-tuning , api , fine-tuning-vs-rag	3	363	August 11, 2024
Fine-tuning 3.5 turbo to act as conversational AI like Non-Playable Character in games API fine-tuning	4	1579	October 4, 2023
Fine Tune GPT 3.5 for your own Knowledge/Facts API chatgpt , api	7	2691	December 22, 2023
Fine-tuning for more natural responses API fine-tuning	4	303	January 13, 2025

Fine-tuning models: Accurate info vs. creative latitude

Related topics