Fine-tuning: Starting a conversation

wmodes · November 1, 2023, 2:15am

Hello, new to the forum. I have lots of specific questions about fine-tuning that I will get into over time. For this one, I’ve searched all over and not found a definitive answer to this question.

I am fine-tuning gpt3.5-turbo for a conversational chatbot. At the moment, I’m focused on conversational starts. In general, I know the model responds to a prompt, but how about that first “How can I help you?” moment?

Unmodified, I was able to add a system message at the top of messages that (among other things) said:

  {
    "messages": [
      {
        "role": "system",
        "content": "...Start the conversation with some variation on, \"Hey, how can I help?\""
      }, {
        "role": "assistant",
        "content": "Hey, how's it going?"
      }, {
        "role": "user",
        "content": "..."
      }
    ]
  }

That worked well enough for the base model. But now I’m fine-tuning.

So, now, it’s not about telling, but showing. I have training dataset examples that demonstrate variations on the conversation start, reckoning 5 variations would be sufficient, but how to format them in an effective way?

Suggested strategies for a conversational start appeared non-existent. (Bard and ChatGPT had opinions, but don’t they always?) Here’s some strategies I heard or made up. I’ve pretty printed the JSONL for clarity.

Provide training examples that simply have the user role turn omitted, for example:

  {
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful, but not too helpful assistant.",
      }, {
        "role": "assistant",
        "content": "Hey, how's it going?"
      }
    ]
  }

Provide training examples that explicitly say in the system message that this is a conversation start (thanks, Bard):

  {
    "messages": [
      {
        "role": "system",
        "content": "Start of conversation.",
      }, {
        "role": "assistant",
        "content": "Hey, how's it going?"
      }
    ]
  }

Provide training examples that explicitly say in the system message that this is a conversation start but also contain other system messages:

  {
    "messages": [
      {
        "role": "system",
        "content": "Start of conversation. You are a helpful, but not too helpful assistant.",
      }, {
        "role": "assistant",
        "content": "Hey, how's it going?"
      }
    ]
  }

Provide training examples that demonstrate a user starting the conversation, then basically hide that start from the UI:

  {
    "messages": [
      {
        "role": "system",
        "content": "Start of conversation. You are a helpful, but not too helpful assistant.",
      }, {
        "role": "assistant",
        "content": "Hi"
      }, {
        "role": "assistant",
        "content": "Hey, how's it going?"
      }
    ]
  }

Regardless of the strategy, I had several variations for the model to learn:

{"messages": [{"role": "system", "content": "Start of conversation. You are a helpful, but not too helpful assistant."}, {"role": "assistant", "content": "Hey, how's it going?"}]}
{"messages": [{"role": "system", "content": "Start of conversation. You are a helpful, but not too helpful assistant."}, {"role": "assistant", "content": "Hey, how can I help?"}]}
{"messages": [{"role": "system", "content": "Start of conversation. You are a helpful, but not too helpful assistant."}, {"role": "assistant", "content": "We probably have limited time, so what can I do to help?"}]}
{"messages": [{"role": "system", "content": "Start of conversation. You are a helpful, but not too helpful assistant."}, {"role": "assistant", "content": "Hey, thanks for coming. Whacha need?"}]}
{"messages": [{"role": "system", "content": "Start of conversation. You are a helpful, but not too helpful assistant."}, {"role": "assistant", "content": "What's on your mind?"}]}

Experimenting with various strategies, the fine-tuned model usually started with some variation of my greeting, but variously often would just generate nonsense, suggesting to me that my strategies were not super sound.

Is there a suggested best practice, hopefully backed by solid results?

N2U · November 1, 2023, 2:40am

Hey mate and welcome to the community forum!

Great first post, let’s dig in:

The model is trained for a conversation structure that goes “system → user → assistant”. You will have much more success if you maintain that formatting.

You can save a few tokens by hard-coding the greeting in the UI. Think of it like a “prompt for the user”.

Remember that your fine-tuning dataset should consist of good examples that align with the behavior and style you want the model to exhibit. If you want the model to do more than just greet, you’ll need to include examples of this in your dataset.

wmodes · November 1, 2023, 4:30am

Thanks for the welcome and the kind words.

Ah yes, that’s a 5th strategy I considered. I suspect that’s what ChatGPT does, a hardcoded greeting with the first message sent after the user’s prompt.

However, for a chatbot – and I know this is a small thing – it’s not very artful. How we start a conversation, the living variation we bring to it, sets a tone. And so ideally, I want the model to have the first word.

Curious what successful strategies y’all have brought to this? Have you hard-coded that first utterance, or do you have a best practice that indicates to the model what its role is when the convo first starts?

_j · November 1, 2023, 4:38am

You cannot train as you show and expect productive results.
You show examples of running the AI with no user input and generation of random results for the same input.

You can use assistant messages for multi-shot API calling, like simulating an introductory conversation where the AI talks about its purpose or generates answers in the required format as examples to teach it. However the whole point of fine-tune is that you show the model how it can simply answer the user without needing instructions or guidance.

With gpt-3.5-turbo, we need the system message to set a whole new identity separate from how chat has been massively trained on general chat. If you always fine-tune and always call with the same 20 words of system prompt, that puts the AI very much in the space of “aha, I’m now completing this response as though I am a special mode”.

So, there is no point of extra AI assistant messages of introduction. You’d have to provide them again to activate that path that goes nowhere. You will instead show the AI in your examples that it is not so helpful by DOING: having it give not so helpful answers to a broad swath of user questions, and then it knows “megalobot” in the system message operates that way.

wmodes · November 2, 2023, 10:10pm

I shifted to hard-coded conversation starts, selected randomly from an array. Not ideal, but I was unable to reliably have the trained model return a starter greeting.

Topic		Replies	Views
Training gpt-3.5 to autocomplete for a niche domain and a specific writing style Community chatgpt	13	1896	July 25, 2024
Can I fine-tune the model without the prompt and answer for the "system" role? API gpt-35-turbo , chatgpt , api	12	6446	January 29, 2024
How does gpt-3.5-turbo fine-tuning work? API gpt-35-turbo , fine-tuning	10	1918	September 11, 2023
Zero-shot prompt works - now what? API question , gpt-4	9	1600	May 20, 2023
Fine tuned model gives empty responses API fine-tuning-problems	12	2080	December 15, 2023

Fine-tuning: Starting a conversation

Related topics