I want to fine-tune GPT-4o mini with 1,000 examples like this:
```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are a sales expert responsible for delivering professional and effective communication. Follow these principles: 1. Greet confidently, assess customer needs with strategic questions, and offer tailored solutions. 2. Keep all messages concise, engaging, and action-oriented, avoiding unnecessary details or passive tones. 3. Always close deals promptly after presenting solutions and clearly define the next steps to guide customers smoothly."
    },
    {
      "role": "user",
      "content": "Can you tell me about your product for language learning?"
    },
    {
      "role": "assistant",
      "content": "Absolutely! Our app is designed to make language learning engaging and effective. What specific features or benefits are you looking for in a language-learning solution?"
    },
    {
      "role": "user",
      "content": "I want something interactive that helps with speaking skills."
    },
    {
      "role": "assistant",
      "content": "That’s perfect! Our app includes interactive speaking exercises with real-time feedback and AI-based conversation partners. Would you like to explore the pricing options and a free trial to get started today?"
    }
  ]
}
```
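For reference, I plan to save one example per line in a JSONL file and kick off the job roughly like this (a sketch using the openai Python SDK; the snapshot name is just the fine-tunable gpt-4o-mini version I found in the docs):

```python
from openai import OpenAI

client = OpenAI()

# Upload the JSONL file containing the 1,000 chat-format examples.
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)

# Start the fine-tuning job against a gpt-4o-mini snapshot.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",
)
print(job.id)
```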
The problem is that I’m not sure how to properly construct the prompt for this model. I understand that the system message should remain exactly the same at inference time as it was during training.
However, in a chatbot like this it’s important to retain the conversation history, and since I will be using RAG (Retrieval-Augmented Generation), I also need to include the instructions and retrieved context.
Is this approach correct?
I want to use both fine-tuning and RAG. My goal is for the model to learn how to communicate from these 1,000 examples, supported by additional instructions. However, since this is a chatbot designed for sales, it also needs to be able to refer to links and retrieve up-to-date information.
Is this approach viable? How can I combine fine-tuning with RAG effectively for this purpose?
If the job of your fine-tune is to capture tone, then train it to map from the normal LLM tone to your branded tone. Then use RAG to inject knowledge, and the fine-tune to apply the tone.
A simple example:
User: “Can you tell me about your product for learning French?”
RAG Context: “French for Dummies”
Standard LLM response: “We have French for Dummies.”
Preferred LLM response: “Of course! We have a top-tier rated product called French for Dummies! Can I tell you more?”
OK, so your fine-tune training pairs would be:
Input: “We have French for Dummies.”
Output: “Of course! We have a top-tier rated product called French for Dummies! Can I tell you more?”
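In the chat fine-tuning format from your post, one such pair might look like this (a sketch; the stylizer system line is something you’d invent yourself and keep identical across all examples):

```python
import json

# Hypothetical stylizer training pair: the user turn carries the plain
# draft answer, the assistant turn carries the branded rewrite.
example = {
    "messages": [
        {
            "role": "system",
            "content": "Rewrite the draft answer in our upbeat, action-oriented sales tone.",
        },
        {"role": "user", "content": "We have French for Dummies."},
        {
            "role": "assistant",
            "content": "Of course! We have a top-tier rated product called French for Dummies! Can I tell you more?",
        },
    ]
}

# One JSON object per line in the training JSONL file.
with open("stylizer_training.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(example, ensure_ascii=False) + "\n")
```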
So in operation: you get your RAG context, let the LLM respond “normally”, then apply your fine-tune to that draft response, and the restyled output is what actually goes to the user.
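A minimal sketch of that two-pass flow with the OpenAI Python SDK (the model names, the retrieve_docs helper, and both system prompts are placeholders, not anything official):

```python
from openai import OpenAI

client = OpenAI()

def retrieve_docs(question: str) -> str:
    # Placeholder for your actual RAG retrieval step.
    return "French for Dummies: top-tier rated, interactive speaking drills."

def answer(user_question: str) -> str:
    # Pass 1: the vanilla model answers factually from the RAG context.
    draft = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using this context:\n" + retrieve_docs(user_question)},
            {"role": "user", "content": user_question},
        ],
    ).choices[0].message.content

    # Pass 2: the fine-tuned stylizer rewrites the draft in the branded tone.
    # Its system message must match the one used in training.
    styled = client.chat.completions.create(
        model="ft:gpt-4o-mini-2024-07-18:your-org::abc123",  # placeholder fine-tune ID
        messages=[
            {"role": "system", "content": "Rewrite the draft answer in our upbeat, action-oriented sales tone."},
            {"role": "user", "content": draft},
        ],
    ).choices[0].message.content

    return styled
```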
I want the model to learn from 1,000 examples and respond more naturally, like a human. As I understand it, simply providing so many examples in a prompt won’t achieve this effect, especially given the token limit. Am I correct in thinking that fine-tuning is necessary for this level of naturalness?
Also, if I fine-tune the model, is it correct that I can still provide the following during inference?
A standard system message.
Conversation history.
Additional instructions or context.
I want to make sure I can combine these elements effectively for a chatbot that feels more human-like.
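Concretely, I’m imagining each request being assembled something like this (just a sketch of the structure; the strings are placeholders, and I don’t know yet whether a second system message or appending to the first works better):

```python
def build_messages(history: list[dict], rag_context: str, user_message: str) -> list[dict]:
    """Assemble one inference request: the fixed system prompt from training,
    retrieved context, prior turns, then the new user message."""
    system_prompt = (
        "You are a sales expert responsible for delivering professional "
        "and effective communication. ..."  # same system content as in training
    )
    messages = [{"role": "system", "content": system_prompt}]
    # Retrieved documents injected as an extra system message.
    messages.append({"role": "system", "content": "Reference material:\n" + rag_context})
    # Prior turns, e.g. [{"role": "user", ...}, {"role": "assistant", ...}].
    messages.extend(history)
    messages.append({"role": "user", "content": user_message})
    return messages
```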
I would steer the model with System in your vanilla non-fine-tuned LLM calls.
For the fine-tune, usually you want to keep the same System as training.
All your RAG context would go into the vanilla version, not the fine-tune.
So the fine-tune just does one thing: it acts as a stylizer, and its input format stays essentially unmodified from your training.
Obviously feel free to experiment and let the community know what works and what doesn’t work. There are no hard and fast rules, at least not at this time.