Fine-Tuned Models to Strictly Follow Instructions

I want to fine-tune a model to help students with their doubts in a way that only guides them along and never gives direct answers.

From what I have tested, it reveals the answer anyway when asked.
Here’s a real conversation extract:

System: .....Never give direct answers.

User: Show how to do it

AI:
Sure, here's a simple example of how you could implement printing a pyramid in C:

#include <stdio.h>

int main() {
    int rows, spaces, stars;

    printf("Enter the number of rows for the pyramid: ");
    scanf("%d", &rows);

    // Outer loop for rows
    for (int i = 1; i <= rows; i++) {
        // Print spaces
        for (spaces = 1; spaces <= rows - i; spaces++) {
            printf(" ");
        }

        // Print stars
        for (stars = 1; stars <= 2 * i - 1; stars++) {
            printf("*");
        }

        // Move to the next line after each row
        printf("\n");
    }

    return 0;
}
This code prompts the user to enter the number of rows for the pyramid. It then uses nested loops to print the appropriate number of spaces and asterisks for each row, forming a pyramid shape.

Give it a try and let me know if you have any questions or if there's anything you'd like to understand better!

How do I impose restrictions on the fine-tuned model so that it never gives direct answers, no matter what?

As it’s still under testing, I haven’t provided many examples. However, I suspect it would still provide direct solutions even if I had worked on more.

Did you include examples in your training data set that showed the model how it is supposed to reply, i.e. not provide a direct answer, and did you include that particular instruction in the system message in your examples?

Only the minimum required, i.e., 10 examples.

I am asking about the content of your messages. Did you include examples that demonstrated the desired behaviour?

Perhaps you could share one example from your training data set for review.

If you have done this and did not see any impact, then this is very likely due to the low number of examples. You tend to start seeing a meaningful impact from around 30 examples onwards.
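
As a practical aid, here is a minimal sketch in Python (the file name training_data.jsonl and the phrase it checks for are assumptions; adjust them to your own setup) that verifies each training example carries the restriction in its system message, ends with an assistant reply demonstrating the desired behaviour, and counts whether you are near the 30-example mark:

import json

TRAINING_FILE = "training_data.jsonl"   # assumed file name; point this at your own file
REQUIRED_PHRASE = "direct answer"       # assumed wording; match your actual system message
MIN_EXAMPLES = 30                       # rough threshold where impact tends to become visible

count = 0
with open(TRAINING_FILE, encoding="utf-8") as f:
    for line_no, line in enumerate(f, start=1):
        if not line.strip():
            continue
        example = json.loads(line)          # each line is one {"messages": [...]} object
        messages = example["messages"]

        # The first message should be a system message carrying the instruction.
        system = messages[0]
        if system["role"] != "system" or REQUIRED_PHRASE not in system["content"].lower():
            print(f"Line {line_no}: system message does not contain the no-direct-answer instruction")

        # Each example should end with an assistant turn showing the desired reply.
        if messages[-1]["role"] != "assistant":
            print(f"Line {line_no}: example does not end with an assistant reply")

        count += 1

print(f"{count} examples found")
if count < MIN_EXAMPLES:
    print("Consider adding more examples before drawing conclusions.")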

Welcome to the forum @amjuks.
I’d recommend updating the system message to clearly convey what you want the model to do.
Seconding what @jr.2509 suggested: you should include examples of users trying to ask the model to reply with a direct answer and the model responding in a way that is desirable to you.

Additionally, here’s a very similar gpt-4o demo from OpenAI you can take inspiration from for the system message:


Here’s an example (in JSON for readability):

{
    "messages": [
        {
            "role": "system",
            "content": "As an assistant, support students with uncertainties and questions by organizing responses into clear sections. Utilize markdown features for readability while encouraging interaction and avoiding direct answers. Redirect students to explore topics further if they persist, and conclude by offering additional assistance as needed."
        },
        {
            "role": "user",
            "content": "Can you just give me the code to find the second and third smallest numbers?"
        },
        {
            "role": "assistant",
            "content": "I understand you're eager to get the code, but understanding the process is important too! By sorting the list first, you're able to visualize how the numbers are arranged, making it easier to locate the second and third smallest numbers."
        },
        {
            "role": "user",
            "content": "Yeah, I get that, but I'm short on time. Can you please just give me the code?"
        },
        {
            "role": "assistant",
            "content": "I hear you're pressed for time, but grasping the concept will ultimately save you time in the long run. Plus, it's a valuable skill to understand the logic behind the code you're writing."
        },
        {
            "role": "user",
            "content": "Okay, I see your point, but I really need to finish this quickly. Can you please just provide the code?"
        },
        {
            "role": "assistant",
            "content": "I understand you're in a hurry, but I'm here to help you understand the concept so you can tackle similar problems in the future with confidence. How about I guide you through the code step by step?"
        },
        {
            "role": "user",
            "content": "Ugh, I guess. But can we make it quick?"
        },
        {
            "role": "assistant",
            "content": "Absolutely! Let's dive into it. First, we'll sort the list of numbers. Do you know how to do that?"
        },
        {
            "role": "user",
            "content": "Yeah, I know how to sort numbers. But what's next?"
        },
        {
            "role": "assistant",
            "content": "Great! After sorting the list, we'll access the elements at indices 1 and 2 to find the second and third smallest numbers. Ready to proceed?"
        },
        {
            "role": "user",
            "content": "Yeah, let's get this over with."
        }
    ]
}
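
For context, in the training file itself each example sits on a single JSONL line rather than being pretty-printed like above. A minimal conversion sketch (the file names example.json and training_data.jsonl are just placeholders) looks like this:

import json

SOURCE = "example.json"          # placeholder: the pretty-printed example above
TARGET = "training_data.jsonl"   # placeholder: the JSONL file used for fine-tuning

with open(SOURCE, encoding="utf-8") as f:
    example = json.load(f)

# Append the example as one line; repeat for every training example.
with open(TARGET, "a", encoding="utf-8") as f:
    f.write(json.dumps(example, ensure_ascii=False) + "\n")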

Another problem I’m experiencing is that responses aren’t going into deeper context, such as this:


The base 3.5 model would have explained this without fine-tuning.

Here’s a response where it shows the direct answer:

Thanks for sharing.

In principle, the example you provided goes in the right direction, so my initial thinking is that the low volume of examples is what keeps the model from picking up the desired pattern of behaviour yet. As per the above, try to get closer to 30 examples and hopefully you’ll start seeing a greater impact. Based on your existing examples, you can also leverage GPT models to create variations of them; this will speed up the process of creating training data.
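
To illustrate that last point, here is a minimal sketch using the Python SDK (the model name, prompt wording, and file names are assumptions, not a recommendation) that asks a GPT model to produce a rephrased variation of one of your existing examples and appends it to the training file:

import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Load one of your existing training examples (placeholder file name).
with open("example.json", encoding="utf-8") as f:
    example = json.load(f)

# Ask a model to rewrite the conversation while keeping the desired behaviour.
# Model name and prompt wording are assumptions; adjust to your needs.
response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "system",
            "content": (
                "You rewrite fine-tuning examples. Return JSON with the same structure and roles, "
                "keep the assistant's no-direct-answer behaviour intact, but rephrase the user "
                "messages and change the programming exercise being discussed."
            ),
        },
        {"role": "user", "content": json.dumps(example)},
    ],
)

variation = json.loads(response.choices[0].message.content)

# Append the variation as one more JSONL line in the training file (placeholder name).
with open("training_data.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(variation, ensure_ascii=False) + "\n")

Do review the generated variations before adding them, though; automatically generated examples can drift from the behaviour you want the model to learn.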

Also, to reiterate, ensure that your training examples reflect as closely as possible the expected user messages.
