How does gpt-3.5-turbo fine-tuning work?

Lawrence111 · August 25, 2023, 7:24am

Recently, I’ve been attempting to fine-tune gpt-3.5-turbo, but I’ve found myself a bit puzzled while going through the relevant documentation. The documentation provides this format as an example:

{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}

This is fairly straightforward. Since each sample contains only a single user/assistant exchange, the assistant message serves as the “target value” in the training process. However, I’d like to include more context in the training set, which would mean samples with more than three messages and roles that don’t strictly alternate, like this:

{
    "messages": [
        {"role": "system", "content": "..."}, 
        {"role": "user", "content": "..."}, 
        {"role": "assistant", "content": "..."},
        {"role": "user", "content": "..."}, 
        {"role": "assistant", "content": "..."},
        {"role": "assistant", "content": "..."},
        {"role": "user", "content": "..."}
    ]
}

Or this:

{
    "messages": [
        {"role": "system", "content": "..."}, 
        {"role": "assistant", "content": "..."},
        {"role": "assistant", "content": "..."},
        {"role": "assistant", "content": "..."},
        ...
    ]
}

In such cases, how will the training program handle these types of samples? I’m eager to find out.

Foxalabs · August 25, 2023, 10:05am

A conversation made up only of assistant roles would not normally occur. You could try adding user roles with nothing in the content section. I am also interested in your findings.

_j · August 25, 2023, 10:36am

Rather, the question is how the trained AI will handle your normal chatbot conversation turns and conversation history when you’ve trained it on a mix it won’t see from your application, that is, when you use fine-tuning to train the AI and the examples don’t correctly show how to follow conversational context.

More intriguing is that they’ve only shown these three turns, always with a system message. What about behavior with no “system” in fine-tuning? Or what about when you want the trained behavior, but far later than after a system prompt would be seen? More things you have to pay to experiment with.

Not correct. The input to a chat API call can be a very long conversation history with both user and assistant roles as prior inputs and responses. You are training the AI how to respond to what it receives.

For example, you might tune the AI on how to summarize or rewrite what it just produced (or other skills it doesn’t actually already have) by supplying a longer conversation with some history. It can be weighted by how the response is formed from the assistant role contents along with the instruction.
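A minimal sketch of supplying history this way. Nothing in this thread settles whether training learns from every assistant message in a sample or only the last one, so one cautious option is to expand a conversation into one sample per assistant turn, each carrying the preceding history as context. The helper name `expand_by_assistant_turn` and the example conversation are made up for illustration, not taken from the documentation.

```python
import json


def expand_by_assistant_turn(messages):
    """Return one sample per assistant message, each ending on that message."""
    samples = []
    for i, msg in enumerate(messages):
        if msg["role"] == "assistant":
            # Everything up to and including this assistant reply.
            samples.append({"messages": messages[: i + 1]})
    return samples


# Hypothetical conversation: the second reply rewrites the first one.
conversation = [
    {"role": "system", "content": "You are a concise editor."},
    {"role": "user", "content": "Describe the Moon."},
    {"role": "assistant", "content": "The Moon is Earth's only natural satellite."},
    {"role": "user", "content": "Rewrite that as a question."},
    {"role": "assistant", "content": "Is the Moon Earth's only natural satellite?"},
]

for sample in expand_by_assistant_turn(conversation):
    print(json.dumps(sample))
```

Each printed line is a standalone sample whose last message is the reply to be learned, at the cost of repeating the shared history.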

Foxalabs · August 25, 2023, 10:41am

How would a typical conversation with the model contain many assistant prompts one after the other? Sure, you could train the model with anything. I’m simply pointing out that a “typical” training set is: user message <some text>, then assistant message <some text>, rinse and repeat, with the possible inclusion of a system message. Any of those messages could be blank, but not “missing”.
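To make that “typical” pattern concrete, here is a rough sketch of a check you could run over a sample before uploading it. The rules it enforces (optional system message first, then strictly alternating user and assistant, ending on assistant) are only the convention described above, not documented requirements of the fine-tuning API, and `turn_order_problems` is a made-up helper name.

```python
def turn_order_problems(messages):
    """List the ways a sample departs from system?, user, assistant, user, ..."""
    if not messages:
        return ["sample has no messages"]
    problems = []
    for i, msg in enumerate(messages):
        if msg["role"] == "system" and i != 0:
            problems.append(f"message {i}: system message is not first")
    # Check alternation on the non-system messages only.
    turns = [m["role"] for m in messages if m["role"] != "system"]
    expected = "user"
    for pos, role in enumerate(turns):
        if role != expected:
            problems.append(f"turn {pos}: expected {expected}, got {role}")
        expected = "assistant" if role == "user" else "user"
    if turns and turns[-1] != "assistant":
        problems.append("sample does not end on an assistant message")
    return problems


typical = [
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
]
# Same role order as the first multi-turn sample in the original question.
uneven = typical + [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
    {"role": "assistant", "content": "..."},
    {"role": "user", "content": "..."},
]

print(turn_order_problems(typical))
print(turn_order_problems(uneven))
```

A non-empty result does not mean the sample will be rejected, only that it differs from what the model would normally see in a chat.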

_j · August 25, 2023, 10:43am

“How would a typical conversation with the model contain many assistant prompts one after the other?”

By vector context management like what ChatGPT uses, which drops whatever doesn’t convey information.

Also by injection of knowledgebase data augmentation before the user question.

Foxalabs · August 25, 2023, 10:45am

You seem to be making the assumption that ChatGPT uses vector storage of prior chat text. No such evidence exists, at least that I am aware of.

_j · August 25, 2023, 10:46am

I have such evidence, but the nicely formatted chat history dump by role is also full of jailbreaky stuff to make it happen.

aisidier88 · September 11, 2023, 10:47am

How can I fine-tune ChatGPT to make the AI’s responses more vivid or logical?

Foxalabs · September 11, 2023, 10:50am

You would have to give it examples of what you find vivid and logical, the more the better. They should be in the form of question-and-answer pairs.

aisidier88 · September 11, 2023, 11:05am

Can you give an example, such as how I can fine-tune on multi-round sales conversations? Thank you very much.

Foxalabs · September 11, 2023, 11:16am

Make the customer the user role and the salesperson the assistant role, and use blocks of conversation per training element. Start with introductions, then a mid-game and a closing section, for lots of sales examples.

Note that the AI will not learn the contents of the sales pitches or the interactions. Instead, it will learn how the assistant spoke, and it will take on the same style.
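As a rough sketch of that layout in the documented "messages" format: each block of a sales conversation becomes one training element, with the customer mapped to the user role and the salesperson to the assistant role. The system prompt, the dialogue text, and the helper names `make_sample` and `to_jsonl` are all invented for illustration.

```python
import json

# Hypothetical system prompt; use whatever your application will actually send.
SYSTEM = "You are a friendly, knowledgeable sales assistant."
ROLE_FOR = {"customer": "user", "salesperson": "assistant"}


def make_sample(turns, system=SYSTEM):
    """Build one training element from (speaker, text) pairs."""
    messages = [{"role": "system", "content": system}]
    for speaker, text in turns:
        messages.append({"role": ROLE_FOR[speaker], "content": text})
    return {"messages": messages}


def to_jsonl(samples):
    """Serialize samples as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(s, ensure_ascii=False) for s in samples)


# Invented dialogue, one block per stage of the sale.
introduction = [
    ("customer", "Hi, I'm looking for a laptop for video editing."),
    ("salesperson", "Welcome! Happy to help. What software do you use?"),
]
mid_game = [
    ("customer", "Mostly editing on the go. Is battery life a problem?"),
    ("salesperson", "Good question. Let's compare two models side by side."),
    ("customer", "The lighter one sounds better."),
    ("salesperson", "It travels well. Want to hear about the warranty?"),
]
closing = [
    ("customer", "OK, I'll take it."),
    ("salesperson", "Great choice! I'll get the paperwork started."),
]

samples = [make_sample(block) for block in (introduction, mid_game, closing)]
print(to_jsonl(samples))
```

Because `json.dumps` writes each sample on a single line, the printed text can be saved directly as a .jsonl training file.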
