Correct format for dataset in chat model fine-tuning

I have a dataset containing long conversation transcripts, each with about 100-200 conversational turns. I’m planning to use this for fine-tuning a chat model (GPT-3.5-turbo).

Unfortunately, I couldn’t find examples of multi-turn conversations in the documentation. Since the documentation does not clearly state the preferred method, and both approaches seem feasible, I’d like your opinion on the most appropriate approach.

Here are the approaches I’m considering:

Approach 1: Add each conversational turn as a separate example. This method ensures that the training examples closely resemble the requests I’ll be sending to OpenAI, as each conversation starts with the first message and expands over time.

Training Example 1: First response from the chatbot


System message: [...]

Assistant: Hi

User: Hello

Assistant: How can I help you?

Training Example 2: Second response from the chatbot to the same conversation


System message: [...]

Assistant: Hi

User: Hello

Assistant: How can I help you?

User: I need help with my homework

Training Example 3: Third response from the chatbot to the same conversation


System message: [...]

Assistant: Hi

User: Hello

Assistant: How can I help you?

User: I need help with my homework

Assistant: Sure. What seems to be the problem?

Approach 2: Include the entire conversation as a single training example. I assume this approach might not be ideal because, in production, the conversation starts with the first user message. By submitting the entire conversation at once, examples where the conversation is just beginning are not provided.


System message: [...]

Assistant: Hi

User: Hello

Assistant: How can I help you?

User: I need help with my homework

Assistant: Sure. What seems to be the problem?

What are your thoughts? Which approach would be best for fine-tuning the chat model?

2 Likes

As each call to the fine-tuned function would be a single User, Assistant pair, it would make sense to go with the first approach, though that would not help the model understand what context is in this conversation and use that as some part of the response to the assistant.

I am sorry I didn’t understand you. Each call will have all the messages in the conversation. And it will grow with each message.

I would appreciate any responses on this. Thank you!

I had the same confusion and found an answer here: Fine-tuning with conversation format: Which messages are used for training? , I’m afraid we might need to take the first approach at this time despite we provided it as a message list. Please note this is from a cookbook’s comment and not guaranteed to be true, I didn’t find any references in the official docs, TBH the fine tuning API is so poorly documented.