Fine-tuning data format for chat history

I am fine-tuning a GPT-3.5 model for a conversational meal-search app, so the model needs to keep track of the context of what I am talking about.
My question is: does the OpenAI API train on every assistant response in a messages item as an output example, or does it train only on the last assistant response and treat all the previous messages as input context?

For example, in the first case the whole conversation goes in a single messages item.

JSONL format:

{"messages":[
  {"role":"user","content":"i would like to eat biryani"},
  {"role":"assistant","content":"Do you have a specific type of biryani in mind or do you want me to search for the general biryani item?"},
  {"role":"user","content":"all items"},
  {"role":"assistant","content":"sure let me do this", "function_call":...},
  {"role":"function","content":"tool call response ... "},
  {"role":"assistant","content":"Here are some of the delighted biryani items including ..... do you want to try chinese items as they are also rice base option??"}
]}

For the second case, the same conversation is split into one example per assistant response:

{"messages":[
  {"role":"user","content":"i would like to eat biryani"},
  {"role":"assistant","content":"Do you have a specific type of biryani in mind or do you want me to search for the general biryani item?"}
]}
{"messages":[
  {"role":"user","content":"i would like to eat biryani"},
  {"role":"assistant","content":"Do you have a specific type of biryani in mind or do you want me to search for the general biryani item?"},
  {"role":"user","content":"all items"},
  {"role":"assistant","content":"sure let me do this", "function_call":...}
]}
{"messages":[
  {"role":"user","content":"i would like to eat biryani"},
  {"role":"assistant","content":"Do you have a specific type of biryani in mind or do you want me to search for the general biryani item?"},
  {"role":"user","content":"all items"},
  {"role":"assistant","content":"sure let me do this", "function_call":...},
  {"role":"function","content":"tool call response ... "},
  {"role":"assistant","content":"Here are some of the delighted biryani items including ..... do you want to try chinese items as they are also rice base option??"}
]}

So which format is better for training the model to understand the context and respond accordingly?

Hi and welcome!

When you fine-tune a model on multi-turn conversations, you typically treat one conversation chain as one training example. By default, every assistant message in the conversation is used as a training target during fine-tuning, with the preceding messages serving as context. This is basically equivalent to your first example above.
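As a rough illustration of the "one conversation, one training example" idea, here is a minimal Python sketch that serializes a whole conversation as a single JSONL line. The conversation content is adapted from the question, the function-call turns are left out because their payload is elided in the original post, and the helper name to_jsonl_line is made up for this example. Note that a JSONL file needs each example on one physical line, so the pretty-printed layout shown in the question would have to be collapsed before uploading.

```python
import json

# Conversation adapted from the question. The function-call turns are omitted
# because their payload is elided ("...") in the original post.
conversation = [
    {"role": "user", "content": "i would like to eat biryani"},
    {"role": "assistant", "content": "Do you have a specific type of biryani in mind?"},
    {"role": "user", "content": "all items"},
    {"role": "assistant", "content": "Here are some biryani items ..."},
]


def to_jsonl_line(messages):
    """Serialize one whole conversation as a single JSONL training example.

    Hypothetical helper for illustration: json.dumps without an indent
    argument never emits raw newlines, so the result is one physical line.
    """
    return json.dumps({"messages": messages}, ensure_ascii=False)


line = to_jsonl_line(conversation)

# One line per conversation; append more lines for more conversations.
with open("train.jsonl", "w", encoding="utf-8") as f:
    f.write(line + "\n")
```

With this layout both assistant replies sit in the same example, so there is no need to repeat the conversation prefix once per assistant turn as in the second case.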

You can also see the logic here in the OpenAI materials.

I haven't tried this myself yet, but you can also assign weights to specific messages to select which ones are considered during fine-tuning. This is also covered under the link.
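For what it's worth, here is an untested sketch of how the weighting could look in Python. It assumes the per-message "weight" field (0 or 1, set on assistant messages) as I understand it from the OpenAI fine-tuning guide, so please check the current documentation before relying on it. The helper name mask_all_but_last_assistant is made up for this example. It reproduces the "only train on the last assistant response" behaviour from the question without splitting the conversation into several examples.

```python
import json


def mask_all_but_last_assistant(messages):
    """Return a copy where only the final assistant message has weight 1.

    Assumption: the fine-tuning API accepts "weight": 0 or 1 on assistant
    messages, and messages with weight 0 are kept as context but not trained on.
    """
    assistant_positions = [i for i, m in enumerate(messages) if m["role"] == "assistant"]
    if not assistant_positions:
        raise ValueError("conversation has no assistant message to train on")
    last = assistant_positions[-1]

    weighted = []
    for i, message in enumerate(messages):
        message = dict(message)  # shallow copy so the input is not modified
        if message["role"] == "assistant":
            message["weight"] = 1 if i == last else 0
        weighted.append(message)
    return weighted


example = [
    {"role": "user", "content": "i would like to eat biryani"},
    {"role": "assistant", "content": "Do you have a specific type of biryani in mind?"},
    {"role": "user", "content": "all items"},
    {"role": "assistant", "content": "Here are some biryani items ..."},
]

weighted_example = mask_all_but_last_assistant(example)
print(json.dumps({"messages": weighted_example}))
```

If you want the model to learn every assistant turn, which is the default, you can leave the weights out entirely.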


How did I miss that?
Thanks, much appreciated!
