I’m fine-tuning GPT-4-mini to handle function-calling and conversational tone together.
What is the correct way to prepare the training data?
Some questions:
- Should `function_call` be included inside the `assistant` message in the JSONL?
- Is it fine if the `system` prompt varies between samples during fine-tuning? (In production it will be fixed.)
- Should I use full conversations (longer flow) or short direct examples?
- Is it better to mix tone-training and function-calling in the same conversations, or keep them separate?
[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Can you get me the weather for New York?"},
  {
    "role": "assistant",
    "function_call": {
      "name": "getWeather",
      "arguments": "{\"location\": \"New York\"}"
    }
  }
]
Is this the proper way? Or should I format it differently?
Any best practices for this kind of fine-tuning?
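
For context, here is a minimal sketch of how the sample above could be written out as one line of a JSONL file. It is only an illustration of what I mean, not a confirmed format: the top-level `messages` wrapper key and the helper name `to_jsonl_line` are my assumptions, and whether `function_call` belongs in the assistant message at all is exactly what I'm asking.

```python
import json

# The sample conversation from the question, unchanged.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Can you get me the weather for New York?"},
    {
        "role": "assistant",
        "function_call": {
            "name": "getWeather",
            # The arguments are a JSON-encoded string, not a nested object.
            "arguments": json.dumps({"location": "New York"}),
        },
    },
]


def to_jsonl_line(sample_messages):
    """Serialize one training sample as a single line.

    Assumption: each sample is wrapped in a top-level "messages" key.
    """
    return json.dumps({"messages": sample_messages})


line = to_jsonl_line(messages)
print(line)
```

Each training sample would become one such line, so a file with N conversations has N lines.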