Edit: Seems conclusively impossible to do what I’m after at the moment.
Hi, I’m trying to fine-tune GPT-3.5-turbo to understand a specific syntax in generated input. I want to be able to include examples of poor output in the dataset, as well as good examples. Previously, with prompting, I’ve done this by creating a conversation history that mimics ChatGPT, describing what is good and bad. But when fine-tuning, it seems the best approach is instead to have one prompt and one response rather than a conversation, and to make up for this with a large dataset.
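For reference, my understanding from the docs is that each training example is then a single JSONL line of the following shape (the content values here are just placeholders):
{"messages": [{"role": "system", "content": "<system message>"}, {"role": "user", "content": "<input>"}, {"role": "assistant", "content": "<desired output>"}]}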
Despite this, I’ve been getting a lot of invalid completions. I was wondering what approach others have had the most success with for fine-tuning with negative examples. Here are some of the prompts I’ve been playing with.
The system message before all examples in the fine-tuning file:
<removed>
Then I currently have a list of positive examples that match this rule, like:
<removed>
Which would be valid.
But I want to include negative cases in my training data (so whenever the current model is wrong, I can explain why, and it doesn’t make those mistakes again).
What approach works best for this? My first thought was something like:
Positive training:
{ role: "user", content: `{"a",1},{"b","c",2},["d",3]` },
{ role: "assistant", content: `The {"c",2} jumped over the {"a",1} which led to ["d",3]`}
{ role: "user", content: "Correct, that is valid"}
Negative training:
{ role: "user", content: `{"a",1},{"b","c",2},["d",3]` },
{ role: "assistant", content: `The {"c",2} jumped over the {"a",77} which led to ["d",3]`}
{ role: "user", content: "Incorrect, that is invalid. {"a",77} should be {"a",1},` }
But I’m unsure whether this fine-tunes the model to produce correct output, or whether it fine-tunes it to classify whether its output is bad (without altering its behaviour to prefer good output). In other words: is the model happy to make a mistake, knowing the user will correct it in the next response, or does it actually care that the user said something negative after its response and try to avoid that in real use?
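One restructuring I’ve been wondering about, on the assumption that fine-tuning only learns from the assistant messages: put the input and the flawed output into the user turn, and make the correction the assistant’s response, so the behaviour being reinforced is the corrected output rather than the mistake. A rough sketch:
{ role: "user", content: `Input: {"a",1},{"b","c",2},["d",3]
Previous output: The {"c",2} jumped over the {"a",77} which led to ["d",3]
That was invalid because {"a",77} should be {"a",1}. Give the corrected output.` },
{ role: "assistant", content: `The {"c",2} jumped over the {"a",1} which led to ["d",3]` }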
Also, FYI, if anyone in charge of the docs reads this: the JSON example under structured input in the docs here seems invalid (the quotes aren’t escaped correctly): OpenAI Platform