Fine-tuning GPT-3.5 while incorporating human feedback

I am developing a fine-tuned model for a client (a lead-gen agency) to help them write emails for their clients by learning the unique style, tone, and structure of their existing content.

I have fine-tuned a model and the outputs are probably 60% of the way there, but they need some improvement. I have built an application that allows the client to rate the outputs, provide feedback, and save the final versions they use.

How can I incorporate this into a new fine-tuning process so the model reinforces the feedback they provide and learns from the differences between the initial AI output and their final version?

Practically, is it possible to fine-tune a model using not only:
‘system’, ‘user’, ‘assistant’ messages
but a ‘system’, ‘user’, ‘feedback’, ‘final output’ type of approach?

Would this work, or do fine-tuned models not learn in this way?

Any help would be much appreciated.

Hi - your suggested approach is not possible.
You can continue to fine-tune an already fine-tuned model. However, the dataset needs to be consistent with the standard conventions and must therefore follow the system, user, and assistant messages approach.
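To make the fixed convention concrete, here is a minimal sketch of one training record in the required chat format, serialized as a single JSONL line (the email content and prompts are illustrative, not from this thread):

```python
import json

# A fine-tuning record must use the fixed "messages" convention:
# system / user / assistant roles only. All content below is made up
# purely to show the shape of one record.
record = {
    "messages": [
        {"role": "system", "content": "You write outreach emails in the agency's voice."},
        {"role": "user", "content": "Write a follow-up email to a SaaS founder."},
        {"role": "assistant", "content": "Hi Alex, just circling back on my last note..."},
    ]
}

# Each record becomes exactly one line of the JSONL training file.
line = json.dumps(record)
print(line)
```

Roles like ‘feedback’ or ‘final output’ have no place in this schema, which is why the proposed four-role format is rejected by the fine-tuning pipeline.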


The feedback you are collecting represents additional potential training items.
If these new samples are very close to the initial data used for fine-tuning, then it makes sense to continue training the existing model.

If, however, you find that there are differences, you could consider starting fine-tuning from scratch with a higher-quality dataset.

You could also try sharing some examples and the settings you used for fine-tuning.


As mentioned above by @jr.2509, the conventions for the input to the fine-tuning setup are fixed, so altering them is not possible. However, there is another approach you could apply.

As an additional step to incorporate the feedback into the fine-tuning dataset, you could separate out the generations that received feedback (negative feedback specifically) and run another script that generates a “golden sample” by rewriting the email with the feedback incorporated.

You could then use this output as a sample in your training dataset.
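A minimal sketch of that “golden sample” step, assuming each saved item records the original output, the reviewer’s feedback, and a rating (the field names, sample data, and prompt wording are all hypothetical):

```python
import json

def build_rewrite_messages(original_email: str, feedback: str) -> list[dict]:
    """Build the chat messages one could send to a model so it rewrites the
    email with the reviewer feedback applied, producing a 'golden sample'.
    Prompt wording here is illustrative, not a fixed API requirement."""
    return [
        {"role": "system", "content": "Rewrite the email so it fully addresses "
                                      "the reviewer feedback, keeping the original tone."},
        {"role": "user", "content": f"Email:\n{original_email}\n\nFeedback:\n{feedback}"},
    ]

# Hypothetical items exported from the rating application.
samples = [
    {"email": "Hey! Buy now!!", "feedback": "Too pushy; soften the ask.", "rating": -1},
    {"email": "Hi Sam, loved your post...", "feedback": "", "rating": 1},
]

# Keep only the generations that received negative feedback.
negative = [s for s in samples if s["rating"] < 0]
payloads = [build_rewrite_messages(s["email"], s["feedback"]) for s in negative]
print(json.dumps(payloads[0], indent=2))
```

Each rewritten email would then be paired with its original prompt as the assistant message of a new training record.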

Thanks so much for your responses!

What I think you’re suggesting is that the only way to fine-tune a model is to provide positive examples for it to learn from; reinforcing behaviours with negative examples or descriptive feedback isn’t supported in the current fixed setup.

So, to incorporate the negative feedback, we’d have to turn it into a positive example for the model to learn from, rather than explaining why the original output was bad.
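Since the application already saves the final versions the client actually uses, the most direct positive example pairs the original request with that human-edited final email. A sketch, with hypothetical prompt and email content:

```python
import json

def to_training_record(system_prompt: str, user_request: str, final_email: str) -> str:
    """Turn one reviewed item into a JSONL fine-tuning line: the human-edited
    final email becomes the assistant target, i.e. a positive example."""
    record = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_request},
            # The saved final version, not the raw model output:
            {"role": "assistant", "content": final_email},
        ]
    }
    return json.dumps(record)

line = to_training_record(
    "You write outreach emails in the agency's voice.",        # illustrative
    "Write a follow-up email to a SaaS founder.",              # illustrative
    "Hi Alex, just following up on my note from last week...", # illustrative
)
print(line)
```

The model never sees the feedback text itself; it only learns the corrected target behaviour.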