Fine-tuning a model with correct and incorrect completions

Hey everyone!
I was wondering if there is a way to provide training data for fine-tuning with both correct and incorrect completions.
For example:
{"prompt": "3 + 4", "completion": "7"}
{"prompt": "1 + 2", "completion": "not 4"}

P.S. This is a really basic example intended to illustrate my question; the data I will actually be working with is much more complicated than this.
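For reference, here is a minimal sketch of how data like the example above is typically serialized into a JSONL training file (one JSON object per line). The filename and examples are illustrative, not from any specific API:

```python
import json

# Hypothetical examples mirroring the question: one correct completion
# and one "negated" completion, in the common prompt/completion format.
examples = [
    {"prompt": "3 + 4", "completion": "7"},
    {"prompt": "1 + 2", "completion": "not 4"},
]

# JSONL: each training example is serialized as one JSON object per line.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

Note that in this format there is no field marking a completion as wrong; the file only says "given this prompt, produce this text."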

Any help would be appreciated!


Hey @m-a.schenk,
Thank you so much for responding so quickly. If my model learns to associate "3 + 4" with "7" and "1 + 2" with "not 4", does that imply the model has actually learned something from the incorrect example, or has it just learned to reply with the literal string "not 4" every time "1 + 2" is asked?
As a follow-up question: is it possible, and would it help, to attach a correctness scale to the completions in the training data?
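One caveat worth noting: plain supervised fine-tuning on prompt/completion pairs has no built-in notion of a "wrong" answer or a correctness score; the model is simply trained to reproduce whatever completion text appears, so "not 4" would just teach it to emit that string. Graded or negative signals are usually expressed instead as *preference* data, where each prompt pairs a preferred answer with a dispreferred one (as in RLHF/DPO-style training). A minimal sketch of that representation, with illustrative field names rather than any specific API's schema:

```python
import json

# Hypothetical pairwise preference data: rather than labeling a single
# completion as incorrect, each prompt carries a preferred ("chosen")
# answer and a dispreferred ("rejected") one. Field names are
# illustrative, not a specific library's required schema.
pairs = [
    {"prompt": "3 + 4", "chosen": "7", "rejected": "8"},
    {"prompt": "1 + 2", "chosen": "3", "rejected": "4"},
]

with open("preferences.jsonl", "w") as f:
    for p in pairs:
        f.write(json.dumps(p) + "\n")
```

Whether this helps depends on whether your fine-tuning pipeline supports preference-based objectives at all; the standard completion-only format does not.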