Is this fine-tuning approach right?

I’m using gpt-4o + RAG to generate documents the way they’re written at my company. It’s working well, but I need better results, so I’m fine-tuning it. These are the kinds of entries I’m writing in my .jsonl:

When the model replied correctly with a well-made document:

{"role": "user", "content": "Make me a document within this info"},{"role": "assistant", "content": "Right document"}

When my model fails:

{"role": "user", "content": "Make me a document within this info ..."},{"role": "assistant", "content": "Wrong document"},{"role": "user", "content": "You failed in the following points: .... Make it again"},{"role": "assistant", "content": "Right document"}
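For reference, here is a minimal sketch of how the two kinds of entries above could be written out as complete JSONL lines. It assumes the chat fine-tuning format, where each line is a single JSON object with a "messages" list. The helper names (make_example, to_jsonl) are made up for illustration. The "weight": 0 on the earlier assistant turn is my assumption about how to keep the wrong document out of the training loss, so check the fine-tuning docs before relying on it.

```python
import json

def make_example(turns, mask_earlier_replies=True):
    """Wrap (role, content) turns in one {"messages": [...]} training example.

    Hypothetical helper: if mask_earlier_replies is True, every assistant
    turn except the last gets "weight": 0 (assumed to exclude it from the
    loss), so only the final, corrected document is learned.
    """
    messages = [{"role": role, "content": content} for role, content in turns]
    if mask_earlier_replies:
        assistant_positions = [
            i for i, m in enumerate(messages) if m["role"] == "assistant"
        ]
        for i in assistant_positions[:-1]:
            messages[i]["weight"] = 0
    return {"messages": messages}

def to_jsonl(examples):
    """Serialize examples as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(e, ensure_ascii=False) for e in examples)

# Case 1: the model got it right on the first try.
good = make_example([
    ("user", "Make me a document within this info ..."),
    ("assistant", "Right document"),
])

# Case 2: the model failed, was corrected, then got it right.
corrected = make_example([
    ("user", "Make me a document within this info ..."),
    ("assistant", "Wrong document"),
    ("user", "You failed in the following points: .... Make it again"),
    ("assistant", "Right document"),
])

jsonl_text = to_jsonl([good, corrected])
print(jsonl_text)
```

Without some kind of masking, a multi-turn entry like the second one would also train the model to produce the wrong document as its first answer, which is the main thing to watch for with this approach.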

Is this how I’m supposed to do it?

Hi, there’s too little background here to say anything meaningful. At first glance, the task looks too complex to handle in a single run. A better workflow design would probably help, but as I said, there’s too little info (input size, format, type of doc, length, complexity, etc.) to give useful advice.


Sorry, I’m actually looking for a more generic solution. Regardless of the task, I just want to confirm whether this is the correct way to perform a fine-tuning job, or whether the ‘I reply when it’s wrong and let it be when it’s okay’ approach is flawed :) Thank you for replying!
