Is this fine-tuning approach right?

I’m using gpt-4o + RAG to generate documents the way they’re made in my company. It’s working well, but I need it to achieve better results, so I’m fine-tuning it. These are the kinds of entries I’m writing in my .jsonl:

When the model replied correctly with a well-made document:

{"role": "user", "content": "Make me a document within this info"},{"role": "assistant", "content": "Right document"}

When the model failed:

{"role": "user", "content": "Make me a document within this info ..."},{"role": "assistant", "content": "Wrong document"},{"role": "user", "content": "You failed in the following points: .... Make it again"},{"role": "assistant", "content": "Right document"}

Is this how I’m supposed to do it?

Hi, there’s too little background here to say anything meaningful. At first glance, it looks like the task is too complex to handle in one run. A better design of the workflow would probably help, but as I said, there’s too little info (input size, input format, type of document, output format, length, complexity, etc.) to help.


Sorry, I’m actually looking for a more generic answer. Regardless of the task, I just want to confirm whether this is the correct way to build a fine-tuning job, or whether the “I reply when it’s wrong and let it be when it’s okay” approach is flawed 🙂 Thank you for replying!


As a more general approach, all of those tools are good: RAG, fine-tuning, assistants, coding…

The question is whether they’re suited to your specific goal, and here the goal is not clear. From the little snippets I see, the examples don’t even fit into a single step, so fine-tuning on them as-is would basically break your application. But without more information about what you’re trying to achieve, it’s like writing on water with a stick.

  1. Step/operation one: generate a great document.
  2. Step/operation two: grab the evaluation criteria from settings that define what is good vs. what is wrong.
  3. Step/operation three: evaluate the generated document against those criteria.
  4. Step/operation four: read the evaluation response and either reply with “wrong” or pass the document further down the flow.

#1 and #3 work better when fine-tuning is involved (in #1, RAG + code over multiple steps also helps the fine-tuned model).

#2 - code + RAG (optional)
#4 - code
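A minimal sketch of how those four steps could be wired together, assuming the OpenAI Python SDK; the model names, the criteria.txt file, and the PASS/FAIL convention for the evaluator are all placeholders, not something from the original post:

```python
from openai import OpenAI

client = OpenAI()

def generate_document(info: str) -> str:
    # Step 1: generate the document (placeholder fine-tuned model id).
    resp = client.chat.completions.create(
        model="ft:gpt-4o-2024-08-06:my-org::abc123",
        messages=[{"role": "user", "content": f"Make me a document with this info: {info}"}],
    )
    return resp.choices[0].message.content or ""

def load_criteria() -> str:
    # Step 2: plain code (optionally backed by RAG) that fetches the evaluation criteria.
    with open("criteria.txt", encoding="utf-8") as f:
        return f.read()

def evaluate_document(document: str, criteria: str) -> str:
    # Step 3: a second (possibly fine-tuned) model judges the document.
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder evaluator model
        messages=[
            {"role": "system", "content": "Answer only PASS or FAIL: <reason>."},
            {"role": "user", "content": f"Criteria:\n{criteria}\n\nDocument:\n{document}"},
        ],
    )
    return resp.choices[0].message.content or ""

def run(info: str, max_retries: int = 2) -> str:
    # Step 4: plain code routes the result — retry on FAIL, otherwise pass the doc on.
    criteria = load_criteria()
    document = generate_document(info)
    for _ in range(max_retries):
        verdict = evaluate_document(document, criteria)
        if verdict.strip().upper().startswith("PASS"):
            break
        document = generate_document(f"{info}\n\nPrevious attempt failed: {verdict}")
    return document
```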