Please, does anyone have extensive documentation on fine-tuning, in particular on hyperparameters? I've got 2,777 training snippets and 344 validation snippets. I'm trying to teach gpt-4o-mini my programming language, and the result is crap!
I've tried every hyperparameter tweak, but nothing changes. Regardless of what I do, the resulting model is "junk" …
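For reference, the SDK does let you set the hyperparameters explicitly when you create the job. A minimal sketch using the OpenAI Python SDK; the file IDs and the values shown are placeholders, not recommendations for this dataset:

```python
from openai import OpenAI

client = OpenAI()

job = client.fine_tuning.jobs.create(
    model="gpt-4o-mini-2024-07-18",
    training_file="file-TRAIN",    # placeholder: ID of your uploaded training JSONL
    validation_file="file-VALID",  # placeholder: ID of your uploaded validation JSONL
    hyperparameters={
        "n_epochs": 3,                  # passes over the training data
        "learning_rate_multiplier": 1.8,
        "batch_size": 8,
    },
)
print(job.id, job.status)
```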
Think of your programming language as just another language the AI has to "speak" and understand. OpenAI trained GPT-2 on 40 GB of text to produce language completions, and it is miles behind today's chat models, which are further post-trained with RLHF on millions of tasks.
So we have to think about what fine-tuning can realistically do, given that it is far more effective at shaping behavior than at imparting knowledge.
And for me, that would be fine-tuning the model to use RAG to iteratively research your language documentation: train it to call a knowledge function, then call it again, until it has the exact context needed to perform the task.
You already have a pattern of input → desired output. What could enhance this is input → tool call → tool return → tool call → tool return → output.
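To make that concrete, here is a hedged sketch of what one training line in that shape could look like in the chat fine-tuning format, written as a Python dict appended to a JSONL file; `search_docs`, "MyLang", and all content strings are invented for illustration:

```python
import json

# One hypothetical JSONL training line with a tool-calling turn in the middle.
example = {
    "messages": [
        {"role": "system", "content": "You write MyLang code. Look up the docs before answering."},
        {"role": "user", "content": "Write a loop that sums a list in MyLang."},
        {"role": "assistant", "content": None, "tool_calls": [{
            "id": "call_1", "type": "function",
            "function": {"name": "search_docs",
                         "arguments": json.dumps({"query": "loop syntax"})},
        }]},
        {"role": "tool", "tool_call_id": "call_1",
         "content": "Loops: for x in xs { ... }"},  # tool return: retrieved documentation
        {"role": "assistant", "content": "total = 0\nfor x in xs { total = total + x }"},
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "search_docs",
            "description": "Search the MyLang documentation.",
            "parameters": {"type": "object",
                           "properties": {"query": {"type": "string"}},
                           "required": ["query"]},
        },
    }],
}

with open("train.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```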
The function calls could be a synthetic yet automatic addition to your training set once you have built the knowledge base: run an AI that is turned into such a programmer by system prompt alone, instructed that it must reach a full solution through the education-by-tool. It goes off and makes the calls, getting back the documentation. The final answer (which might be good or not) is discarded, because we only need to capture the tool calls and the retrieved knowledge as an example of how production will be powered. Then insert that sequence of function calling into your training file as fine-tuning turns.
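A sketch of that harvesting loop, assuming a hypothetical `search_docs` tool, a `retrieve()` function over your knowledge base (both invented here), and the (input, desired output) pairs you already have:

```python
import json
from openai import OpenAI

client = OpenAI()
TOOLS = [{"type": "function", "function": {
    "name": "search_docs",  # hypothetical documentation-lookup tool
    "description": "Search the MyLang documentation.",
    "parameters": {"type": "object",
                   "properties": {"query": {"type": "string"}},
                   "required": ["query"]},
}}]

def harvest(user_input: str, desired_output: str, retrieve) -> dict:
    """Capture a tool-calling trace, discard the model's own final answer,
    and splice in the known-good output as the turn to be learned."""
    messages = [
        {"role": "system", "content": "Research the docs with search_docs until you can answer."},
        {"role": "user", "content": user_input},
    ]
    for _ in range(5):  # cap the research loop
        msg = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=TOOLS,
        ).choices[0].message
        if not msg.tool_calls:
            break  # the model answered; that answer is thrown away below
        messages.append({"role": "assistant", "content": None,
                         "tool_calls": [tc.model_dump() for tc in msg.tool_calls]})
        for tc in msg.tool_calls:
            query = json.loads(tc.function.arguments)["query"]
            messages.append({"role": "tool", "tool_call_id": tc.id,
                             "content": retrieve(query)})
    # Replace the discarded final answer with your known-good output.
    messages.append({"role": "assistant", "content": desired_output})
    return {"messages": messages, "tools": TOOLS}
```

Each returned dict can then be written out as one JSONL line alongside your existing input → output examples.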
You can see that in this case, fine-tuning just adds another layer of text prediction on top of what a normal AI could already produce by reading the context. That's the best use of fine-tuning I can imagine for such an uphill battle.
I have already tried this, but it produces too many errors. I've built a loop that goes through my documentation, extracts RAG data with VSS (vector similarity search), and generates training snippets, but it's not producing good enough results …
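For reference, the VSS step is roughly this (a minimal sketch using OpenAI embeddings and cosine similarity; the chunks are placeholders standing in for real documentation):

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# Placeholder documentation chunks; in practice these come from splitting the docs.
chunks = ["Loops: for x in xs { ... }", "Functions: fn name(args) { ... }"]
chunk_vecs = embed(chunks)
chunk_vecs /= np.linalg.norm(chunk_vecs, axis=1, keepdims=True)

def retrieve(query: str, k: int = 1) -> str:
    """Return the k documentation chunks most similar to the query."""
    q = embed([query])[0]
    q /= np.linalg.norm(q)
    top = np.argsort(chunk_vecs @ q)[::-1][:k]
    return "\n".join(chunks[i] for i in top)
```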
My suggestion is that your high-quality RAG, driven by a well-prompted, tool-using AI, would generate the tool call and tool return data you need to insert into your training file. Real data.