I used a 3,700-line JSONL dataset to fine-tune gpt-4o-mini-2024-07-18 with the default hyperparameters.
Trained tokens: 2,091,327
Epochs: 3
Batch size: 7
LR multiplier: 1.8
Seed: 1658117049
Based on your experience, how should I go about iterating on the hyperparameters?
My tests currently produce okay results, but not precise enough.
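For context, here is a minimal sketch of how a follow-up job with explicit hyperparameters (instead of the auto defaults) could be launched via the Python SDK; the training file ID is a placeholder, and the specific values are only illustrative:

```python
from openai import OpenAI

client = OpenAI()

# Sketch only: "file-abc123" stands in for the uploaded JSONL training file ID.
job = client.fine_tuning.jobs.create(
    training_file="file-abc123",
    model="gpt-4o-mini-2024-07-18",
    hyperparameters={
        "n_epochs": 4,                   # e.g. one more than the auto-chosen 3
        "batch_size": 7,
        "learning_rate_multiplier": 1.8,
    },
)
print(job.id, job.status)
```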
I've already looked at the "Iterating on data quality" guidance:
Collect examples to target remaining issues
Scrutinize existing examples for issues
Consider the balance and diversity of data
Make sure your training examples contain all of the information needed for the response
Look at the agreement / consistency in the training examples
What tends to bite me with classification is severely imbalanced labels, which I then need to generate more training data to compensate for.
But you didn't mention what your fine-tune is doing. Is it a classifier of sorts?
Thanks for your input Kurt.
The AI was tasked with analyzing, understanding, and identifying every topic and subtopic across the whole Deno documentation and generating JSONL entries accordingly, so I am confident the labels are fairly well balanced and distributed. There are 3,700 lines!
When I ask ChatGPT with the new search feature "What is Deno 2 backward compatible with?", it answers perfectly well, but my fine-tuned 4o-mini gives a vague, albeit correct, answer, even though the dataset contains a line with the very answer:
{"messages": [{"role": "system", "content": "You are a helpful assistant, expert in web development and the Deno JS runtime environment. You answer professionally and concisely, always double-checking your responses for accuracy."}, {"role": "user", "content": "Is Deno 2 backwards compatible with Node.js and npm?"}, {"role": "assistant", "content": "Yes, Deno 2 is backwards compatible with Node.js and npm. This means you can run existing Node.js applications using Deno 2, allowing for seamless integration and the ability to leverage Deno's features in your current projects."}]}
Is that line too long? Do I have to double the number of epochs?
You may be better off with a robust RAG solution that allows you to stay on top of changes in the documentation. This can be done in many ways, for example by hashing the page contents (see the sketch at the end of this post).
Think about it: each time the documentation updates, you will have to fine-tune the model all over again.
It would be very difficult to beat SearchGPT, as it can always find the most up-to-date information.
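A minimal sketch of the hashing idea, assuming the doc pages are fetched as plain text and the chunking/embedding happens elsewhere; the URLs and the local hash store are placeholders, not a prescribed setup:

```python
import hashlib
import json
import pathlib

import requests

HASH_STORE = pathlib.Path("doc_hashes.json")  # hypothetical local hash store
PAGES = [
    "https://docs.deno.com/runtime/",   # placeholder list of doc pages
    "https://docs.deno.com/examples/",
]

known = json.loads(HASH_STORE.read_text()) if HASH_STORE.exists() else {}
changed = []

for url in PAGES:
    body = requests.get(url, timeout=30).text
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    if known.get(url) != digest:        # new page, or content changed since last crawl
        changed.append(url)
        known[url] = digest

HASH_STORE.write_text(json.dumps(known, indent=2))
# `changed` now lists the pages to re-chunk and re-embed in the RAG index.
print(f"{len(changed)} page(s) need re-embedding")
```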
The last third of the training looks like the loss isn't going anywhere, but if you simply need more obedience and overfitting, you can submit another job with your fine-tuned model as the input model name, perhaps with another 2 epochs, and it will continue training from the existing fine-tune to deepen the weights.
Remember that you must reuse the system message, and inputs as similar as possible to what you trained on, to activate your training; you can't expect the same quality if you write a completely different message. You are ultimately training a small model that only performs "chat" at all because of OpenAI's own extensive training.
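If it helps, a minimal sketch of such a continuation job via the Python SDK; the file ID and the ft: model name are placeholders for your own values:

```python
from openai import OpenAI

client = OpenAI()

# Continue training from the existing fine-tune rather than the base model.
# Both IDs below are placeholders.
job = client.fine_tuning.jobs.create(
    training_file="file-abc123",                        # same training JSONL
    model="ft:gpt-4o-mini-2024-07-18:my-org::abc123",   # previous fine-tune
    hyperparameters={"n_epochs": 2},
)
print(job.id, job.status)
```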
Thanks, I will launch a new fine-tuning session then.
Re the system message: you mean that the second tuning should basically use the same system message as in the first dataset? If so, yes, I would like to reuse the dataset, so same everything.
There are IDEs that have this feature built-in, like Cursor.
Cursor comes with a set of third party docs crawled, indexed, and ready to be used as context. You can access them by using the @Docs symbol.
Add Custom Docs
If you want to crawl and index custom docs that are not already provided, you can do so via @Docs > Add new doc. A modal will appear after you've pasted in the URL of your desired doc.
Yes, the system prompt should be seen more as an activation of your training against what already exists, rather than an instruction, although instruction-following is already there.
If your input and output were completely un-chat-like, you'd want to depart even further from an instruction that says "you are ChatGPT".
When you use an existing fine-tune to create a new tuned model on the same data file, the result is similar to having specified more epochs from the start, without having to pay for the whole thing over again at a higher token expense.
Indeed, I'm familiar with this; I use aider in Zed. Blazing fast.
The issue with adding docs "on the fly" is the token consumption, hence my tuning... or RAG, if that would be efficient.
The model's quality got worse.
Fine-tuning sequentially (one dataset after the other) can lead to what's known as catastrophic forgetting. This occurs when the model starts to overwrite knowledge from the first fine-tuning during the second, diminishing its performance on the initial data.
Quite a cold shower for my first try at fine-tuning with a 3,700-entry dataset.
I find the fine-tuning method quite archaic and prehistoric, tbh. It reminds me of the early days of the internet, with the modem and its cumbersome setup.