Issues with Overwriting Context in Sequential Model Fine-Tuning

Hello everyone,

I’m facing an issue with fine-tuning models sequentially, and I could really use some help or insights. Here’s the process I’ve followed:

  1. First Snapshot: I started with an initial model and fine-tuned it using a specific system message and training data.
  2. Second Snapshot: Using the first snapshot as the base model, I fine-tuned it again with additional data, keeping the same context.
  3. Third Snapshot: I continued this process, taking the second snapshot as the base model and fine-tuning it further to create a third snapshot.
  4. Fourth Snapshot: This time, I took the third snapshot as the base model but changed the system message and training context, using a new set of data. (A rough sketch of how I chain these jobs follows this list.)
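
To make the setup concrete, here's roughly how I'm chaining the jobs with the OpenAI Python SDK (the file names and model IDs below are placeholders for my actual values):

```python
from openai import OpenAI

client = OpenAI()

# Upload the training data for this round (placeholder file name).
training_file = client.files.create(
    file=open("snapshot1_data.jsonl", "rb"),
    purpose="fine-tune",
)

# First snapshot: fine-tune the base model.
job = client.fine_tuning.jobs.create(
    model="gpt-3.5-turbo",
    training_file=training_file.id,
    hyperparameters={"n_epochs": 3, "batch_size": 1},
)

# For each later snapshot I repeat the same call, passing the previous
# snapshot's fine-tuned model ID (something like
# "ft:gpt-3.5-turbo-0125:my-org::snapshot1", a placeholder) as `model`,
# along with the next training file.
```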

Now, here's the problem: when I query the latest (fourth) snapshot using the contexts and respective system messages from the first three rounds of training, its new system message and context seem to have overwritten the earlier ones.

I was expecting the model to retain the distinct contexts and system messages from each snapshot, but it appears the latest training is influencing responses across all contexts, not just the one it was trained on.
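
For reference, this is roughly how I'm fetching responses (the model ID is a placeholder for my fourth snapshot, and the system message is the one from the first snapshot's training data):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="ft:gpt-3.5-turbo-0125:my-org::snapshot4",  # placeholder ID
    messages=[
        # System message used when training the *first* snapshot.
        {"role": "system", "content": "You are assistant A for context A."},
        {"role": "user", "content": "A question context A used to handle well."},
    ],
)
print(response.choices[0].message.content)
```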

Has anyone else encountered this issue, or does anyone have suggestions on how to maintain separate contexts for each snapshot?

Model type: gpt-3.5-turbo
Epochs: 3
Batch size: 1

Any advice or pointers would be greatly appreciated!

Thank you!

Welcome to the Forum!

I'm not 100% sure, but I believe this happens because the fourth snapshot was trained entirely on new data. Fine-tuning updates the whole model, so training only on the new context can degrade behavior learned in earlier rounds (often called catastrophic forgetting). You might get the fourth snapshot to retain both the existing and the new system message / context if you include examples for each case in its training set, along the lines of the sketch below. That said, this only makes sense if the core task is (nearly) the same in both cases.
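
For instance, a minimal sketch of mixing the two sets, assuming your training data is in chat-format JSONL files (the file names here are placeholders):

```python
import json
import random

# Combine examples from the earlier contexts with the new ones so the
# next fine-tune sees both system messages during training.
mixed = []
for path in ["old_context_data.jsonl", "new_context_data.jsonl"]:
    with open(path) as f:
        mixed.extend(json.loads(line) for line in f if line.strip())

random.shuffle(mixed)  # avoid ordering effects between the two sets

with open("mixed_training_data.jsonl", "w") as f:
    for example in mixed:
        f.write(json.dumps(example) + "\n")
```

You'd then run a fine-tuning job on the mixed file (starting from the third snapshot, or even the base model) so both contexts stay represented.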

What specifically are you fine-tuning for?