Hello everyone,
I’m facing an issue with fine-tuning models sequentially, and I could really use some help or insights. Here’s the process I’ve followed:
- First Snapshot: I started with an initial model and fine-tuned it using a specific system message and training data.
- Second Snapshot: Using the first snapshot as the base model, I fine-tuned it again with additional data, keeping the same context.
- Third Snapshot: I continued this process, taking the second snapshot as the base model and fine-tuning it further to create a third snapshot.
- Fourth Snapshot: This time, I took the third snapshot as the base model but changed the system message and context for training with a new set of data.
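For clarity, here is a minimal sketch of the chaining described above. The model IDs, file IDs, and hyperparameter values are placeholders I've made up for illustration; in the real API each job's resulting model ID is returned by the fine-tuning job itself.

```python
# Hypothetical sketch of the four-snapshot chain; every ID below is a
# placeholder, not a real model or file ID.
def build_job_payloads(base_model, steps):
    """Build the sequence of fine-tuning job payloads, where each job
    uses the model produced by the previous job as its base."""
    payloads = []
    current_base = base_model
    for step in steps:
        payloads.append({
            "model": current_base,                  # previous snapshot as base
            "training_file": step["training_file"],  # uploaded JSONL file ID
            "hyperparameters": {"n_epochs": 3, "batch_size": 1},
        })
        # In the real API this ID comes back from the finished job;
        # here it is faked to show the chaining.
        current_base = step["resulting_snapshot"]
    return payloads

steps = [
    {"training_file": "file-snap1", "resulting_snapshot": "ft:snapshot-1"},
    {"training_file": "file-snap2", "resulting_snapshot": "ft:snapshot-2"},
    {"training_file": "file-snap3", "resulting_snapshot": "ft:snapshot-3"},
    {"training_file": "file-snap4", "resulting_snapshot": "ft:snapshot-4"},
]
jobs = build_job_payloads("gpt-3.5-turbo", steps)
print(jobs[3]["model"])  # the fourth job trains on top of the third snapshot
```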
Now, here’s the problem: when I query the model using the contexts and respective system messages from the first three snapshots, the fourth snapshot’s training (with its different system message and context) seems to overwrite the behavior learned in the earlier rounds.
I was expecting the model to retain the distinct context and system messages from each snapshot, but it appears the latest training is influencing responses across all contexts, not just the one it was trained on.
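Part of what I'm wondering is whether this expectation is even valid, since the system message is baked into every training example rather than stored as a separate context. A sketch of the chat fine-tuning data format (the messages below are placeholders, not my actual training data):

```python
import json

# Each JSONL training line embeds the system message alongside the user
# and assistant turns, so every round of fine-tuning updates the same
# shared weights rather than creating an isolated per-snapshot "context".
def make_example(system_msg, user_msg, assistant_msg):
    return json.dumps({
        "messages": [
            {"role": "system", "content": system_msg},
            {"role": "user", "content": user_msg},
            {"role": "assistant", "content": assistant_msg},
        ]
    })

line = make_example("You are assistant A.", "Hi", "Hello from A.")
record = json.loads(line)
print(record["messages"][0]["content"])  # → You are assistant A.
```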
Has anyone else encountered this issue, or does anyone have suggestions on how to maintain separate contexts for each snapshot?
Model type: gpt-3.5-turbo
Epochs: 3
Batch size: 1
Any advice or pointers would be greatly appreciated!
Thank you!