Finetuning with New Data on Existing Finetuned Model

Hello,

Before I embark on this journey and invest a significant amount of time into this, I’d really appreciate some feedback to ensure what I’m trying to achieve is possible.

I'm working on a classification problem: the input is a string of text and the output is a class label (i.e. group 0, 1, 2, 3, or 4). Both the input and the output have real meaning, and an LLM seems like an excellent fit. So far I've approached this by using GPT vector embeddings as features for a random forest model (sketched below), but I'd like to experiment with GPT fine-tuning if possible.
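For reference, here's roughly what that existing pipeline looks like. This is a minimal sketch, assuming the pre-1.0 openai Python SDK and scikit-learn; the variable names and toy data are placeholders, not from my actual project:

```python
import openai  # assumes OPENAI_API_KEY is set in the environment
from sklearn.ensemble import RandomForestClassifier

def embed(texts):
    # text-embedding-ada-002 returns one 1536-dimensional vector per input string
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return [item["embedding"] for item in resp["data"]]

train_texts = ["example input one", "example input two"]  # placeholder data
train_labels = [0, 3]                                     # classes 0-4

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(embed(train_texts), train_labels)

print(clf.predict(embed(["a new string to classify"])))
```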

Question: Once I've fine-tuned a model, can I continue to append to it? For example: first I fine-tune on 1,000 data points, then I fine-tune the same model again on the next 1,000 data points, and so on.

Question: Is a fine-tuned GPT model the right tool for the job here (a classification problem)?

I'd highly appreciate any feedback, thanks!


Hey Harrison,

I'm working on something similar! As of this post you can now continuously fine-tune the same model: just specify the existing fine-tuned model as the base model when you create the fine-tuning job.

While the results I’m seeing are mixed, I suspect strongly that lots of data will make GPT-3.5-turbo a fine-tuning beast. Good luck!

I'd also suggest prompting GPT-4 with in-context examples of what you want. That way you can build a kind of tabula rasa classifier, along the lines of the sketch below.
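Something like this, as a rough sketch (pre-1.0 openai SDK; the labels and example texts here are made up):

```python
import openai  # assumes OPENAI_API_KEY is set in the environment

# few-shot examples teach the model the label scheme in-context
few_shot = [
    {"role": "system", "content": "Classify the text into one of groups 0-4. Reply with the group number only."},
    {"role": "user", "content": "example text that belongs to group 2"},
    {"role": "assistant", "content": "2"},
    {"role": "user", "content": "example text that belongs to group 0"},
    {"role": "assistant", "content": "0"},
]

resp = openai.ChatCompletion.create(
    model="gpt-4",
    messages=few_shot + [{"role": "user", "content": "the string you want to classify"}],
    temperature=0,  # keep classification output stable
)
print(resp["choices"][0]["message"]["content"])
```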


Careful!
There are two fine-tuning APIs!
The old version supports continuing from a fine-tuned model; the new one does not.
Be sure to make the appropriate choice.
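On the old endpoint, continuation looks roughly like this (a sketch using the pre-1.0 openai SDK; the file name and fine-tuned model name are placeholders):

```python
import openai  # assumes OPENAI_API_KEY is set in the environment

# upload the next batch of prompt/completion training data (JSONL)
new_file = openai.File.create(file=open("next_1000.jsonl", "rb"), purpose="fine-tune")

# legacy /v1/fine-tunes endpoint: passing a previously fine-tuned model
# as `model` continues training from it
job = openai.FineTune.create(
    training_file=new_file["id"],
    model="curie:ft-your-org-2023-01-01-00-00-00",  # placeholder fine-tuned model name
)
print(job["id"])
```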


Thank you, I really appreciate your help! Cheers 🙂


Thank you! I think this explains why I was unable to get some of the base models like ada, davinci, etc. to work; they would just return a model error. Reading this, it looks like I should be using the old API for those models. Cheers


We plan to support fine-tuning existing fine-tuned models, but the new API does not support this yet.
