I’m trying to train a classification bot, but unlike ChatGPT, my davinci model lacks a lot of background in natural language. For example, if I ask ChatGPT “what does nib mean in this? “apple macbook pro 13.3" 8gb 512gb m1 latest model space gray myd92ll/a 2020 nib.”” it will respond with “new in box”; knowing this is critical to classifying things correctly. I believe that currently I would need to train davinci (or another base model) to have that same level of insight and then fine-tune on top of it.
Is there a way to fine-tune the base model behind ChatGPT for my own use case? If not, is there anything I can do to approximate the same level of insight from the bot?
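One way to approximate that insight without fine-tuning is to put the background knowledge directly into the prompt: a short glossary of the jargon plus a few labeled examples. A minimal sketch below; the glossary entries, labels, and `build_prompt` helper are all illustrative, not from any official API.

```python
# Sketch: supply domain knowledge in the prompt instead of via fine-tuning.
# The glossary, labels, and few-shot examples here are made up for illustration.
GLOSSARY = {
    "nib": "new in box",
    "nwt": "new with tags",
    "oem": "original equipment manufacturer",
}

FEW_SHOT = [
    ('apple macbook pro 13.3" 8gb 512gb m1 space gray myd92ll/a 2020 nib', "new"),
    ("dell xps 13 i7 16gb refurbished scratches on lid", "used"),
]

def build_prompt(listing: str) -> str:
    """Assemble a few-shot classification prompt with a jargon glossary."""
    lines = ["Classify each listing as 'new' or 'used'.", "", "Abbreviations:"]
    for abbr, meaning in GLOSSARY.items():
        lines.append(f"- {abbr} = {meaning}")
    lines.append("")
    for text, label in FEW_SHOT:
        lines.append(f"Listing: {text}")
        lines.append(f"Label: {label}")
    # The trailing "Label:" cues the model to complete with just the class.
    lines.append(f"Listing: {listing}")
    lines.append("Label:")
    return "\n".join(lines)
```

You would then send `build_prompt(your_listing)` as the completion prompt; because the model sees “nib = new in box” in context, it no longer needs that knowledge baked in via training.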
I see that text-davinci-003 seems to be the model I’m looking for, but I’m not able to fine-tune it. Why not? Is there a way around that limitation? And how does ChatGPT maintain context across a conversation? Does it incrementally fine-tune?
We’ve put together that ChatGPT has at least a 4,096-token context window, which allows it to “remember” more. I doubt they’re fine-tuning between conversations, given how compute-intensive that would be.
If you look at some of the GPT-3 chatbots on GitHub, you can see how they craft the prompt to retain at least some of the conversation’s context.
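The usual trick those chatbots use is simple: keep a rolling transcript and prepend as much of it as fits in the token budget, dropping the oldest turns first. A rough sketch, assuming a crude length heuristic in place of a real tokenizer (production code would count tokens with something like tiktoken):

```python
# Sketch of prompt-stitched conversation memory: newest turns are kept,
# oldest are trimmed once the (approximate) token budget is exceeded.

def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: ~4 characters per token.
    return max(1, len(text) // 4)

def build_chat_prompt(history, user_msg, budget=4096):
    """history: list of (speaker, text) tuples, oldest first."""
    turns = history + [("User", user_msg)]
    kept = []
    total = estimate_tokens("AI:")  # reserve room for the completion cue
    # Walk backwards so the most recent turns survive trimming.
    for speaker, text in reversed(turns):
        line = f"{speaker}: {text}"
        cost = estimate_tokens(line)
        if total + cost > budget:
            break
        kept.append(line)
        total += cost
    kept.reverse()
    kept.append("AI:")  # cue the model to produce the next reply
    return "\n".join(kept)
```

With a 4,096-token window this keeps the whole recent exchange; once the transcript outgrows the budget, early turns silently fall off, which matches the “remembers some but not all context” behavior people observe.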