Fine-tuning the model for our specific use case?

In the gpt-2-simple project on GitHub, there is a “finetune” option that retrains the model’s weights on your own corpus of text. Something like:

$ gpt_2_simple finetune ~/Documents/scifi.txt

This teaches the model the statistical relationships in science-fiction text, or whatever corpus you feed it.

Is there some way to add API-key-specific knowledge to our own models? It would seem like a great way to make expert systems, or add application-specific knowledge for responses.

The amount of data wouldn’t be that large, but well over what one would want to hand in as tokens in a prompt.

I have no idea how practical this would be on the backend to maintain multiple training models at this scale though.


Hi there,

Currently, we have minimal capacity for supporting fine-tuning projects. We’re working on a self-serve fine-tuning endpoint that will make this accessible to all users, but I can’t offer any concrete timelines.

If you don’t mind me asking - what’s the use case you’re working on? Maybe we’ll be able to find a different solution for now.


Thanks. A fine-tune endpoint would be awesome. Good to know it’s in the works.

I’m still at the “wacky ideas” stage, trying to see what’s possible, so it’s totally fine if there isn’t a way to do this. But I’d like to try to make a “virtual me” chatbot: fine-tune the model on a corpus of things I’ve written (chats, emails, Slack messages, etc.) and then do rubber-duck debugging by “talking to myself,” as it were.

Current chatbot functions respond with the amalgam personality of the model’s training data. They could probably write in the style of a reasonably famous person, or be supportive and friendly, but I’d like them to emulate my own unique language style.

The only way I could think to do that would be something like the GPT-2 fine-tuning function: adjust the feature weights with a substantial text corpus of my own writing.

Just providing a few paragraphs of prompt tokens doesn’t seem sufficient.
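For what it’s worth, assembling that corpus seems simple enough. Here’s a minimal sketch (the directory layout, filenames, and the `build_corpus` helper are all hypothetical) that concatenates exported writings into the single training file a GPT-2-style finetune expects:

```python
from pathlib import Path

def build_corpus(source_dir: str, out_file: str, separator: str = "\n\n") -> int:
    """Concatenate all .txt exports (chats, emails, etc.) into one
    training file. Returns the number of source files merged."""
    sources = sorted(Path(source_dir).glob("*.txt"))  # hypothetical layout
    texts = [p.read_text(encoding="utf-8").strip() for p in sources]
    Path(out_file).write_text(separator.join(texts), encoding="utf-8")
    return len(texts)
```

Then the resulting file could be handed to the finetune command the same way as the sci-fi example above.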



Interesting idea… that might work.

I haven’t played with the Search endpoint yet, but basically: load it up with my writings as individual documents, do N completions, send each of those to Search, and rank the completion outputs by their number of search matches. The one that “sounds most like me” would have the largest number of Search hits.

Hmmm. That would likely be prohibitive in tokens, as you say, but I may try it with Ada just to see what I get. First I’d have to break my content up into search chunks.
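The ranking loop itself is simple enough to sketch offline. Here’s a rough version with a plain word-overlap scorer standing in for the actual Search endpoint call (which I haven’t tried yet); `chunk_text`, `overlap_score`, and `rank_completions` are all hypothetical names for illustration:

```python
def chunk_text(text: str, max_words: int = 100) -> list:
    """Break a writing corpus into Search-sized chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def overlap_score(chunk: str, completion: str) -> float:
    """Stand-in for a Search-endpoint relevance score: the fraction
    of the completion's words that also appear in the chunk."""
    chunk_words = set(chunk.lower().split())
    comp_words = completion.lower().split()
    if not comp_words:
        return 0.0
    return sum(w in chunk_words for w in comp_words) / len(comp_words)

def rank_completions(completions: list, corpus_chunks: list,
                     threshold: float = 0.5) -> list:
    """Rank N candidate completions by how many corpus chunks they
    'hit' -- i.e., which completion sounds most like me."""
    scored = [(c, sum(overlap_score(ch, c) >= threshold
                      for ch in corpus_chunks))
              for c in completions]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Swapping `overlap_score` for real Search-endpoint scores would keep the same ranking structure, just with the API doing the “sounds like me” judgment.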

Thanks for the idea!