Fine-tuning seems to be designed for making the model do a certain task, e.g. specifically customer service. Now imagine an application where the model needs to know a lot of information: as a company, I want it to have extensive knowledge of my products, which I have readily available as a wiki and documentation. Or, as a personal assistant, it would be great if it knew in detail everything from the last 10 years of my diary. Is there any way I can "continue normal training" (i.e. plain next-token prediction), or a trick/best practice to "fine-tune" it on my personal/company-internal data?
This is where embeddings come in. There are many posts on this, but briefly: you keep a database in which all of your specific facts are embedded. When an input question/comment comes in, you embed it too and find the most similar embeddings in your database. Then you build a prompt from the closest matches, and let GPT-3 respond to the question/comment based on those 'facts' or phrases from your database.
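As a rough sketch of that flow, here is the retrieval step with a toy bag-of-words "embedding" standing in for a real embedding model (in practice you'd call an embeddings endpoint), and a hypothetical fact database:

```python
import math
import re
from collections import Counter

# Toy "embedding": a bag-of-words vector. In practice you would call an
# embedding model here instead; this just makes the retrieval logic runnable.
def embed(text):
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical fact database: each fact stored alongside its embedding.
facts = [
    "The Model X ships with a two-year warranty.",
    "Support tickets are answered within 24 hours.",
    "The Model X battery lasts roughly ten hours.",
]
db = [(fact, embed(fact)) for fact in facts]

def build_prompt(question, top_k=2):
    # Rank stored facts by similarity to the question, keep the top matches,
    # and paste them into the prompt for the completion model to use.
    q_vec = embed(question)
    ranked = sorted(db, key=lambda item: cosine(q_vec, item[1]), reverse=True)
    context = "\n".join(fact for fact, _ in ranked[:top_k])
    return f"Answer using these facts:\n{context}\n\nQuestion: {question}\nAnswer:"

prompt = build_prompt("How long does the Model X warranty last?")
```

The resulting `prompt` string is what you would send to the completion model; only the most relevant facts make it in, which is how the database can be far larger than the context window.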
Thank you for your response! I've read about the embeddings option, but it doesn't seem very useful for this task. If I want davinci to help with complex decisions in my life, it needs a decent understanding of past interactions I've had with various people in various contexts, and that takes far more content than fits in a short 4k-token prompt. Similarly, a company's documentation of technical products is so big, and typical questions often require so many different bits of knowledge from documentation and code across the entire codebase in combination, that the level of understanding I need goes beyond simple text-search methods that pump up the prompt volume a bit.
I was thinking of generating a fine-tune file where the model always just has to predict the next word of a sequence, and thereby gets to know the things inside the prompt and completion, like
{"prompt": "My girlfriend is ", "completion": "Anna"}
Any chance something like that works?
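To be concrete, I mean a JSONL file of such pairs in the fine-tune format, e.g. (made-up facts; whether the model reliably memorizes them this way is exactly my question):

```python
import json

# Hypothetical personal facts turned into prompt/completion pairs for a
# completion-style fine-tune (one JSON object per line, i.e. JSONL).
pairs = [
    {"prompt": "My girlfriend is ", "completion": "Anna"},
    {"prompt": "My diary entry from June 2015 says ", "completion": "we moved to Berlin."},
]

with open("personal_facts.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")
```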
I definitely understand your situation. I posted thoughts along similar lines recently here:
The logical conclusion I came up with is using a fine-tune.
For example, create one fine-tune from all your responses to questions or comments, then create a different fine-tune from all the questions and comments you generate yourself.
So basically, since a fine-tune example is a prompt/completion pair, you create one set where the 'completion' is anything you would respond with, and another set where the 'completion' is anything you would say or ask on your own.
Then you use each model separately depending on the context. And theoretically this is the virtual you!
I haven’t tested this out, but it should contain much more information than the size of the prompt window.
Thank you that’s actually a good idea! Sounds like much more work needed as compared to a more loose API, but definitely a great start!
Hello Curt!
I think your approach is very interesting. Do you have any article with more details or an implementation? I would like to understand, step by step, how I can build a virtual assistant with specific knowledge about my products and their features. I have explored the fine-tuning models and the embedding models, but I don't know how to combine them into a single solution for my objective.
What are you looking for, the AI using your knowledge, the AI using your personality, or both?
First, the AI using my knowledge. I understand that fine-tuning is useful for learning new tasks, not for learning new knowledge. I have specific knowledge about my products, such as prices, reviews, ingredients, ratings, descriptions, etc. I would like to build an AI model that responds to my clients and helps them find the best product according to the needs they describe in the prompt.
Second, if I want my personality in the chatbot, that's a fine-tuning problem, right?
Fine-tuning, IMO, is best for categorization, and you can experiment with it to create your own voice. For the voice/personality, look at my latest comment here: Extracting Personalities from past Conversations? - #6 by curt.kennedy
For 1-token categorization, I and others have so many posts on this forum about it (use the search).
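For reference, a 1-token categorization dataset is just prompt/completion pairs where every completion is a single label token. A sketch with made-up review data (the `\n\n###\n\n` separator and leading space on completions follow the usual fine-tuning conventions):

```python
import json

# Hypothetical 1-token categorization examples: each completion is a single
# short label, so the model can be queried with max_tokens=1 at inference.
examples = [
    {"prompt": "The battery died after a week.\n\n###\n\n", "completion": " negative"},
    {"prompt": "Shipping was fast and the box was intact.\n\n###\n\n", "completion": " positive"},
]

with open("categorize.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```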
Embedding, though, sounds like what you really need, since you want to add your knowledge/data/facts to the AI. My latest comment on this is here: Can this api be used to query internal data? - #12 by curt.kennedy