How can I train my own chatbot on my own data, so that the chatbot answers with knowledge of a lot of context? Because with the ChatGPT API (gpt-3.5-turbo), even if you give it only a small amount of information, it may lose track of what the conversation started with and give inaccurate responses.
If you’re tech savvy you can check out LangChain. With its different loaders, such as the directory loader, URL loader, Notion loader, and many others, you can pull in proprietary data and work with it later on.
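To make the idea concrete, here is a minimal pure-Python stand-in for what a directory loader does (this is not LangChain’s actual `DirectoryLoader` API, just a sketch of the concept; the file names and contents are made up):

```python
import os
import tempfile

def load_directory(path, extension=".txt"):
    """Load every matching file in a directory into a list of document dicts."""
    documents = []
    for name in sorted(os.listdir(path)):
        if name.endswith(extension):
            with open(os.path.join(path, name), encoding="utf-8") as f:
                documents.append({"source": name, "text": f.read()})
    return documents

# Demo with a throwaway directory and two sample files.
with tempfile.TemporaryDirectory() as tmp:
    for name, text in [("faq.txt", "Refunds take 5 days."),
                       ("hours.txt", "We are open 9-5.")]:
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write(text)
    docs = load_directory(tmp)
    print([d["source"] for d in docs])  # → ['faq.txt', 'hours.txt']
```

In LangChain you would get a similar list of `Document` objects back from a loader and could then split, embed, and query them.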
With the new GPT-4, you should be able to provide a lot of information in the prompt window, which should make the basic approach much easier.
In case you feel like doing some techy work, you could build a context generator: run a sentence-transformer matcher over your documents, pick the passages closest to the question, and give that context to GPT to answer the original question.
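A sketch of that context-generator step, assuming you retrieve the best-matching passages first. A real setup would replace the simple bag-of-words similarity below with sentence-transformer embeddings (e.g. `model.encode` from the sentence-transformers library plus cosine similarity); the passages here are invented examples:

```python
import math
from collections import Counter

def bag_of_words(text):
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    common = set(a) & set(b)
    num = sum(a[w] * b[w] for w in common)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def build_context(question, passages, top_k=2):
    """Rank passages against the question and keep the best matches.
    Swap bag_of_words/cosine for sentence-transformer embeddings in practice."""
    q = bag_of_words(question)
    ranked = sorted(passages, key=lambda p: cosine(q, bag_of_words(p)),
                    reverse=True)
    return "\n".join(ranked[:top_k])

passages = [
    "Our refund policy allows returns within 30 days.",
    "The office is closed on public holidays.",
    "Refunds are issued to the original payment method.",
]
context = build_context("How do refunds work?", passages)
print(context)
```

The returned `context` string is what you would prepend to the GPT prompt together with the user’s question.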
For training there is currently no option. But you can transmit some information prior to the user request and thus get better-informed answers.
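Transmitting information prior to the request usually means shaping the message list sent to the chat endpoint, for example as a system message. A minimal sketch (the company and product facts are made up, and the actual API call is only indicated in a comment):

```python
# Prepend your own information as a system message so the model answers
# with that context. The background text here is a hypothetical example.
BACKGROUND = (
    "You are a support assistant for Acme Corp. "
    "Product facts: the Acme Widget ships worldwide and has a 2-year warranty."
)

def build_messages(user_question):
    return [
        {"role": "system", "content": BACKGROUND},
        {"role": "user", "content": user_question},
    ]

messages = build_messages("How long is the warranty?")
# This list would then be sent to the chat completions endpoint, e.g.
# a ChatCompletion request with model="gpt-3.5-turbo" and messages=messages.
print(messages[0]["role"])  # → system
```

The model treats the system message as standing instructions, so the user never sees the injected background directly.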
But I have an idea that might help you with your requirement. You can fine-tune other LLMs, such as davinci, on your data. So you could use this feature to do the following:
- Fine-tune Davinci on your information
- Let the user ask their question, pass it over to Davinci, and get the answer. Don’t send this back to the user; instead, via your backend, submit it together with the user’s request to the Chat API, and feed that final answer back to the user
So basically you intertwine the two APIs to provide the full functionality. The downside would be higher token usage, because you are doing some additional steps. But since you mentioned you’re using this for yourself, I thought it might be a viable solution.
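The backend flow described above can be sketched like this. The two callables stand in for the real API calls (a Completion request against your fine-tuned Davinci model, and a ChatCompletion request); the stub answers are invented for the demo:

```python
def answer_via_two_models(user_question, ask_davinci, ask_chat_api):
    """Chain the two calls: first query the fine-tuned completion model,
    then hand its answer plus the original question to the chat model.
    ask_davinci / ask_chat_api are placeholders for the real API calls."""
    knowledge = ask_davinci(user_question)           # step 1: fine-tuned model
    messages = [
        {"role": "system",
         "content": f"Use this background to answer:\n{knowledge}"},
        {"role": "user", "content": user_question},  # step 2: chat model
    ]
    return ask_chat_api(messages)

# Demo with stub functions in place of the real API calls.
fake_davinci = lambda q: "The warranty lasts 2 years."
fake_chat = lambda msgs: f"Answer based on: {msgs[0]['content'].splitlines()[-1]}"
reply = answer_via_two_models("How long is the warranty?",
                              fake_davinci, fake_chat)
print(reply)  # → Answer based on: The warranty lasts 2 years.
```

Injecting the two calls as parameters also makes the chain easy to test without spending tokens; in production you would pass thin wrappers around the real endpoints.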