Hello, fellow ChatGPT developers. I have a question about fine-tuning. The use case is to create a ChatGPT bot that is restricted to using a specific document for its answers. The document could be 10, 20, or even 50 pages long, and I want the bot to base its responses solely on the information in that document. I don’t want it to make up answers, hallucinate, or rely on any other knowledge it has, but strictly adhere to the document, while remaining accurate.
I’ve made custom GPTs before, where I uploaded a document or some text, but they didn’t work as well as I hoped. Is fine-tuning the correct process to ensure the bot only references this document and doesn’t try to invent or guess answers? If there is a better tool for this, could you let me know?
Also, if fine-tuning is the right tool, how much effort or work would be involved in getting the bot to strictly follow the document’s knowledge without incorporating outside information?
Fine-tuning is typically for behavioral changes. Sure, with enough tuning the model would eventually “learn” new facts, but that is a daunting task that leaves you with a black-box model that’s difficult to update later.
What you want is RAG (retrieval-augmented generation). You’ve already tried the general-purpose solution provided by OpenAI and it didn’t work. That’s okay; it’s a general-purpose solution, and sometimes it won’t live up to expectations.
You need to dig deeper into the concepts behind RAG (embeddings, vector databases) and understand *why* it’s failing. OpenAI’s built-in RAG solution is itself very black-boxed and hard to debug.
So, if you’re serious about it, I would recommend whipping up a free vector database (or knowledge graph, or whatever you want to call it) and starting to experiment.
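To make the embeddings-and-retrieval idea concrete, here is a minimal sketch of the retrieval half of RAG. It uses a toy bag-of-words “embedding” and cosine similarity so it runs with no API calls or database; in a real setup you would swap `embed()` for an actual embedding model and store the vectors in a vector database. The function names and sample chunks are purely illustrative.

```python
# Toy RAG retrieval: rank document chunks by cosine similarity to a query.
# embed() is a bag-of-words stand-in for a real embedding model.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in "embedding": lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Return the k chunks most similar to the query.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Refunds are processed within 14 days of purchase.",
    "Our office is open Monday through Friday.",
    "To request a refund, email support with your order number.",
]
print(top_k("how do I get a refund", chunks))
```

Experimenting at this level makes it much easier to see *why* retrieval fails: bad chunking, a query that doesn’t match the chunk wording, or too few chunks retrieved.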
Embeddings with retrieval are best suited for cases where you need a large database of documents with relevant context and information.
By default OpenAI’s models are trained to be helpful generalist assistants. Fine-tuning can be used to make a model which is narrowly focused, and exhibits specific ingrained behavior patterns. Retrieval strategies can be used to make new information available to a model by providing it with relevant context before generating its response. Retrieval strategies are not an alternative to fine-tuning and can in fact be complementary to it.
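The “providing it with relevant context before generating its response” step is just prompt assembly: once retrieval has picked the relevant chunks, you place them in the prompt along with an instruction to answer only from that context. A hedged sketch, where the system-prompt wording and the commented-out model name are illustrative choices, not a prescribed recipe:

```python
# Build a chat prompt that restricts the model to the retrieved context.
def build_messages(question: str, context_chunks: list[str]) -> list[dict]:
    context = "\n\n".join(context_chunks)
    system = (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

messages = build_messages(
    "How long do refunds take?",
    ["Refunds are processed within 14 days of purchase."],
)
# These messages would then go to a chat completion call, e.g.
# client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(messages[0]["content"])
```

This is also where the original question’s “don’t hallucinate” requirement lives: the instruction to refuse when the context lacks the answer reduces (though doesn’t eliminate) made-up responses.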
You can explore the differences between these options further in this Developer Day talk:
It’s actually simpler than the explanations make it seem. Here is one link I thought was better than average: NVIDIA: what is RAG
Scan down to “What is RAG”.
You can implement it using LangChain and I’ve done that, but I prefer doing it with my own code since LangChain is a black box that I can’t understand or control as well as my own code.
I’m kind of out of the mainstream since I did it in C#/.NET, using freely available NuGet packages (check the licenses, though) and a SQLite database for the chunked and indexed local documents.
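The post above describes a C#/.NET setup, but the same idea is easy to sketch in any language. Here it is in Python’s standard-library `sqlite3`: one table holding the chunked document text (in practice you would add an embedding column alongside). The table schema and the sample document name are assumptions for illustration.

```python
# Minimal SQLite store for chunked documents, sketched in Python.
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for persistence
conn.execute(
    "CREATE TABLE chunks (id INTEGER PRIMARY KEY, doc TEXT, content TEXT)"
)
for chunk in [
    "Refunds are processed within 14 days.",
    "Office hours: Mon-Fri.",
]:
    conn.execute(
        "INSERT INTO chunks (doc, content) VALUES (?, ?)",
        ("policy.pdf", chunk),
    )
conn.commit()

rows = conn.execute(
    "SELECT content FROM chunks WHERE doc = ?", ("policy.pdf",)
).fetchall()
print(len(rows))
```

A plain relational database works fine at this scale; a dedicated vector database only becomes worthwhile when the corpus grows large enough that brute-force similarity search is too slow.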
Hey, I get what you’re trying to do! Fine-tuning could help add more context to your model, but it probably won’t keep the bot strictly tied to your document. It might still pull in other info or “guess” responses. A better way would be using embeddings with a retrieval-augmented generation (RAG) setup. That way, the bot fetches answers directly from your document without relying on external knowledge. It’s easier to manage and more accurate for what you’re after. I hope that helps!
UMMM, call me a fool, but I have been doing this for over a year. I have built a GPT that pulls directly from the documents and even handles pinpoint questions about them. My question is: how do I deploy it as my own GPT, or sell it?