I am trying to create a custom gpt that always gives answers from documents, not from any other sources, and if the answer is not available in documents, it will say “Not in Documents.”.
I have a setup where Web browsing, code interpreter, and Dall-E image generation are not selected. And I have uploaded six files to the knowledge source.
I have assigned instructions as follows: “I’m here to provide answers directly from the documents you’ve provided or to clearly state “Not in the Documents” if the information isn’t covered in those files. My role is to focus on delivering precise information based on the documents, ensuring clarity and accuracy in the information shared. I’m restricted to using only the information within these documents without bringing in external data or context. If the answer to your question isn’t found within the documents, I’ll let you know by saying “Not in the Documents.” My goal is to assist you by sifting through the provided material to find the answers you’re looking for, making it easier for you to get the information you need without having to go through the documents yourself. In cases where a question’s answer is not explicitly mentioned in the documents, I’ll make it clear that the information isn’t available in the provided texts, and I won’t use any other information outside of the documentation.”
After all that, it’s not able to answer my question properly, and most of the time it answers “Not in the document,” where answers are available in documents.
I think the current formulation of your instructions is suboptimal. You are essentially repeating three times the same message/instructions with slightly different wording and in doing so seem to be overemphasizing the point that it should respond with “Not in documents” if the information is not available. You are basically steering the model towards this behavior.
The second observation is more general. What is the specific purpose or focus of your custom GPT? I think it would also help if you could narrow down the topic(s) it is supposed to handle and then link back what information in relation to the topic is covered in the different files you’ve uploaded. This will make it easier for the GPT to navigate through the documents and identify relevant information. At the moment it seems very open-ended.
Exactly like @jr.2509 mentioned, the instruction feel a bit repetitive and might be focusing the model towards answering “No Documents Found”
Another small tricks you could try →
If your documents are fixed, add some information about them in the prompt as well, a 1-2 line description that will allow a bit more of a targeted approach.
Spread out the prompt a bit, maybe like in a point like manner and try and not repeat the same instruction.
If possible, add a sample into the prompt, a question which can be answered using the documents in the KB
Lastly, just ensure that at the documents are atleast accessible by the GPT.
I had the same need, and I created it as follows (this is of course for a specific product I provided documentation for):
“This GPT assists users by finding and summarizing information from uploaded documents related to Dell EMC NetWorker version 19.10, a data backup application. It will extract key points, provide concise summaries, and answer specific queries about the content. When more documents are added for other products, it will adapt to include those as well. The GPT will aim to be accurate, concise, and helpful, ensuring that users quickly get the information they need without going through the entire documents themselves. All information in the documents is considered equally important. Communication will be technical and to the point. At the end of the answer, the source of the answer will be added”
At first it did not want me to end the answer with the name of the doc it found the info in, but it would give it if asked in the prompt right after the answer. Now it seems that that works too, automatically.
But I ran into another problem. I told it that I will upload docs for another product, I named the product, and told it that the new docs will have a keyword in the doc name. And that when I name that keyword, it should look into those documents only for the answer. This does not work. It fetches the answers from the previous docs (or maybe random, I did not test more than 2-3 times).
I ended up creating another customgpt with only the new docs, using the same instructions as above, that works well. I hope this helps