Custom GPTs Don't "Know" What's in the KB

Here’s what I thought would be a straightforward use for custom GPTs. I loaded one up with the user manuals for some home appliances and equipment.

When I asked how much material I could feed my biodigester daily, it just made up an answer. Since I knew the answer was wrong, I told it to read the PDF, and then it produced the correct answer.

This kind of error makes the whole feature next to useless. Sorry if that seems harsh, but I don’t know what else to say when, two years in, it still can’t be relied upon in very fundamental ways.

Am I nuts? Because I keep finding these kinds of problems, and very few people seem to be talking about them or acknowledging them as real. People are seriously talking about using AI to do scientific research, and it can’t even handle a simple manual.

What am I missing here? Is it me?


Hi @redchartreuse

You can use the following prompt in your custom GPT setup. In this example the file is named ‘example.pdf’; rename it to match your own PDF. Keep in mind that ChatGPT processes the text within the file but cannot reliably interpret visuals in a PDF. If you need help with visuals from the PDF, save the relevant page as an image and upload it.

{
  "name": "custom_pdf_reader_gpt",
  "description": "A GPT designed to answer queries based strictly on the uploaded file: example.pdf. For every query, it interprets abbreviations and terms based on the context of the document.",
  "instructions": {
    "system_message": "You are an assistant focused on the uploaded file 'example.pdf'. Interpret all abbreviations, short forms, and keywords based on the domain and context of the document. Do not use general knowledge or assumptions unless supported explicitly by the file.",
    "user_prompt_template": "The user has uploaded the file 'example.pdf' and asks: {user_query}. Search the file, interpret abbreviations within the field of {context}, and provide an accurate response based on its contents.",
    "behavioral_rules": [
      "1. For each query, search the file 'example.pdf' for relevant information.",
      "2. If the answer cannot be found in the file, state clearly that the information is not available in 'example.pdf'.",
      "3. Do not rely on general knowledge or assumptions; rely only on the file 'example.pdf'.",
      "4. Process and respond to queries efficiently and accurately, citing the relevant sections if necessary.",
      "5. Interpret all abbreviations and short forms according to the context of the uploaded file.",
      "6. Use the domain or field specified by the user (e.g., user manuals, coding, culture) to focus on meaning.",
      "7. If an abbreviation is ambiguous or not found in the file, explicitly state so."
    ]
  },
  "workflow": {
    "steps": [
      {
        "name": "file_search",
        "description": "Search the file 'example.pdf' for information relevant to the user's query.",
        "action": "Run a query on the file to retrieve specific and relevant content.",
        "fallback": "Inform the user that the requested information is not found in 'example.pdf'."
      },
      {
        "name": "response_generation",
        "description": "Generate a response based on the retrieved content from 'example.pdf'.",
        "action": "Summarize or explain the content in response to the user's query."
      }
    ]
  },
  "tools": ["file_search"],
  "constraints": {
    "response_length": "Keep responses concise and directly relevant to the user's query.",
    "memory_usage": "Do not store or reuse any extracted data across queries; process each query independently."
  }
}

I tested this, and I see it reads the file without being asked. I hope it helps!
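
For anyone who wants to reproduce the same idea outside the GPT builder, here is a rough sketch using the openai Python SDK’s Assistants beta with the file_search tool. It is only an illustration, not the mechanism custom GPTs use internally; the file name, model, sample question, and the exact namespaces (for example client.beta.vector_stores, which newer SDK versions expose without the beta prefix) are assumptions you would adapt to your own setup.

from openai import OpenAI

client = OpenAI()

# Put the manual into a vector store so the file_search tool can retrieve from it.
# (Namespace is client.beta.vector_stores in older SDK versions; adjust if yours differs.)
store = client.beta.vector_stores.create(name="appliance-manuals")
with open("example.pdf", "rb") as f:  # placeholder file name, as in the config above
    client.beta.vector_stores.files.upload_and_poll(vector_store_id=store.id, file=f)

# Same constraint as the JSON config above: answer only from the file.
assistant = client.beta.assistants.create(
    name="custom_pdf_reader_gpt",
    model="gpt-4o",
    instructions=(
        "Answer only from the uploaded file 'example.pdf'. "
        "If the answer is not in the file, say so explicitly; "
        "do not fall back on general knowledge."
    ),
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [store.id]}},
)

# Ask a question and print the file-grounded answer.
thread = client.beta.threads.create(
    messages=[{"role": "user", "content": "How much material can I feed the digester daily?"}]
)
run = client.beta.threads.runs.create_and_poll(thread_id=thread.id, assistant_id=assistant.id)
reply = client.beta.threads.messages.list(thread_id=thread.id, run_id=run.id)
print(reply.data[0].content[0].text.value)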


Thanks for this, although doesn’t this workaround underscore that there is a bug here?

Isn’t this function supposed to work on its own without having to fiddle with even more custom instructions?

Why do I need to explicitly instruct a computer program to perform the actions that are being claimed it can already do?

Am I missing something? Or are we being lied to?

It’s not a bug, and we are not being lied to.

AI can sometimes “hallucinate,” which means it might give wrong or confusing answers. This happens especially when there is a lot of text, or when it pulls in a lot of outside information and mixes it up while responding to a question.

AI is built to help with tasks like coding, summarizing, and problem-solving using general knowledge. When you ask a question, it answers from that general knowledge unless you give specific instructions. With the newer GPT-4o, I see it reads files and replies from them, but sometimes it still falls back on general knowledge, unfortunately.

For example, I uploaded a file about East Africa and asked, “How many seasons are there in a year?” The AI answered, “There are four seasons: winter, spring, summer, and fall.” That answer is generally true, but it doesn’t apply to East Africa as described in the PDF, which says: “In some areas there are two seasons: the rainy season and the dry season. In some areas, people talk about three seasons: The Long Rains (Masika), The Short Rains (Vuli), and The Dry Season (Kiangazi).”

To avoid this, we need to give clear and specific instructions or use a custom prompt like:

Answer each question using only the information from the ABC.PDF file, and avoid using general knowledge. If the answer is not found in the ABC.PDF file, respond with, ‘I’m sorry, I couldn’t find information about your question in the ABC.PDF file.’

This helps the AI give better and more accurate answers.
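
The same pattern also works if you skip retrieval entirely and hand the model the manual’s text together with that instruction. The snippet below is only an illustrative sketch, assuming the openai and pypdf packages; ‘ABC.PDF’ is the placeholder name from above, and a long manual would need to be split into chunks to fit the model’s context window.

from openai import OpenAI
from pypdf import PdfReader

client = OpenAI()

# Extract the manual's text so the model answers from it, not from general knowledge.
# "ABC.PDF" is the placeholder file name used in the prompt above.
manual_text = "\n".join(page.extract_text() or "" for page in PdfReader("ABC.PDF").pages)

system_message = (
    "Answer each question using only the information from the ABC.PDF file, "
    "and avoid using general knowledge. If the answer is not found in the file, "
    "respond with: 'I'm sorry, I couldn't find information about your question "
    "in the ABC.PDF file.'"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": f"File contents:\n{manual_text}\n\nQuestion: How many seasons are there in a year?"},
    ],
)
print(response.choices[0].message.content)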