Custom GPT Knowledge Versus External Actions

I created a Custom GPT for internal use at our organization, and for the most part it works well, but I’m running into some reliability issues with its knowledge. The GPT’s knowledge consists of two files containing summaries of items available in our products: a relatively small Word document (around 40 KB) and a much larger one (around 300 KB).

The GPT always answers accurately from the small file, but it has trouble finding information in the larger one. Most of the time it can only find information mentioned near the beginning of the large file and reports that it cannot find information mentioned near the end.

Interestingly, I can force the GPT to produce accurate information by prodding it during the conversation with statements like “Please try harder.” I also noticed that the GPT produces accurate information only when I see the spinner labeled “Searching Knowledge”.

This makes me think that OpenAI avoids invoking the retrieval function by default and expends the extra computation for a full retrieval only when the conversation demands it. I tried various prompting approaches to force the GPT to always use its provided knowledge, but I couldn’t make it work consistently.

Our internal users are not prompting experts, and the current user experience is poor and misleading.

I personally don’t see a way around this problem with the built-in retrieval function, so I was wondering whether you think implementing our own retrieval function and exposing it to the Custom GPT as an action would work better.

We would like to eventually productize this feature using the Assistants API, so I want to go as far as possible prototyping the assistant’s behavior with Custom GPTs before investing development time in a full custom solution.


Losing attention with big prompts is known to happen.

Have you tried breaking down the large file into smaller ones?

I hear they’re working on attention with bigger contexts, so it should improve eventually.

Thanks, PaulBellow, for your suggestion!
I agree that part of the issue may be attention-related, since I don’t run into this problem when I configure the Custom GPT with only the smaller file. I plan to try breaking the knowledge files into smaller chunks soon and see whether that works better.
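The splitting itself is easy to script. Here is a minimal sketch, assuming the Word documents are first exported to plain text; the ~20 KB chunk size and the overlap are arbitrary guesses for experimentation, not documented limits of the knowledge feature:

```python
def split_into_chunks(text: str, max_chars: int = 20_000, overlap: int = 500) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Prefers to cut at a paragraph boundary (blank line) inside each window,
    and overlaps consecutive chunks slightly so facts straddling a cut
    still appear intact in one chunk.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Back up to the last blank line in the window, if there is one.
            cut = text.rfind("\n\n", start, end)
            if cut > start:
                end = cut
        chunks.append(text[start:end])
        if end >= len(text):
            break
        # Step back by `overlap` characters, guarding against no progress.
        start = end - overlap if end - overlap > start else end
    return chunks
```

Each resulting chunk can then be saved as its own knowledge file.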

However, I suspect something else is at play here. The Custom GPT can in fact answer questions about the larger knowledge file correctly, but only when I see the loading spinner saying “Searching Knowledge”, and that never seems to happen on the first prompt to the Custom GPT. Hence, my hypothesis is that Custom GPTs work as follows:

  1. OpenAI summarizes the content of the knowledge base into a compact form that it feeds to the model as context on the first prompt (no true RAG).
  2. If the user explicitly asks for more information, it uses a RAG pipeline to extract more detailed knowledge from the files and provide more accurate answers.

This may be a good compromise for most users, but it is a poor user experience for what we are trying to do.

What I’m wondering is: if I provide my own retrieval function through an API as an action, will the Custom GPT use it consistently every time, or will I run into the same type of issue?
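For the prototype, the backend behind such an action can be quite small. Here is a minimal sketch of the kind of retrieval function the action could call; the `search_knowledge` name and the bag-of-words scoring are illustrative assumptions (a production version would likely use embeddings), not how the built-in pipeline works:

```python
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    """Lowercase and split text into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def search_knowledge(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Return up to top_k chunks sharing the most terms with the query."""
    query_terms = Counter(tokenize(query))

    def score(chunk: str) -> int:
        chunk_terms = Counter(tokenize(chunk))
        return sum(min(n, chunk_terms[t]) for t, n in query_terms.items())

    ranked = sorted(chunks, key=score, reverse=True)
    # Drop chunks with no overlap at all so the model isn't fed noise.
    return [c for c in ranked[:top_k] if score(c) > 0]
```

The function would sit behind a small HTTP endpoint described by the action’s OpenAPI schema, so every call it receives is an explicit action invocation rather than something the platform can silently skip.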