Problems with knowledge base

I’m working on a custom GPT whose knowledge base contains a single JSON file of 50,000+ lines. When asking it questions, I ran into quite a few problems.

  • I always had to make multiple attempts to get the right answer, which led to a lot of unnecessary over-engineering of the prompt instructions.
  • Many times the custom GPT just gets stuck searching its knowledge and eventually errors out.

Workaround:

  • I split the JSON file into 2 files and removed all the unnecessary prompt instructions. After that, the custom GPT answered questions much more clearly and with a better understanding of the knowledge.
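
For anyone who wants to try the same workaround, here is a minimal sketch of splitting one large JSON file into smaller ones. It assumes the top level of the file is a single object or array. The function name `split_json_file` and the `_partN.json` naming are hypothetical, chosen for illustration, and this is not necessarily how the split described above was done.

```python
import json
from pathlib import Path


def split_json_file(src: str, parts: int = 2) -> list[Path]:
    """Split a top-level JSON object or array into up to `parts` smaller files.

    Assumption: the top level is a dict or a list. Output files are written
    next to the source as <name>_part1.json, <name>_part2.json, ...
    """
    path = Path(src)
    data = json.loads(path.read_text(encoding="utf-8"))

    # Normalize to a list of items so objects and arrays are handled alike.
    is_dict = isinstance(data, dict)
    items = list(data.items()) if is_dict else list(data)

    size = -(-len(items) // parts)  # ceiling division
    outputs = []
    for i in range(parts):
        piece = items[i * size:(i + 1) * size]
        if not piece:
            break  # fewer items than requested parts
        out = path.with_name(f"{path.stem}_part{i + 1}.json")
        payload = dict(piece) if is_dict else piece
        out.write_text(json.dumps(payload, indent=2), encoding="utf-8")
        outputs.append(out)
    return outputs
```

Calling `split_json_file("knowledge.json", parts=2)` (with a hypothetical file name) would write `knowledge_part1.json` and `knowledge_part2.json` next to the original. Splitting on top-level entries keeps each entry intact, which seems preferable to cutting the file at an arbitrary line.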

Question:

  • I’m very curious to understand what happens to the knowledge files after uploading. Will they be broken down in some random way?

Internally, a vector DB is used to perform retrieval-augmented generation, and based on various checks it looks like they use Qdrant. It’s not clear, though, which chunking algorithm they use, etc.
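
Since the actual chunking is unknown, the following is only an illustration of one common approach in RAG pipelines, fixed-size character windows with overlap. It is not a claim about what the custom GPT backend does. The function names and the `size`/`overlap` values are assumptions made for this sketch. It shows how such a chunker can cut a JSON document in the middle of its structure.

```python
import json


def chunk_text(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Cut text into fixed-size character windows that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def is_valid_json(fragment: str) -> bool:
    """Return True if the fragment parses as JSON on its own."""
    try:
        json.loads(fragment)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical knowledge file content, serialized as one JSON string.
doc = json.dumps({"TypeA": {"props": ["x", "y", "z"]},
                  "TypeB": {"props": ["p", "q"]}})
chunks = chunk_text(doc, size=30, overlap=5)

for chunk in chunks:
    # Each chunk is a slice of the document and may start or end mid-object.
    print(repr(chunk), is_valid_json(chunk))
```

If the chunks are cut like this, a single chunk may contain a property without the type it belongs to, which would make structural questions hard to answer from retrieved chunks alone.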

I see, that’s interesting. I’m having a hard time working with JSON files, with all the network errors and frequently losing the connection to the knowledge base.

I’m experimenting with flat and hierarchical JSON structures, and it seems to have problems interpreting the data correctly in both cases. It was able to point out or summarize details about a specific property or type, but for some reason, when asked questions like “list out all the properties under a particular type”, it blanks out and eventually says “I don’t have that information in my knowledge base.”
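
To make the flat vs. hierarchical comparison concrete, here is a minimal sketch of one way to turn a hierarchical JSON structure into a flat one, where every key carries its full path. The `flatten` helper and the sample `TypeA`/`props` data are hypothetical and only stand in for the real file. Whether this layout actually helps retrieval is untested.

```python
def flatten(node, prefix: str = "") -> dict:
    """Flatten nested dicts/lists into a single dict keyed by dotted paths.

    Note: empty dicts and empty lists produce no entries in the result.
    """
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            flat.update(flatten(value, path))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            flat.update(flatten(value, f"{prefix}[{index}]"))
    else:
        # Leaf value (string, number, bool, or None): store it under its path.
        flat[prefix] = node
    return flat


# Hypothetical hierarchical data: types that own a list of properties.
hierarchical = {
    "TypeA": {"props": [{"name": "width"}, {"name": "height"}]},
}
print(flatten(hierarchical))
```

In the flat form each entry names both the type and the property, so a fragment of the file still says which type a property belongs to.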