Understanding the Algorithm Behind ChatGPT's Custom GPTs and Improving RAG Accuracy

ptrader · January 10, 2025, 8:27pm

I have a ChatGPT account, and I created a private custom GPT by uploading a few PDF files. The resulting accuracy is excellent.

However, when I use OpenAI’s API, perform my own chunking, and build a RAG system on the same PDF files, its accuracy is far lower than that of OpenAI’s custom GPTs.

Is there a way to find out how ChatGPT creates a RAG for custom GPTs so that I can replicate something similar?
What is the algorithm or process behind ChatGPT’s custom GPTs?

Thank you,

arata · January 10, 2025, 11:31pm

ChatGPT works similarly to what is described for Assistants’ file search.

https://platform.openai.com/docs/assistants/tools/file-search

The AI has a tool that it can call with a search query it writes itself, rather than embeddings being run automatically on the user input or the input context.
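
To make the tool-call pattern concrete, here is a minimal sketch, not OpenAI's actual implementation. The tool name `search_files`, its `query` parameter, and the toy keyword-overlap ranking are all assumptions for illustration; a real system would use vector or hybrid search behind the same interface.

```python
import json

# Hypothetical tool definition in the JSON-schema style used for function calling.
# The name "search_files" and its parameters are illustrative assumptions.
SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "search_files",
        "description": "Search the uploaded documents and return relevant passages.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query written by the model."},
            },
            "required": ["query"],
        },
    },
}


def search_files(query, chunks, top_k=3):
    """Toy keyword-overlap ranking standing in for a real vector or hybrid search."""
    terms = set(query.lower().split())
    scored = []
    for chunk in chunks:
        overlap = len(terms & set(chunk["text"].lower().split()))
        if overlap:
            scored.append((overlap, chunk))
    # Highest overlap first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def handle_tool_call(name, arguments_json, chunks):
    """Dispatch a tool call emitted by the model; arguments arrive as a JSON string."""
    if name != "search_files":
        raise ValueError(f"unknown tool: {name}")
    args = json.loads(arguments_json)
    return search_files(args["query"], chunks)
```

The point of the design is that the model decides when to search and what to search for, so the query can be rephrased or narrowed instead of being a raw copy of the user's message.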

The exact return format and the placement of the ranked chunks are not disclosed. One thing ChatGPT reserves for itself, though, is the use of source file names: it tells the AI which files are available to search (the number of files is limited in ChatGPT) and which file each returned chunk came from.
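
Since the real return format is not disclosed, the following is only a guess at how you might carry file names through your own pipeline. The chunk shape (dicts with `file` and `text` keys) and the `[rank] source: ...` layout are assumptions made up for this sketch.

```python
def format_results(chunks):
    """Render ranked chunks as text for the model, labeling each with its source file."""
    lines = []
    for rank, chunk in enumerate(chunks, start=1):
        lines.append(f"[{rank}] source: {chunk['file']}")
        lines.append(chunk["text"].strip())
        lines.append("")  # blank line between results
    return "\n".join(lines).rstrip()


def list_available_files(chunks):
    """Unique file names, in first-seen order, to advertise to the model."""
    seen = []
    for chunk in chunks:
        if chunk["file"] not in seen:
            seen.append(chunk["file"])
    return seen
```

The output of `list_available_files` could go into the system message so the model knows what it can search, while `format_results` would produce the tool's reply.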

The first step is ensuring you have high-quality document extraction. PDFs are a poor source format for producing text that an AI can comprehend, and entire companies are built around doing this well.
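
As a rough illustration of the cleanup that often helps after extraction, here is a sketch that assumes you already have raw text from some PDF library. The function names and the 800-character limit are arbitrary choices for this example, and real documents (tables, multi-column layouts, headers and footers) need considerably more than this.

```python
import re


def clean_extracted_text(raw):
    """Undo common PDF line-wrapping damage while keeping paragraph breaks."""
    # Rejoin words hyphenated across a line break: "docu-\nment" -> "document".
    # Caveat: this also joins genuine compounds that happen to break at a hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Blank lines mark paragraph breaks; single newlines are just wrapping.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


def chunk_paragraphs(text, max_chars=800):
    """Greedily pack whole paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph boundaries rather than at fixed character offsets keeps sentences intact, which tends to give the retriever more coherent chunks to rank.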

ptrader · January 13, 2025, 4:53am

Thank you so much, @arata, for referring me to file search.

It is exactly what I was looking for.
