You’re on the right track, but it works a bit differently. Let me explain:
- The PDF gets split into small text chunks (like a line from a paragraph)
- Each chunk is converted to an embedding and saved in a vector database alongside the original text
To better understand what embeddings are and how to use them, the documentation explains it well:
https://platform.openai.com/docs/guides/embeddings
- When a user sends a message, this message is also converted to an embedding
- The vector database is queried for the embeddings closest in distance to the user-message embedding (i.e. the chunks that are semantically most similar to the message)
- The text chunks belonging to the most similar embeddings are passed to the model as context, typically in the system prompt (see the sketch below)
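
Here's a minimal sketch of the indexing and retrieval steps in Python. It's not a production implementation: it assumes the official `openai` client with the `text-embedding-3-small` model, uses a plain in-memory NumPy array instead of a real vector database (in practice you'd use something like Pinecone, Chroma, or pgvector), and assumes the PDF text has already been extracted to a string (e.g. with pypdf):

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
EMBED_MODEL = "text-embedding-3-small"

def split_into_chunks(text: str, max_chars: int = 500) -> list[str]:
    """Very naive chunker: split on blank lines, then cap the chunk size."""
    chunks = []
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        while paragraph:
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
    return chunks

def embed(texts: list[str]) -> np.ndarray:
    """Embed a batch of texts; returns an (n, d) array of vectors."""
    resp = client.embeddings.create(model=EMBED_MODEL, input=texts)
    return np.array([item.embedding for item in resp.data])

# --- Indexing: done once per PDF ---
pdf_text = open("document.txt").read()   # placeholder: text already extracted from the PDF
chunks = split_into_chunks(pdf_text)
chunk_vectors = embed(chunks)            # stored alongside the chunk text

# --- Retrieval: done on every user message ---
def top_k_chunks(question: str, k: int = 3) -> list[str]:
    """Return the k chunks whose embeddings are most similar to the question."""
    q = embed([question])[0]
    # cosine similarity between the question vector and every stored chunk vector
    sims = chunk_vectors @ q / (
        np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q)
    )
    best = np.argsort(sims)[::-1][:k]
    return [chunks[i] for i in best]
```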
TL;DR: No, the whole PDF is not sent as context on every message. Instead, embeddings are used to identify the parts of the PDF related to the message, and only those parts are passed as context.
This technique is also known as RAG (Retrieval-Augmented Generation).
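
Continuing the sketch above (it reuses `client` and `top_k_chunks`), the final step of passing the retrieved chunks to the model as a system prompt could look roughly like this; the model name and prompt wording are just placeholders:

```python
def answer(question: str) -> str:
    """RAG: retrieve the relevant chunks, then ask the model with them as context."""
    context = "\n\n".join(top_k_chunks(question))
    system_prompt = (
        "Answer using only the following excerpts from the document:\n\n" + context
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("What does the document say about refunds?"))
```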