GPTs knowledge capacity limits

My frustration is that each time it goes back and ‘Searching my knowledge’, and that takes anywhere between 5 and 15 seconds. Painful after a while. I have been playing with instructions to stop it, but they don’t always work.

I have a relatively small text file: 117 KB.

Any tips on how to optimise your source file? I have a 60-page doc in Google Docs that is my ‘Master’, and from that I export a .txt file.

|Metric|Value|
|---|---|
|Pages|58|
|Words|14,887|
|Characters|105,274|
|Characters excluding spaces|91,784|
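For a rough sense of scale, a common rule of thumb (an approximation, not an exact count — roughly 4 characters or 0.75 words per token for English; the real figure depends on the tokenizer) puts a file of this size in the low tens of thousands of tokens:

```python
# Rough token estimate for the knowledge file described above.
# The 4-chars-per-token and 0.75-words-per-token ratios are rules of thumb,
# not exact tokenizer counts.
characters = 105274
words = 14887

approx_tokens_by_chars = characters / 4        # roughly 26k tokens
approx_tokens_by_words = words * 4 / 3         # roughly 20k tokens

print(round(approx_tokens_by_chars), round(approx_tokens_by_words))
```

Either way, the file is far too large to fit in the context window whole, which is presumably why retrieval kicks in at all.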

Played around with ‘regular’ GPT-4 to optimise the file. It can do some clever-looking things with a prompt, but I don’t always get a result, and mostly it fails while analysing:

Please process the attached text file for AI chatbot readability. The text should be cleaned of any headers, footers, and page numbers, and segmented into clear, readable sentences or bullet points. Don't remove any of the content. Pay special attention to keeping any URLs intact, as they are essential for the chatbot's reference. Format the output into a text file optimized for the chatbot's understanding.
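The mechanical part of that cleanup can also be done deterministically outside ChatGPT, which avoids the unreliability. A minimal sketch — the page-number pattern here is an assumption; adjust it to whatever your Google Docs export actually produces:

```python
import re

def clean_for_kb(text: str) -> str:
    """Strip page-number lines and collapse leftover blank runs,
    while leaving URLs and all other content untouched.
    The patterns are assumptions -- tune them for your own export."""
    cleaned_lines = []
    for line in text.splitlines():
        stripped = line.strip()
        # Drop lines that are only a page number, e.g. "12" or "Page 12"
        if re.fullmatch(r"(Page\s+)?\d+", stripped):
            continue
        cleaned_lines.append(line)
    # Collapse runs of blank lines left behind by removed headers/footers
    return re.sub(r"\n{3,}", "\n\n", "\n".join(cleaned_lines)).strip()

sample = "Intro text\nPage 3\nSee https://example.com/docs\n\n\n\nMore text"
print(clean_for_kb(sample))
```

Running this on the exported .txt before upload means the GPT only ever sees the cleaned version.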

It would be good to understand what others are thinking when it comes to KB optimisations to avoid the lengthy ‘Searching my knowledge’!

Best practice on formatting your source doc would be good, as I’m sure a good structure will deliver better prompt-to-output speed.

Thanks


Thanks for this share. Nice idea :ok_hand:

Am I wrong in assuming that it shouldn’t matter how large a file is, since it chunks the data and embeds it for vector search (knowledge retrieval)? It has to be the same functionality as the Knowledge Retrieval tool in the Assistants API.

The whole point of this is that it doesn’t overload the context and can semantically search the uploaded documents with ease. I created my own before Assistants came out, and it’s super fast to search.

I’d assume that OpenAI would have optimized chunking and retrieval, and maybe even made a better embedding model (high hopes) for their own knowledge retrieval product… I feel that it might be broken right now, to be honest; I just posted about this and it’s acting super weird.
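For anyone unfamiliar with the chunking step being described: before embedding, a retrieval pipeline typically splits the document into overlapping windows so each piece fits an embedding model. The sizes and overlap below are illustrative assumptions, not OpenAI’s actual parameters:

```python
# Sketch of the chunking step a retrieval tool presumably performs before
# embedding. chunk_size and overlap are illustrative assumptions only.
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size windows that overlap, so a fact straddling
    a chunk boundary still appears whole in at least one chunk."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "x" * 2000
pieces = chunk_text(doc)
print(len(pieces), [len(p) for p in pieces])
```

Each chunk would then be embedded once at upload time, which is why file size alone shouldn’t cause slow searches — only the query embedding and a similarity lookup happen per request.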


I agree! The human experience is not replaceable at all. It seems to me our virtual assistants are getting better!

For me it seems that, with GPT Builder + Knowledge, it loads the file on the fly and only reads parts of it (I’m using a markdown file with 13k lines). In my tests, it will sometimes read only the first 500 lines and then say that it couldn’t find the information I was looking for. In other scenarios, it will read the full file but try to find the keywords the user was looking for:

Prompt: Please show me the code for classic perlin noise from lygia

# Let's read the content of the uploaded file to find information about Classic Perlin Noise.
with open('/mnt/data/lygia.md', 'r') as file:
    lygia_content = file.read()

# Search for the section about Classic Perlin Noise in the content
perlin_noise_section = None

if 'Classic Perlin Noise' in lygia_content:
    start_index = lygia_content.find('Classic Perlin Noise')
    end_index = lygia_content.find('\n\n', start_index)
    # find() returns -1 when no blank line follows; fall back to end of file
    if end_index == -1:
        end_index = len(lygia_content)
    perlin_noise_section = lygia_content[start_index:end_index]

perlin_noise_section

This approach doesn’t make a lot of sense, as I don’t want it to do an exact text search; instead it should convert the user’s query into a vector and search a vector store for the content the user is looking for. I was expecting it to work like this.
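To make the contrast concrete, here is a toy version of the behaviour being asked for: embed the query and each chunk as vectors and rank by cosine similarity, so a paraphrased question still finds the right chunk even when no substring matches. A bag-of-words counter stands in for a real embedding model; everything here is illustrative:

```python
import math
from collections import Counter

# Toy semantic search: rank chunks by cosine similarity to the query vector,
# instead of an exact substring match. Bag-of-words is a stand-in for a
# real embedding model.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

chunks = [
    "classic perlin noise implementation in glsl",
    "simplex noise is a faster alternative",
    "voronoi patterns and cellular noise",
]
query = "show me classic perlin noise"
best = max(chunks, key=lambda c: cosine(embed(query), embed(c)))
print(best)
```

An exact `find('Classic Perlin Noise')` would have returned nothing for a query like “gradient noise from Ken Perlin”, while a vector ranking degrades gracefully.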

I tried to upload just 17 smallish PDF files and got “unable to save” errors until I deleted files down to 10 or fewer. Is this normal? What are the limits for a custom GPT?

Hi and welcome to the Developer Forum!

Yes, 10 does seem to be the file-count limit.

Thanks! And what is the upper limit on the file size? And is there a limit to the combined size of the 10 files?

I have another related issue: when folks talk to the GPT and it tries to give citations from my PDFs, it shows an error “Malformed citation 【Circular Economy.pdf†source]” (the PDF file was Circular Economy.pdf). Is there a way to provide the URL from which I generated the PDF? When I tried to feed it the URLs of my articles, it said it was unable to do so.

Try uploading a ZIP file with all your files; it may work. You can do a quick test by zipping a few files and asking GPT-4 to tell you what is in the ZIP file.

Not sure on the size. I can make a guess of 256 MB, from the fact that the Assistants API allows 20 files of 512 MB each and GPTs allow 10 files… at (maybe?) 256 MB, so basically half… total guess though.


You say the same thing over and over; we heard the first time 😭

I solved it by sending a query from the GPT through the API to my server.

The server searches the SQL database and returns data with context, and the GPT creates a response from it.
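A minimal sketch of that server-side lookup, using SQLite’s `LIKE` for the search. The table, column names, and rows here are made-up examples — the poster’s actual schema and search logic aren’t shown:

```python
import sqlite3

# Sketch of the server-side piece: the GPT calls an endpoint, the server
# runs a SQL search, and matching rows come back as context for the answer.
# Schema and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kb (title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO kb VALUES (?, ?)",
    [("Perlin noise", "Classic Perlin noise is a gradient noise."),
     ("Voronoi", "Voronoi diagrams partition the plane.")],
)

def search_kb(query: str, limit: int = 3) -> list[tuple[str, str]]:
    """Return (title, body) rows whose title or body matches the query."""
    like = f"%{query}%"
    return conn.execute(
        "SELECT title, body FROM kb WHERE title LIKE ? OR body LIKE ? LIMIT ?",
        (like, like, limit),
    ).fetchall()

print(search_kb("Perlin"))
```

Plain SQL keyword search like this works well when users query with predictable terms; a vector store earns its keep when queries are paraphrased rather than literal.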


Yes, I think they might expand the 10-file cap. By the way, if you try to upload small .txt files after hitting the 10-file cap, it seems you can fit a couple more files in. Also, compressing the PDFs helps a ton, reducing their size by about 50%.


Can you tell us more about using an SQL database for this type of application? It seems most are using a vector database for this type of operation, but perhaps that isn’t really needed in some cases?

What are the token/character limits?

I did upload a single PDF file of around 200k characters and 28k words; it is the Gradio documentation.

I agree on the 10-file limit; I cannot get beyond it. Since Wikidata is not yet a GPT, I looked at the latest Wikidata dumps. There is a gzipped 12 GB file, but I imagine that is too large, and ChatGPT states that it is not trained on Wikidata.

Could you upload your files to a personal website and point your GPT to that website to search for the information?


There is a workaround! You can archive your files and upload them as a ZIP. Enable Code Interpreter, then give it a prompt such as: “I have enabled Code Interpreter; unzip filename, deeply analyze all the data you find, store it as knowledge, and update the GPT.”
There is one problem: it will summarize the data from your files rather than storing them in full, probably because of a memory limit. Enjoy.
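Building the archive itself is the easy half of this workaround. A small sketch with the standard-library `zipfile` module — file names and contents here are examples:

```python
import io
import zipfile

# Bundle several knowledge files into one archive so only a single upload
# slot is used. Names and contents are placeholder examples.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("notes1.txt", "first knowledge file\n")
    zf.writestr("notes2.txt", "second knowledge file\n")

# Verify the archive lists the files we packed
with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
print(names)
```

`ZIP_DEFLATED` also compresses the contents, which lines up with the earlier tip that shrinking files helps with upload errors.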


Yes, that’s my experience: 10. I’ve tried all PDFs and all .txts, and had generation errors with all PDFs.

But if you mix and match .txt with .pdf, the only balance that worked for me was 8 .txt and 2 .pdf. Any increase in PDFs and it crapped out over and over.

Hey @brisklad, is there any size limit for the ZIP file?