GPTs knowledge capacity limits

Great! This method really works and can break through the limitations. I uploaded three files totaling 350 MB to the GPT's knowledge repository. But it is unclear how much impact it will have on the accuracy of Q&A.

1 Like

I am amazed that I'm seeing these sorts of statements after Dev Day. This issue was resolved some time ago. You can still use the chat completions API with the existing models to accomplish exactly what you want.

It seems to me that the GPT and Assistants API technology was not designed for your use case. It’s like trying to fit a square peg into a round hole.

This is how you can do it today: https://www.youtube.com/watch?v=Ix9WIZpArm0&ab_channel=Chatwithdata

I suspect at some point in the future, you will be able to do it with the Assistants API. But it doesn’t look like you can do it today.
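For anyone who wants to try that route, here is a minimal sketch of the idea (retrieval over your own files plus a chat completion), assuming the current OpenAI Python SDK; the folder name, chunk size, and model names are placeholders, not a recommendation:

```python
# Minimal sketch of "chat with your own files" via the chat completions API:
# embed local text chunks, pick the closest ones to the question, and pass
# them as context. Folder, chunk size, and model names are placeholders.
from pathlib import Path

import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Load and naively chunk a few local text files (hypothetical folder).
chunks = []
for path in Path("my_docs").glob("*.txt"):
    text = path.read_text(encoding="utf-8")
    chunks += [text[i:i + 2000] for i in range(0, len(text), 2000)]

def embed(texts):
    """Return one embedding vector per input string."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

question = "What does the contract say about renewal terms?"
chunk_vecs = embed(chunks)
q_vec = embed([question])[0]

# Rank chunks by cosine similarity and keep the top three.
scores = chunk_vecs @ q_vec / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec))
top = [chunks[i] for i in np.argsort(scores)[-3:]]

# Ask the model to answer from the retrieved excerpts only.
reply = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {"role": "system", "content": "Answer only from the provided excerpts."},
        {"role": "user", "content": "Excerpts:\n" + "\n---\n".join(top) + f"\n\nQuestion: {question}"},
    ],
)
print(reply.choices[0].message.content)
```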

1 Like

Max 10 files can be loaded.

Size restriction: 512 MB per file.

For images: 20 MB per file.

Additionally, there are usage caps:

Each end user is capped at 10 GB.

Each organization is capped at 100 GB.

This blog from OpenAI gives more information:

Blog from OpenAI giving limit on file upload

3 Likes

Anything beyond 10 files leads to an "Error Saving GPT" message; this limit needs to increase, with or without assistance.

What use case do you have that 10 huge text files can’t meet your needs?

We are developing a bespoke GPT system tailored for our high-level SLA support and development team. This system will be integrated with a comprehensive understanding of our unique source code, knowledge base, feature request documents, and meeting minutes. This integration will empower the support team to swiftly access pertinent information, taking into account the context derived from these resources. While the individual files are not significantly large (roughly 100 KB to 2 MB each), there are hundreds of them.
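One workaround for the 10-file cap, sketched below under the assumption that the sources are plain text or markdown (the paths and bundle count are made up for the example), is to concatenate the hundreds of small files into at most 10 combined uploads, each still far below the 512 MB per-file limit:

```python
from pathlib import Path

# Hypothetical paths; adjust to your own layout.
SOURCE_DIR = Path("knowledge_source")   # hundreds of 100 KB - 2 MB files
OUTPUT_DIR = Path("knowledge_bundles")  # at most 10 combined files
MAX_BUNDLES = 10

OUTPUT_DIR.mkdir(exist_ok=True)

files = sorted(SOURCE_DIR.glob("*.md")) + sorted(SOURCE_DIR.glob("*.txt"))
bundles = [[] for _ in range(MAX_BUNDLES)]

# Round-robin the files across the bundles so each stays a manageable size.
for i, path in enumerate(files):
    bundles[i % MAX_BUNDLES].append(path)

for n, bundle in enumerate(bundles, start=1):
    with open(OUTPUT_DIR / f"bundle_{n:02d}.md", "w", encoding="utf-8") as out:
        for path in bundle:
            # Keep the original file name as a heading so the model can cite it.
            out.write(f"\n\n# Source: {path.name}\n\n")
            out.write(path.read_text(encoding="utf-8", errors="ignore"))
```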

If they’re small, maybe it makes more sense to use actions or functions to call the files you need for the context? That sounds like a very advanced project; unfortunately, these beta-stage tools might not be up to the task yet.
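As a rough illustration of the function route, here is a minimal sketch using the OpenAI Python SDK's tool calling; the lookup_document helper, the docs/ folder, and the model name are hypothetical:

```python
# Minimal sketch of exposing file lookups as a tool call with the OpenAI
# Python SDK (v1). lookup_document, the docs/ folder, and the model name
# are placeholders for this example.
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def lookup_document(name: str) -> str:
    """Hypothetical helper: return the text of one small knowledge file."""
    with open(f"docs/{name}", encoding="utf-8") as f:
        return f.read()

tools = [{
    "type": "function",
    "function": {
        "name": "lookup_document",
        "description": "Fetch the contents of a named internal document.",
        "parameters": {
            "type": "object",
            "properties": {"name": {"type": "string", "description": "File name, e.g. sla.md"}},
            "required": ["name"],
        },
    },
}]

messages = [{"role": "user", "content": "What does the SLA say about response times?"}]
first = client.chat.completions.create(model="gpt-4-turbo", messages=messages, tools=tools)
call = first.choices[0].message.tool_calls[0]  # assumes the model chose to call the tool

# Run the requested lookup and hand the result back so the model can answer.
messages.append(first.choices[0].message)
messages.append({
    "role": "tool",
    "tool_call_id": call.id,
    "content": lookup_document(**json.loads(call.function.arguments)),
})
final = client.chat.completions.create(model="gpt-4-turbo", messages=messages, tools=tools)
print(final.choices[0].message.content)
```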

2 Likes

I attempted this solution, but encountered a bug. I’ve shared the problem on Discord, but so far, I haven’t received any advice on resolving it.

A GPT Action where the API, per its manifest file, is set up for POST requests on a specific endpoint is instead receiving GET requests from OpenAI.

UPDATE: this bug seems to have been fixed now.
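While debugging something like this, it can help to have the Action endpoint log the incoming method instead of failing silently. Below is a minimal sketch, assuming a Flask server; the /search path is made up and should match whatever your Action's OpenAPI schema declares:

```python
# Minimal sketch of a debugging endpoint for a GPT Action, using Flask.
# The /search path is hypothetical; point your Action's OpenAPI schema
# at whatever path you actually expose.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/search", methods=["GET", "POST"])
def search():
    # Log the method so you can see whether the Action sends GET or POST.
    app.logger.warning("Action hit /search with method=%s body=%s",
                       request.method, request.get_data(as_text=True))
    if request.method != "POST":
        return jsonify(error="expected POST"), 405
    payload = request.get_json(silent=True) or {}
    return jsonify(results=[], echo=payload)

if __name__ == "__main__":
    app.run(port=8000)
```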

This sounds like something that needs building from base API parts into a customised application; I do not see an easy fit with any of the no-code solutions currently offered.

1 Like

Not that this is y’all in this chat, but a lot of no-coders are finding out they need to become coders. Huge learning curve though; IDK how people successfully wade into this industry without years of experience already under their belt. I think tackling these large data-retrieval projects could be super useful for practically learning what one would need to develop a successful program, but it’ll take a loooong time to get it right if one starts from the basics. Doable perhaps with GPT-4 helping along… Be patient and learn Python, Poetry, and PyPI, incorporate a little Aider and AutoGen or Assistants, and then POOF, we’re cookin’ with that db.

3 Likes

A lot of no-coders finding out they need to become coders.

I’m one of them lol. I only knew Scratch and Lego Robotics 3 months ago; now I’m building dashboards and basic interfaces to interact with the API, as a sloppy replacement for GPT-Enterprise.

All with the help of GPT-4!

Obviously tons left to learn, but I’m leading the AI revolution within our company.

I love these things :grin:

6 Likes

I’m with you on this and following the answers that come. Thank you for your post.

1 Like

I, as well, am realizing I need to learn Python, TSS, and JavaScript. Definitely late to the game, but I own an IT business, so I need it. I also have several patents that I need an assistant for, as well as several other chatbots I need to create that will be extremely useful to people, plus trying to learn how to monetize this. LOL, prayers for all of us.

1 Like

Several of my GPTs still will not draw on all of the knowledge contained within the supporting documentation that I’ve provided. I would publish the GPT for everyone to see, but it’s not going to happen for now. I have uploaded all PDFs, and they are all well under the limit.

Is there a specific prompt or a means of logistical organization that would encourage the GPT to search all of the docs? I can ask it specific questions and it will tell me that it can’t find it, while I’m simultaneously looking at the information in the PDF.

Frustrating.

Did anybody have success with adding a large (> 1,000 entries) table, spreadsheet, or similar data structure to a GPT? Some other format? JSON? Whatever works.

No luck so far; the GPT often ignores the data altogether, and after “searching knowledge base” it only finds some fragments.
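One thing worth trying, sketched below (the file names and the 500-rows-per-chunk figure are arbitrary assumptions, not tested guidance), is splitting the big table into small JSON files of keyed records so every value carries its column name:

```python
# Minimal sketch: split one big CSV into smaller JSON chunks so each
# uploaded knowledge file stays small and self-describing.
import csv
import json
from pathlib import Path

SOURCE = Path("big_table.csv")      # hypothetical > 1,000-row spreadsheet export
OUT_DIR = Path("table_chunks")
ROWS_PER_CHUNK = 500                # arbitrary chunk size for this example

OUT_DIR.mkdir(exist_ok=True)

with SOURCE.open(newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))  # each row becomes a {column: value} dict

for n in range(0, len(rows), ROWS_PER_CHUNK):
    chunk = rows[n:n + ROWS_PER_CHUNK]
    out = OUT_DIR / f"rows_{n + 1}_{n + len(chunk)}.json"
    # Keyed records tend to retrieve better than a flat grid, since every
    # value is stored next to its column name.
    out.write_text(json.dumps(chunk, indent=2, ensure_ascii=False), encoding="utf-8")
```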

I am interested in giving it a try, I am just starting an AI company.

Have you tried using Papr Memory GPT? It’s unlimited; you can add as much data as you want and retrieve it for use inside ChatGPT.

https://chat.openai.com/g/g-KDTLacn4M-papr-memory

https://memory.papr.ai

If you are interested in building your own GPT, we can also provide you with API keys (WIP) so you can use Papr Memory with your GPT.

If we’re talking about Custom GPTs for regular subscribers, GPT says 5 files of 10 MB each. I’ve yet to test it. That’s a question you can ask when building a GPT on the regular subscription.

If you have a bunch of text, what is the recommended format for uploading it as Knowledge to the GPT? I have uploaded it as simple .txt files; however, the accuracy and speed of the GPT in retrieving that knowledge are spotty.
My assumption is that, properly implemented, the same info the GPT could find publicly via Bing web search can be answered faster when uploaded as knowledge. So far, that has not held true.

I uploaded several large Word documents to my GPT and asked a few questions that require the GPT to search more than one document and combine the search results. The GPT could not find relevant information: “After searching through the provided documents, I was unable to find specific information on ***. This suggests that the details you are looking for might not be contained within these documents, or it might be referenced under a different terminology or section that wasn’t identified in the search.” If the question can be answered based on only one document, the GPT seems to work fine.

Before GPTs, I tried the same documents and questions using the OpenAI API, a Pinecone vector DB, and LangChain retrieval. The index contains about 70k embeddings, each generated from a few hundred words. It is not very satisfactory, but it kind of works; at least it returns something every time. It seems GPTs still have a long way to go.
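For context, that kind of setup roughly corresponds to the sketch below, assuming the current OpenAI and Pinecone Python clients; the index name, metadata key, model names, and top_k are placeholders for whatever the real pipeline used:

```python
# Minimal sketch of embedding-based retrieval against a Pinecone index:
# embed the question, query the index, and answer from the retrieved chunks.
from openai import OpenAI
from pinecone import Pinecone

client = OpenAI()               # OPENAI_API_KEY in the environment
pc = Pinecone()                 # PINECONE_API_KEY in the environment
index = pc.Index("docs-index")  # hypothetical index holding ~70k chunk embeddings

question = "How do feature X and feature Y interact across documents?"

# Embed the question with the same model used for the stored chunks.
q_vec = client.embeddings.create(
    model="text-embedding-3-small",
    input=question,
).data[0].embedding

res = index.query(vector=q_vec, top_k=5, include_metadata=True)
# Assumes each vector was upserted with its source text under metadata["text"].
context = "\n\n".join(m.metadata["text"] for m in res.matches)

answer = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)
```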

2 Likes