- Maybe a bug on my end; could use some testers to confirm.
My GPTs stop following their instructions at around 750 words (~4,600 characters). About 20 words under that and they follow them; 20 words over and they act completely different, almost like a default chatbot.
- Limits on knowledge files
Not sure of the exact details, but GPT knowledge files seem to have limits similar to 3rd-party bots: a max of 10 document uploads, and a max file size of around 500 MB each, I think. There also seems to be a context-window problem when retrieving excessively wordy files. I don't know the exact limits, but 3rd-party bots typically stop working once a single document reaches around 50,000 lines of text; at that point they ignore the files altogether.
- I'm interested to know how many words or tokens we can have in a document and still have it used properly, so we can max it out.
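One way to test this systematically is to estimate a file's token count before uploading it. A minimal sketch, assuming the common ~4-characters-per-token rule of thumb for English text (a heuristic, not an official limit) and a made-up budget to test against:

```python
# Rough sizing of a knowledge file before upload.
# The ~4 chars/token ratio is a rule of thumb for English, not an
# official figure; the 100k budget below is a hypothetical placeholder.

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Estimate the token count of a document from its character length."""
    return int(len(text) / chars_per_token)

def within_budget(text: str, max_tokens: int = 100_000) -> bool:
    """Check a document against a hypothetical retrieval budget."""
    return estimate_tokens(text) <= max_tokens

sample = "word " * 750          # ~750 words, 3,750 characters
print(estimate_tokens(sample))  # → 937
```

Uploading progressively larger files and noting where retrieval starts failing would pin down the real limit.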
I'm no coder, but I had an idea to work around the document-size problem: have GPT A forward your prompt to GPT B, which finds ~5k tokens' worth of info in the documents specific to the user's request and passes it back to GPT A, which then searches and condenses those points within its limited context window. This could happen multiple times in order to retrieve info from large documents.
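The control flow of that relay can be sketched without any real model. In the sketch below, `retriever_gpt` and `condenser_gpt` are hypothetical stand-ins (in a real setup they would be API calls to the two GPTs); the naive character-based chunking can split lines and is only there to show the loop shape:

```python
# Minimal sketch of the two-GPT relay idea, with both models stubbed
# out as plain functions so the loop can run on its own.

def retriever_gpt(query: str, chunk: str) -> str:
    """Stand-in for GPT B: pull lines relevant to the query from one chunk."""
    return "\n".join(
        line for line in chunk.splitlines() if query.lower() in line.lower()
    )

def condenser_gpt(notes: list[str]) -> str:
    """Stand-in for GPT A: merge the retrieved notes, dropping duplicates."""
    seen, merged = set(), []
    for note in "\n".join(notes).splitlines():
        if note and note not in seen:
            seen.add(note)
            merged.append(note)
    return "\n".join(merged)

def relay(query: str, document: str, chunk_size: int = 5_000) -> str:
    """Chunk the document, retrieve per chunk, then condense the results."""
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    notes = [retriever_gpt(query, c) for c in chunks]
    return condenser_gpt(notes)
```

Repeating `relay` with refined queries would approximate the multi-pass retrieval described above.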
Or: Python can already do loads of things with text using APIs, but what if, instead of APIs, a chatbot helped it, so you could basically extract info from anywhere with code ready to go? The text would arrive in ChatGPT succinct and distilled, 1/3rd the size and 5x as dense in info, so the model could concentrate on retrieval instead of search, rather than wasting all its tokens in the process. Maybe a big company with a supercomputer could run the model over the whole net to start a kind of internet number 2 of condensed info for education?
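As a crude, purely mechanical illustration of the shrink-before-retrieval shape of this idea (real condensing would use a language model, not a word filter), here is a sketch that drops common filler words and duplicate sentences; the filler list is an arbitrary example:

```python
# Toy "distillation": strip filler words and duplicate sentences so less
# text reaches the model. Illustrative only; a real pipeline would use a
# model to condense meaning, not a stopword list.
import re

FILLER = {"the", "a", "an", "of", "to", "and", "is", "are", "that", "in", "it"}

def distill(text: str) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", text)
    seen, kept = set(), []
    for sentence in sentences:
        words = [w for w in sentence.split()
                 if w.lower().strip(".,!?") not in FILLER]
        line = " ".join(words)
        if line and line not in seen:  # drop empty and repeated sentences
            seen.add(line)
            kept.append(line)
    return " ".join(kept)

print(distill("The cat is in the hat. The cat is in the hat."))  # → cat hat.
```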
- Context window explained
The new GPT-4 preview (128k) obviously has 128k of input context, as expected, but what many don't know is that it still only has something like a 4k to 8k output limit.
We thought the 128k window would solve the problem of not being able to write full-length articles or summarise research papers, but no: because the output is so small, the 128k context window really only helps the model's ability to understand more for its tiny answers.
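The asymmetry is easy to put in numbers. Taking the figures from above (128k context, ~4k output, both approximate), the output is only about 3% of the window, so a long result has to be produced across multiple passes:

```python
# Hypothetical figures from the post: 128k context, ~4k max output.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT = 4_000
SUMMARY_LEN = 20_000  # e.g. a long multi-section summary, in tokens

print(MAX_OUTPUT / CONTEXT_WINDOW)           # → 0.03125 (~3% of the window)
print(-(-SUMMARY_LEN // MAX_OUTPUT))         # → 5 separate generations needed
```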
I really hope they recognise that we want the extra output; we'll need it to actually create better datasets for training better models cooperatively.