Hello, I’m using the API and I’m being hit with the error “Request too large for gpt-4o”.
My rate limit is 30K tokens per minute, but I’m getting this error in threads with only 4-5 messages, and I don’t understand where those 30K tokens could be coming from.
Even counting the prompt and all the documents I’ve given it, the total only comes to 8,777 tokens.
So where are the 30K coming from? Is it just writing a really, really long answer?
In the prompt I tell it multiple times to keep its answers brief, and based on other times it’s answered what I’m asking, the response should be around 100 tokens.
Every document search is another model call made internally. The backend then calls the model again after adding the document results to the thread, and each of those calls carries everything: all your past chat messages, the assistant’s tool calls and its responses to you, the tool definitions, the instructions and additional_instructions, and the results of past document retrievals. That’s a lot of tokens, and the AI may still decide it wants to call another search.
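To see how the numbers add up, here’s some back-of-the-envelope arithmetic for a run like the one described above. The per-item counts are hypothetical placeholders, except the chunk figures: file_search defaults to returning up to 20 chunks of roughly 800 tokens each.

```python
# Illustrative arithmetic (per-item counts are hypothetical): how a small
# thread balloons into tens of thousands of billed input tokens across the
# internal calls of a single Assistants run.

instructions = 1500            # instructions + additional_instructions
tool_definitions = 500         # file_search tool schema, sent on every call
thread_history = 2000          # your 4-5 messages plus prior assistant replies
search_results = 20 * 800      # default: up to 20 chunks of ~800 tokens each

base = instructions + tool_definitions + thread_history

call_1 = base                       # model reads thread, decides to search
call_2 = base + search_results      # model re-reads thread + search results
call_3 = base + 2 * search_results  # second search, results accumulate

total_input = call_1 + call_2 + call_3
print(total_input)  # 60000 tokens billed, well past a 30K TPM limit
```

So even though the visible content is under 9K tokens, the repeated re-sending of context plus retrieved chunks can multiply it several times over within one run.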
This is what burns through your 30k token-per-minute rate limit at tier 1, and you pay for every one of those internal calls even when the run fails before you get a final response.
gpt-4o-mini is given a higher rate limit, which makes it better suited to this kind of testing. You can also reduce the chunk size of your vector store documents, reduce the number of chunks returned, and set a similarity score threshold on file search.
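For reference, these mitigations map onto API parameters roughly as sketched below. The shapes follow the Assistants API; the specific values are example choices, not defaults.

```python
# Hedged sketch of the three mitigations above, as parameter dictionaries
# accepted by the Assistants API (values are example choices, not defaults).

# 1. Smaller chunks when adding files to a vector store
#    (passed as chunking_strategy when creating vector store files):
chunking_strategy = {
    "type": "static",
    "static": {
        "max_chunk_size_tokens": 400,  # default is 800
        "chunk_overlap_tokens": 100,   # must be at most half the chunk size
    },
}

# 2 & 3. Fewer chunks returned, plus a similarity threshold, set on the
#        file_search tool of the assistant or run:
file_search_tool = {
    "type": "file_search",
    "file_search": {
        "max_num_results": 5,  # default is up to 20 for gpt-4* models
        "ranking_options": {"score_threshold": 0.5},
    },
}
```

With settings like these, each search injects at most 5 × 400 = 2,000 tokens of retrieved text instead of the default 20 × 800 = 16,000.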
There will simply be threads that grow too large to interact with any more, though, especially without the controls that are only available through direct API calls (such as a run’s truncation_strategy). To get useful rate limits you need to have paid $50+ ahead of time and then waited out the holding period for tier elevation. There’s a link you can click.