Not enough tokens error, even though I've paid A LOT (maximum context length error)

I’m posting this in the developer forum out of exasperation with OpenAI’s customer service.

My interaction with OpenAI customer support has been beyond terrible. I understand it’s a startup, but I hope they’ll devote some time and attention to making customer service work for their customers.

Also, their customer service bot ends the conversation upon uploading a screenshot. So, it’s actually impossible to attach screenshots or provide documentation (see the below screenshot).

Basically, I’m getting the error message shown in screenshot 2, which says I’m about 3k tokens short of what the request needs. However, the OpenAI website (screenshot 3) says I have a $50+ balance, which is more than enough to cover those tokens.

Due to OpenAI customer service’s inability to understand the issue, I’m going to try my best to over-communicate and over-document. The problem is that OpenAI fails to deliver the services for which I’ve paid. I’ve tried my best to explain to them that I have paid OpenAI. But they seem unable to comprehend that, despite the fact that OpenAI has been charging my credit card for weeks (see screenshot 1) and the OpenAI website says I’ve paid (screenshot 3).

And my frustration continues, as ‘new users are only able to post one screenshot.’ I don’t understand why OpenAI seems so determined to make it impossible to report issues and bugs!!

Welcome to the forum.

For the model you’re using, the prompt + completion can only be 4097 tokens, and you’re sending 7k… either change to a 16k or 32k model if you have access or make your prompt shorter…

Good luck!


Resources to understand what you are doing wrong:


The 4097 token length is a technical limitation for the gpt-3.5-turbo model. The prompts plus the response cannot add up to more than that size; there’s not enough space in the model. It doesn’t matter whether you pay money or not.

If you want to run requests where the context plus the answer is bigger than this, you need to use a model with a larger context size. gpt-3.5-turbo-16k is one option; gpt-4 is another. Both of them are big enough for 7442 tokens.
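To make the point concrete, here is a minimal sketch of the check, not anything from OpenAI's library. The 4097 and 16385 limits are the figures quoted in this thread; the 8192 figure for gpt-4 and the helper name models_that_fit are assumptions for illustration.

```python
# Context window sizes in tokens.
# 4097 and 16385 are the numbers quoted in this thread;
# 8192 for gpt-4 is an assumption (the base gpt-4 limit at the time).
CONTEXT_WINDOW = {
    "gpt-3.5-turbo": 4097,
    "gpt-3.5-turbo-16k": 16385,
    "gpt-4": 8192,
}


def models_that_fit(total_tokens):
    """Return the models whose context window holds prompt + completion."""
    return [name for name, limit in CONTEXT_WINDOW.items() if total_tokens <= limit]


# The request in the error message needed 7442 tokens in total.
print(models_that_fit(7442))  # ['gpt-3.5-turbo-16k', 'gpt-4']
```

Account balance never appears in this check; only the total token count and the model's limit do.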


Thanks for the recommendations in this thread. I too have been looking for a solution to the problem described by the original poster for several days. Now it is clear to me how to solve it.

You need to use a model with a larger context size, gpt-3.5-turbo-16k, and in the code replace “max_tokens = min(max_tokens, 4096) # OpenAI token limit” with “max_tokens = min(max_tokens, 16385)”.

No, incorrect. max_tokens is the maximum size of the response you want to get back, not the model’s context length.

Or if you are unable to figure it out, just omit the parameter entirely.
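If you do want to set max_tokens, a rough sketch of the arithmetic might look like this. The function name completion_budget and the example numbers are hypothetical; the only thing taken from the thread is that the prompt and the response share one context window.

```python
def completion_budget(context_window, prompt_tokens, desired_max_tokens):
    """Cap max_tokens so that prompt + completion fits in the context window.

    All three arguments are token counts. Raises ValueError when the prompt
    alone already fills (or exceeds) the window.
    """
    remaining = context_window - prompt_tokens
    if remaining <= 0:
        raise ValueError(
            f"prompt uses {prompt_tokens} tokens but the window is {context_window}"
        )
    return min(desired_max_tokens, remaining)


# Hypothetical numbers: a 7000-token prompt on a 16385-token model.
# Asking for 16385 response tokens cannot fit; only 9385 are left.
print(completion_budget(16385, 7000, 16385))  # 9385
```

This is why min(max_tokens, 16385) does not help: the cap has to account for the tokens the prompt already uses.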
