GPT-4 not accepting more than 3,000 tokens even though a much higher limit is listed

Hello, I am having issues with GPT-4 when trying to generate larger outputs. When I request the full token amount (8,192 tokens), it returns a server error, but anything under 3,000 tokens works fine. I have tried very small prompts like “write a sentence” and it doesn’t matter; it always returns an error. Is there some limitation on token count right now?


Also, to clarify further: I have tried everything from 8K tokens all the way down to 3K tokens, and nothing in between works until I reach 3K tokens.

Get creative. “Write 3 paragraphs for 5 random topics, then repeat and generate 5 more topics until you reach 50 topics”


Surely you can do better than to bump two-month-old threads with advice unrelated to the issue.

This is clearly a case of the user not understanding that “max_tokens”, which limits the response, cannot be set as high as the model’s full context length, because that length is shared with the input during generation and is also required for the prompting provided, or that their application’s chat history was also consuming tokens.
Actually catching the API error would likely provide the answer.
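To make the arithmetic concrete, here is a minimal sketch (the 8,192-token window for GPT-4 is from the docs; the 5,000-token prompt is just an assumed example) of why an 8K `max_tokens` request fails once the prompt takes its share of the window:

```python
# The prompt and the completion share one context window, so
# max_tokens must leave room for the input tokens.
CONTEXT_WINDOW = 8192  # GPT-4's total context length

def max_completion_tokens(prompt_tokens: int,
                          context_window: int = CONTEXT_WINDOW) -> int:
    """Largest max_tokens value that still fits alongside the prompt."""
    remaining = context_window - prompt_tokens
    if remaining <= 0:
        raise ValueError("Prompt alone exceeds the model's context window")
    return remaining

# A hypothetical 5,000-token prompt leaves only 3,192 tokens for the
# completion, so requesting max_tokens=8192 would be rejected.
print(max_completion_tokens(5000))  # 3192
```

The API error message itself reports these numbers, which is why catching it usually answers the question directly.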


The issue is that GPT doesn’t keep any state/memory, so if you’re looking to create a complex response, you have to provide all of the system/user/assistant input every time. Otherwise, I’d love to get creative and overcome this issue.
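In practice, “no state” means every request must resend the whole conversation. A rough sketch of the bookkeeping (the message dicts follow the Chat Completions format; the helper and its contents are made up for illustration):

```python
# Because the API keeps no state, the client owns the conversation
# history and sends all of it with each request.
history = [{"role": "system", "content": "You are a helpful assistant."}]

def add_turn(history, user_text, assistant_text):
    """Append one completed user/assistant exchange to the history."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})
    return history

add_turn(history, "Pick a topic.", "Let's discuss tokenizers.")

# The next request sends the full history plus the new user message,
# and every one of these messages counts against the context window.
next_request = history + [{"role": "user", "content": "Continue."}]
print(len(next_request))  # 4 messages go out, not just the latest one
```

This is also why a growing chat history silently eats into the token budget available for `max_tokens`.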


You replied to someone who didn’t understand the initial concern, and answered neither the original question nor improved on the misunderstanding of the poster you replied to.

Hi, as this topic seems to be active again,

How are you counting 3,000 tokens? Which of the GPT models are you using? Can you please include a code snippet of the API calling code and any setup it relies upon?

For clarity: due to the small overhead used by the API and other internal systems, it is advised to treat 1K in this context as 1000, not 1024. Also, if you wish to count tokens accurately, please use the tiktoken library, which can be found here: GitHub - openai/tiktoken: tiktoken is a fast BPE tokeniser for use with OpenAI's models. You can test your text’s token count with the newer cl100k_base encoding here: Tiktoken Web Interface cl100k_base

Yesterday was my first day on here, so please cut me some slack. You might have missed your calling as a litigator, which is part of what I do for a living. Are you an admin, or do you work for OpenAI?

I just found humor in my prior response about someone bumping an old thread being immediately applicable again…

You can see my answer provided earlier: the symptom the asker reported four months ago is similar to your recent issue, namely confusion over the use of max_tokens.

I, like all here, am just an enthusiast. For those browsing these topics in the future, it is useful to point out “not an answer” answers.

I appreciate the gratitude you’ve given in the other thread. The only compensation I get for sitting down and typing what couldn’t be ferreted out of documentation is knowing someone was helped.

Likewise, I want to see others helped usefully.

If you perceive terseness, it is because I don’t consider myself an expert, only experienced, while you might actually be the expert in certain areas. Approaching the conversation as I would with another programmer and developer like myself can also mean assuming you don’t need the basics or background, explanations which could otherwise come across as patronizing or condescending.

The AI doesn’t see any “broken English”… chat share

Here’s an emoji to let you know we are comrades :grin:


Thanks, I appreciate your help yesterday.



I am a Community Champion, I help champion the needs and pain points of developers so that OpenAI can optimise their time and resources. I do not work for OpenAI, I consult for clients who wish to understand how they might be affected by AI, and OpenAI’s services in particular.

Consider your slack, cut.


Thanks for your query. I thought that max_tokens included both the user’s prompt and the response because of how the OpenAI docs define it. J informed me that it covers just the output, so I wasn’t leaving enough context length for the prompt input. This was accurate, and I was able to resolve it by resetting the max_tokens parameter in the API call.