Trying to understand why I'm hitting the token limit with the API

The limitation seems to come from heavy training, or perhaps even some injected governor, that compels the model to stop writing and wrap up its output. The shortening also looks planned in advance: ask for 40 descriptions, and each one comes out at roughly half the length it would have at 20. Write a prompt that should deterministically turn lines of input into processed lines of output, and you will still be cut off arbitrarily.
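One practical workaround, which sidesteps rather than explains the behavior, is to split a large list request into several smaller ones, so each completion stays below the point where the model starts compressing. A minimal sketch; the prompt wording and the batch size of 10 are arbitrary assumptions, and the resulting prompts would each be sent as a separate API call:

```python
def batch_prompts(items, batch_size=10):
    """Split a long list of items into smaller prompts so the model
    writes full descriptions per batch instead of compressing all 40."""
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    return [
        "Write one detailed description for each of: " + ", ".join(batch)
        for batch in batches
    ]

items = [f"item-{n}" for n in range(1, 41)]  # the 40 things to describe
prompts = batch_prompts(items)               # 4 requests of 10 items each
```

This costs more round trips and repeats any shared context in every request, but in my experience per-item length is far more stable this way.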

22k tokens of input can be processed (and billed) almost instantly into a hidden state, because prefill runs in parallel thanks to attention masking, but generating the output tokens that follow requires sequential computation per token, which apparently they don't want you to pay for even when you're willing. And through the API you don't get a different model from the one now extensively trained to make ChatGPT less expensive for OpenAI.
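If the cutoff you're seeing is the hard token cap (the API reporting `finish_reason: "length"`) rather than the model wrapping up on its own, you can at least stitch a full answer together by re-requesting a continuation. A sketch under that assumption; `complete` here is a hypothetical caller-supplied function wrapping whatever client you use, returning the generated text and the finish reason:

```python
def generate_full(complete, messages, max_rounds=5):
    """Keep requesting continuations while the model stops on the
    token limit (finish_reason == "length"), up to max_rounds calls."""
    parts = []
    for _ in range(max_rounds):
        text, finish_reason = complete(messages)
        parts.append(text)
        if finish_reason != "length":
            break  # e.g. "stop": the model ended on its own
        # Feed the partial answer back and ask for the rest.
        messages = messages + [
            {"role": "assistant", "content": text},
            {"role": "user", "content": "Continue exactly where you left off."},
        ]
    return "".join(parts)
```

Note this doesn't help with the behavior described above: if the model is trained to wrap up, each continuation will also try to wrap up, and you pay for the repeated input context on every round.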