Output length of gpt-4o and gpt-4.5 is far shorter than expected for large input

Hi there, I have an input of about 20k tokens that I want 4o (2024-11-20) / 4.5 (preview-2025-02-27) to rephrase into a certain structure for me. The max output length is set to 16k tokens, and I expect the actual output to be at least 7k. However, the output is always around 1-2k tokens, no matter how much I encourage longer output in the developer message. What is the reason, and how should I solve it? Thanks!
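
To be concrete, the call looks roughly like this (a simplified sketch, not my exact code; the file name and prompt wording are placeholders):

```python
from openai import OpenAI

client = OpenAI()

with open("transcript.txt") as f:  # ~20k-token transcript, placeholder file name
    transcript = f.read()

response = client.chat.completions.create(
    model="gpt-4o-2024-11-20",
    max_tokens=16000,  # output cap is set high, yet completions still stop around 1-2k tokens
    messages=[
        {
            "role": "developer",
            "content": "Rephrase the transcript below into the target structure. "
                       "Do not shorten or summarize; keep every point.",
        },
        {"role": "user", "content": transcript},
    ],
)
print(response.choices[0].message.content)
```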

I am looking for solutions other than breaking the input into pieces, because I am wondering why the output length in a single run is far below max-tokens, given that input + max-tokens is also far below the max context size. Does the nature of GPT models lead to an underlying “true” max output length, regardless of the parameter we set? In other words, even when the task does not require much intelligence, does a large input (say, a 45-minute interview or 2-hour seminar transcript) force us to use an o-series model? (The truth is that I find the gpt-series outperforms the o-series on short inputs for my task, so switching to the o-series for large inputs is not a perfect solution, just a compromise.)

I have the same question. Is there any way to solve it?

OpenAI is really crushin’ it - in terms of crushing down the output length the model will produce. Right at about 1700 tokens.

You can try o3-mini and see whether it has suffered the same retroactive damage.

Or try Gemini Flash 2.5, which will write 10x the length without hesitation.

The huge, flashy context window has always seemed to be for input only. If OpenAI's training data never has the AI producing that many tokens, you'll never be able to prompt it into doing so. You might consider segmenting your input and having the model process each piece one at a time. This costs more overall, but you'll benefit from prompt caching on the shared prefix.
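
Something like this, as a rough sketch with the Python SDK (the chunk size, model, and instruction text are placeholders, not a recommendation):

```python
from openai import OpenAI

client = OpenAI()

INSTRUCTIONS = (
    "Rephrase the following transcript segment into the target structure. "
    "Keep every detail; do not summarize."
)

def chunk_text(text: str, max_chars: int = 12_000) -> list[str]:
    # Naive fixed-size split; a real splitter would cut on speaker turns
    # or paragraph boundaries so each segment stays coherent.
    return [text[i : i + max_chars] for i in range(0, len(text), max_chars)]

def rephrase(transcript: str) -> str:
    parts = []
    for segment in chunk_text(transcript):
        # Keeping the same instruction prefix at the start of every request
        # is what lets prompt caching kick in (once the prefix is long enough).
        response = client.chat.completions.create(
            model="gpt-4o-2024-11-20",
            messages=[
                {"role": "system", "content": INSTRUCTIONS},
                {"role": "user", "content": segment},
            ],
        )
        parts.append(response.choices[0].message.content)
    return "\n\n".join(parts)
```

Each segment gets its own full-length completion, so the per-call output ceiling stops mattering for the overall result.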

You could also try fine-tuning. With a good dataset, you may even be able to drop to a “mini” and save on inference.
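
A minimal sketch of that route with the Python SDK (the file name, base model, and JSONL contents here are assumptions, not specific recommendations):

```python
from openai import OpenAI

client = OpenAI()

# Each line of the JSONL is one worked example, roughly:
# {"messages": [{"role": "system", "content": "Rephrase into the target structure."},
#               {"role": "user", "content": "<transcript segment>"},
#               {"role": "assistant", "content": "<long restructured output>"}]}
training_file = client.files.create(
    file=open("rephrase_examples.jsonl", "rb"),
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",
)
print(job.id, job.status)
```

The key is that the assistant turns in your training examples are as long as the output you actually want; that teaches the model your target length along with your target structure.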