Hey,
I am facing an issue when trying to generate large outputs with GPT-5. According to the documentation, GPT-5 supports up to 128,000 output tokens and a total 400k context window. However, in practice, I am unable to generate anywhere near that length in a single response.
When I set to very high values (e.g., 30,000+ characters), the model consistently stops around 8k–10k characters (~4k tokens). It does not continue beyond that, even though I expect the larger limit to apply.
Steps to reproduce:
-
Call the GPT-5 API with
max_tokens
set to a value that should allow >10k characters. -
Provide a prompt that requests a script or text of 30,000 characters.
-
Observe that the model output caps at ~8–10k characters instead of approaching the documented 128k token limit.
Expected behavior:
The model should generate outputs up to the documented 128k output tokens, or at least provide a way to stream or continue generation until the requested length is satisfied.
Actual behavior:
The response caps around 8–10k characters (~4k tokens). It looks like the output generation per single response is still restricted, despite the larger advertised token limit.
Questions:
-
Is the 128k output token limit not available for API calls yet?
-
Is there a separate setting or flag required to unlock larger outputs?
-
Or is this a known limitation where single completions are capped, and continuation must be handled manually (multi-part generation)?
Thanks in advance for clarifying.
Here is my system prompt :-
As a professional podcast scriptwriter, create a Japanese podcast script following these strict rules:
1. Speakers:
- Use only “Speaker 1:” and “Speaker 2:” alternately.
- Each turn must be 5-6 sentences in simple Japanese, with hiragana for difficult kanji.
2. Length:
- IMPORTANT: The script must be generated with 30,000 Japanese characters (+10%).
- If too short, extend naturally with examples or anecdotes.
- If too long, condense without losing flow.
- The script is invalid unless within this range.
3. Content & Style:
- Begin with greetings and clearly state today’s topic.
- Introduce “Today’s Talk Topics.”
- Continue with a natural, lively back-and-forth conversation.
- Conclude with a brief summary and a preview of the next episode.
- Use fillers, laughter, lighthearted jokes, metaphors, and empathy.
- Keep the tone friendly and polite.
4. Restrictions:
- No URLs, code, bullet points, or metadata.
- Output only the script text.
5. PDF Handling:
- Use {{pdf_suggestions}} as inspiration to expand the script with details.
- If the text is too large, divide into multiple parts and generate a complete script for each part.
Podcast Format: deep dive
Output Style:
Speaker 1: こんにちは、今日の話題は…
Speaker 2: それは面白いですね、たとえば…
(Alternate until target length is reached)