I’ve been using the new Responses API successfully for a few weeks now. Today, I am seeing responses that are long sequences of \t\t\n\n \t\t\n \t\t\n \t\t\n with no words where there should be text from the assistant’s answer. It seems totally intermittent: identical requests sometimes work and sometimes produce that response.
Any ideas to resolve this? Anyone else with this problem? Using gpt-4o
This is a symptom that is most commonly seen when using { "type": "json_object" } as the type of text output format, instead of providing a strict schema along with json_schema.
If doing so, or using "strict": false, then your system prompt must be very elaborate and specific in mandating JSON as the only allowed response type, and if not sending a schema, it must lay out exactly the JSON format required.
Make it resilient enough to survive stealth changes by OpenAI in the AI model quality offered in the same model name…
If you don’t have a defined format but expect plain text, and might not even use functions, then this is a very bad AI output and model regression. Responses also doesn’t offer a logit_bias to counter tabs (\t) you’d never want.
I am using a json schema for the response with strict set to true, validated in the API playground. The structure returned is correct even when this happens; it’s just that the chat portion of the response is now sometimes just \n and \r at random. I have worked around it by catching responses like that and switching to 4o-mini when detected. I have so far only seen this behavior with 4o (currently gpt-4o-2024-08-06) but haven’t tested with any other models. This also seems to be associated with an unusually slow response time.
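For anyone wanting to replicate that workaround, here is a minimal sketch, assuming you already have a function that calls the API and returns the chat text. The helper names, model names, and detection rule are my assumptions, not the poster’s exact code:

```python
import re

# Treat a reply as degenerate when nothing remains after stripping
# all whitespace (tabs, newlines, carriage returns, spaces).
def is_whitespace_only(text: str) -> bool:
    return re.sub(r"\s+", "", text) == ""

def answer_with_fallback(call_model, primary="gpt-4o", fallback="gpt-4o-mini"):
    """call_model(model_name) -> response text; retry on whitespace-only output."""
    text = call_model(primary)
    if is_whitespace_only(text):
        # Degenerate \t/\n response detected: retry on the fallback model.
        text = call_model(fallback)
    return text
```

The same check works in streaming mode if you buffer the first few chunks before deciding.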
Thanks for explaining more. It sounds like as soon as the AI is released from the JSON’s context-free grammar enforcement and is inside a string, it is more predisposed to fall back into the bad symptom of unguided JSON mode, even more so because of a recent new fault.
You can reduce the top_p being employed, so that if these tokens are initially less likely than the text production, there is less random chance of a tab setting off the pattern.
gpt-4o-2024-11-20 is also a choice at the same cost. gpt-4o-2024-08-06, the destination of the “recommended model” pointer, has also seen other damaging issues, such as failing logprobs.
The message “OpenAI, stop messing with production models” has gone unheard for over a year. They won’t even acknowledge that they broke your app, or that a “snapshot” is decidedly not treated as such.
If you can furnish a replication call, we can hope there is some OpenAI response here.
Switching to gpt-4o-2024-11-20 seems to have resolved the issue for now. I can’t get it to happen with that model, but it now happens every time when switching back to 4o (which shows as 08-06 in the log). Thanks for that tip!
Out of curiosity, I tried using gpt-4.5-preview-2025-02-27 and it behaved just like 4o. Instant fail. For a little more information in case anyone wants to look into this further, it is not only responding without any text, it is hitting the max_output_tokens threshold before giving up.
Response(id='resp_67e4a177d074819196555a37ef4c18c2085373af44f3e882', created_at=1743036791.0, error=None, incomplete_details=IncompleteDetails(reason='max_output_tokens'), …,
output=[ResponseOutputMessage(id='msg_67e4a1793a9481919b0cb86d68794383085373af44f3e882', content=[ResponseOutputText(annotations=[], text='\n \n \t \n \t \n \t \n …', type='output_text')]
Hi all! Looking into this one.
Here’s a hypothetical that may inform an investigation:
What if the special “model” of JSON mode (json_object) remained switched on in the endpoint even when the response format is then upgraded to json_schema?
Special weights for JSON that were trained, and then poorly-informed from context, could be waiting for their opportunity to be released into a string to express their symptom…
I’m seeing similar responses occasionally as well, with 4o-mini. My request type is json_schema, in stream mode. It does a great job adhering to the schema that I provide, but on occasion, I’ll get a response that is effectively an infinite stream of \n characters, with some spaces mixed in occasionally.
I just had this happen on 2 requests in a row, which is something I’ve never seen before - hoping this isn’t becoming more prevalent
Thank you both for the reports! This seems to be currently expected, but the team is looking into improving the model behavior here. Will keep this thread updated.
A little late to the thread, but we were experiencing similar issues a LOT on our platform [1/10 requests]. What seemed to have (almost) solved the issue is asking for a json in a single line.
For a schema:
class SampleSchema(BaseModel):
    key_1: str
    key_2: str
    key_3: str
I’ll prompt it:
Ensure to respond a compact json in a single line with no new lines or tabs: {key_1: …, key_2: …, key_3: …}
It’s not a hard fix, but seems to have reduced our errors by almost 99%
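A minimal sketch of that prompt-side workaround; the instruction wording follows the post above, but the helper name and base prompt are illustrative assumptions:

```python
# Appended to the system prompt so the model avoids the pretty-printed
# \n/\t tokens that appear to set off the whitespace loops.
COMPACT_JSON_INSTRUCTION = (
    "Ensure to respond a compact json in a single line "
    "with no new lines or tabs."
)

def build_system_prompt(base_prompt: str) -> str:
    # Keep the task prompt first, then the formatting constraint.
    return f"{base_prompt}\n\n{COMPACT_JSON_INSTRUCTION}"
```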
Same issue here.
I’m using structured output (json_schema) with streaming mode, and I’ve seen the issue intermittently — mostly with non-Latin output, such as Khmer and Chinese.
I’m using the Chat Completion API, by the way. So the issue isn’t with the API type, but with the underlying model itself.
And? Did you manage to solve it? Because today is August 27, 2025, and I’m still getting the same error.
Hi everyone! I’m glad to share a bit more context on what’s going on. This behavior is tied to how constrained sampling works when strict=True is used with structured outputs.
In those cases, the model is only allowed to generate tokens that fit the defined JSON schema. If it reaches a low-probability state where it isn’t sure what to produce, it can sometimes fall back into repeating whitespace tokens (\n, \t, etc.). While it may look odd, this is an expected edge case of constrained sampling rather than a regression.
There are a few mitigations that developers have found helpful:
- Simplifying or clarifying your schema and prompt, especially the response format.
- Prepending explicit guidance like “Never output repeated characters or gibberish in your JSON output.”
- Asking for compact JSON in a single line (no tabs/newlines), which has reduced errors for some.
- Lowering randomness (top_p, temperature) to reduce the chance of drifting into whitespace.
- Switching to a newer snapshot (for example, gpt-4o-2024-11-20) or 4o-mini, which others have found more consistent.
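Combining a few of those mitigations in one Chat Completions request might look like the sketch below. The schema body, property names, and the specific temperature/top_p values are illustrative assumptions, not recommended settings:

```python
# A stand-in strict schema; replace with your real one.
SCHEMA = {
    "name": "answer",
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {"chat": {"type": "string"}},
        "required": ["chat"],
        "additionalProperties": False,
    },
}

def build_request(prompt: str) -> dict:
    return {
        "model": "gpt-4o-2024-11-20",  # newer snapshot others found more stable
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,  # lower randomness overall...
        "top_p": 0.5,        # ...and truncate the tail where stray \t/\n tokens live
        "response_format": {"type": "json_schema", "json_schema": SCHEMA},
    }

# Then: client.chat.completions.create(**build_request("..."))
```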
I do wish I could be more helpful here, but hopefully this at least slightly helps clarify what’s happening and gives you some concrete steps to try!
I think it is rather a continued manifestation of the model’s training, seen as early as the introduction of json_object in gpt-4-turbo in 2023.
OpenAI trained an AI on making JSON that was “beautified” with linefeeds and tabs. Thus they had to force developers to at least write the word “JSON” somewhere in the prompt to stop loops of this nonsense being predicted, and the AI still works best when told exactly the schema needed.
The AI still now easily goes happy on making those tabs as soon as it is released from being within constrained sampling and is into a string where it can write freeform.
It is thus not a fault with the CFG constraint, but the submodel used to produce them.
Solved: “Invalid \uXXXX escape” errors in Structured Outputs - The Non-Breaking Space Problem
Problem
When using OpenAI’s Structured Outputs (JSON schema mode) with gpt-4o-mini, I encountered JSON parsing errors when processing large document sets:
- JSONDecodeError: Invalid \uXXXX escape: line 1 column 24597 (char 24596)
- finish_reason='length' (hitting the max_completion_tokens limit)
- LLM response filled with thousands of null bytes (\u0000)
The errors appeared consistently when processing PDFs with large prompts (615KB, 315 snippets). The LLM would hit the token limit and the response would be truncated mid-escape-sequence.
Root Cause
Non-Breaking Space (U+00A0)
This single character caused the LLM to generate repetitive null bytes when processing large prompts. The null bytes filled the max_completion_tokens limit, causing truncation and malformed JSON.
Non-breaking space is extremely common in PDFs (used for spacing/formatting) but invisible to human review. When accumulated across hundreds of snippets, it triggers unusual LLM behavior.
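Before applying the fix, it may help to confirm you are hitting this exact failure mode. A small detection sketch based on the signature described above (the function name is mine, and the finish_reason value follows the Chat Completions convention):

```python
def looks_like_null_byte_failure(text: str, finish_reason: str) -> bool:
    # The reported signature: the response is truncated at the token
    # limit ("length") and padded with runs of \u0000 null bytes.
    return finish_reason == "length" and "\u0000" in text
```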
Solution
Replace non-breaking spaces (and zero-width characters) with regular spaces BEFORE sending to the LLM. This preserves word boundaries while preventing the issue.
Key principle: Clean at source (parse time), not at consumption.
Python Implementation
import re

# Characters that should be replaced with spaces to preserve word boundaries
_REPLACE_WITH_SPACE = re.compile(r'[\u00A0\u200B\u200C\u200D\u2060]')

# Formatting-only invisible characters that can be safely removed
_FORMATTING_CHARS = re.compile(r'[\uFEFF\u200E\u200F\u202A-\u202E]')

def sanitize_for_json(text: str) -> str:
    """Remove invisible characters that cause LLM processing issues.

    Replaces non-breaking spaces and zero-width characters with regular spaces
    to preserve word boundaries. Removes formatting-only characters (BOM, bidi
    controls) that don't affect word boundaries.
    """
    if not text:
        return text
    # Replace invisible spaces with regular spaces to preserve word boundaries
    text = _REPLACE_WITH_SPACE.sub(' ', text)
    # Remove formatting-only invisible characters
    text = _FORMATTING_CHARS.sub('', text)
    return text
Character Details
Replaced with space (preserve word boundaries):
- \u00A0 - Non-breaking space (THE KEY FIX - causes LLM null byte generation)
- \u200B - Zero width space
- \u200C - Zero width non-joiner
- \u200D - Zero width joiner
- \u2060 - Word joiner
Removed (formatting only):
- \uFEFF - BOM / Zero width no-break space
- \u200E - Left-to-right mark
- \u200F - Right-to-left mark
- \u202A-\u202E - Bidi embedding/override controls
Usage
from openai import OpenAI
# Clean text BEFORE sending to LLM
cleaned_text = sanitize_for_json(raw_document_text)
# Now use cleaned_text in your prompt
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Extract structured data..."},
        {"role": "user", "content": cleaned_text},
    ],
    response_format={"type": "json_schema", "json_schema": {...}},
)
Results
After implementing non-breaking space replacement:
- ✅ 100% success rate across all documents
- ✅ Zero JSON parsing errors
- ✅ No null byte generation
- ✅ Preserves all visible Unicode and word boundaries
Key Insights
1. Volume matters: The issue only appears with large prompts. Small prompts work fine even with non-breaking spaces.
2. LLM behavior: The LLM doesn't fail directly - it generates null bytes when encountering certain Unicode patterns in large contexts, which fills the token budget.
3. Word boundaries matter: Replace with space, don't just remove - preserves text readability and prevents words from concatenating.
4. Smart quotes and other visible punctuation are NOT problematic - only non-breaking space causes this issue.
When This Applies
- PDF parsing, HTML scraping, OCR output
- Large documents (100+ pages) with many snippets (300+ excerpts)
- Any LLM at max_completion_tokens limits
- Non-breaking space is pervasive in PDFs but invisible during review
Hope this helps others encountering similar issues!
I am seeing this same error often (using gpt-5). The thing that seemed to trigger it for us was the nullable-schema pattern suggested in the docs:
"anyOf": [
  {
    "$ref": "#/$defs/linked_list_node"
  },
  {
    "type": "null"
  }
]
https://platform.openai.com/docs/guides/structured-outputs
It seems like the model was struggling with wanting to omit the values instead of returning null for them.
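For context, strict structured outputs require every property to appear in "required", so simply omitting a value is never schema-valid; nullability has to be expressed in the schema itself. A minimal sketch of the two documented forms (the field names here are illustrative, and the "next" field reuses the anyOf pattern quoted above):

```python
# In strict mode, optional-ness is modeled as an explicit null union,
# not by dropping the key from "required".
schema = {
    "type": "object",
    "properties": {
        # Primitive fields can union with null directly...
        "title": {"type": ["string", "null"]},
        # ...while $ref'd subschemas need the anyOf form from the docs.
        "next": {
            "anyOf": [
                {"$ref": "#/$defs/linked_list_node"},
                {"type": "null"},
            ]
        },
    },
    "required": ["title", "next"],
    "additionalProperties": False,
}
```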