Hi!
I’ve tried to switch back from GPT-4.1 to GPT-5 in an app that uses the Responses API.
The system prompt is not too long, about 7k characters, and the user messages are generally conversational and very short.
We are, however, passing previous_response_id, and we also use the file_search tool with vector_store_ids.
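For reference, our request looks roughly like this (a simplified sketch; the model name, response ID, and vector store ID below are placeholders, not our real values):

```python
def build_request(user_text, prev_response_id, vector_store_id):
    # Shape follows the Responses API: conversation history is chained
    # via previous_response_id, and file_search pulls in content from
    # the given vector store.
    return {
        "model": "gpt-5",
        "input": user_text,
        "previous_response_id": prev_response_id,
        "tools": [
            {"type": "file_search", "vector_store_ids": [vector_store_id]},
        ],
    }

req = build_request("Short conversational question", "resp_abc123", "vs_abc123")
```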
Very frequently, the Responses API responds with:
{
  code: 'context_length_exceeded',
  message: 'Your input exceeds the context window of this model. Please adjust your input and try again.'
}
This is confusing, since we’re not passing much context ourselves. So I assume the context is exceeded either because of file_search results or because too much history is being loaded via previous_response_id. But both of those are managed by OpenAI, so I’d expect context size to be handled naturally and the API not to load up too much file content or too much message history.
Is this a bug or is there something I can do?
Instead of previous_response_id, I could potentially manage history myself and pass only the last X messages, but if file_search is the issue, reimplementing that would be much more difficult.
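If I went that route, I imagine something like the following (a minimal sketch; the message format and the cutoff of 6 turns are assumptions on my part):

```python
def last_n_messages(history, n=10):
    """Keep the system prompt (if present) plus the last n turns.

    history: list of {"role": ..., "content": ...} dicts, with the
    system prompt, if any, as the first entry.
    """
    if history and history[0]["role"] == "system":
        return [history[0]] + history[1:][-n:]
    return history[-n:]

# Build a long hypothetical conversation to trim.
history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(20):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = last_n_messages(history, n=6)
# `trimmed` would then be passed as the `input` array on each request,
# instead of chaining history via previous_response_id.
```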
Thanks for any advice!