Hi!
I’ve tried to switch back from GPT-4.1 to GPT-5 in an app that uses the Responses API.
The system prompt is not too long, about 7k characters, and the user messages are generally conversational and very short.
We are, however, passing previous_response_id, and we also use the file_search tool with vector_store_ids.
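For reference, our request looks roughly like this (a simplified sketch; the model name, response ID, and vector store ID below are placeholders, not our real values):

```python
def build_request(user_text, prev_response_id, vector_store_id):
    # Shape follows the Responses API: conversation history is chained
    # via previous_response_id, and file_search pulls in content from
    # the given vector store.
    return {
        "model": "gpt-5",
        "input": user_text,
        "previous_response_id": prev_response_id,
        "tools": [
            {"type": "file_search", "vector_store_ids": [vector_store_id]},
        ],
    }

req = build_request("Short conversational question", "resp_abc123", "vs_abc123")
```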
Very frequently, the Responses API responds with:
{
  code: 'context_length_exceeded',
  message: 'Your input exceeds the context window of this model. Please adjust your input and try again.'
}
This is confusing, since we’re not passing much context ourselves. So I assume the context is exceeded either because of file_search results or because too much history is being loaded via previous_response_id. But both of those are managed by OpenAI, so I’d expect context size to be handled naturally and the API not to load up too much file content or too much message history.
Is this a bug or is there something I can do?
Instead of previous_response_id, I could potentially manage history myself and pass only the last X messages, but if file_search is the issue, reimplementing that would be much more difficult.
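If I went that route, I imagine something like the following (a minimal sketch; the message format and the cutoff of 6 turns are assumptions on my part):

```python
def last_n_messages(history, n=10):
    """Keep the system prompt (if present) plus the last n turns.

    history: list of {"role": ..., "content": ...} dicts, with the
    system prompt, if any, as the first entry.
    """
    if history and history[0]["role"] == "system":
        return [history[0]] + history[1:][-n:]
    return history[-n:]

# Build a long hypothetical conversation to trim.
history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(20):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = last_n_messages(history, n=6)
# `trimmed` would then be passed as the `input` array on each request,
# instead of chaining history via previous_response_id.
```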
Thanks for any advice!