GPT-4.1 Responses API stalls once conversation reaches ~350k tokens (started 11 Jul 2025), anyone else?

Since this morning (11 July 2025) we've noticed that many streaming requests to the Responses API with gpt-4.1-2025-04-14 freeze once the running conversation (including JSON returned by multiple function calls) hits roughly 300–400k tokens. Yesterday the exact same workflow sailed past 600k tokens without any issues.

We've found a temporary workaround: trim each tool response. For instance, returning the top 25 records instead of 100 keeps the total context under ~330k tokens, and the stall never occurs. A sketch of the trimming is below.
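In case it helps anyone reproduce the workaround, here's a minimal sketch of what we do before handing results back to the model. The 25-record cap, `trim_tool_output`, `submit_trimmed_result`, and the record source are our own (hypothetical) names, not anything from the SDK; the `function_call_output` input shape reflects our reading of the Responses API function-calling docs, so adjust if your SDK version differs.

```python
import json

from openai import OpenAI

# Assumption: 25 records per tool response is the cap that kept our
# total context under ~330k tokens. Tune for your own payload sizes.
MAX_RECORDS = 25


def trim_tool_output(records: list[dict], max_records: int = MAX_RECORDS) -> str:
    """Serialize only the first `max_records` records, flagging the
    truncation so the model knows more data exists."""
    payload = {
        "records": records[:max_records],
        "truncated": len(records) > max_records,
        "total_available": len(records),
    }
    return json.dumps(payload)


def submit_trimmed_result(client: OpenAI, prev_response_id: str,
                          call_id: str, records: list[dict]):
    """Continue the conversation with a trimmed function_call_output item.
    (Our understanding of the Responses API function-calling shape.)"""
    return client.responses.create(
        model="gpt-4.1-2025-04-14",
        previous_response_id=prev_response_id,
        input=[{
            "type": "function_call_output",
            "call_id": call_id,
            "output": trim_tool_output(records),
        }],
        stream=True,
    )
```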

Is anyone else seeing this behavior? It seems directly tied to long-context interactions with gpt-4.1 and the Responses API.

Thanks!