I tested my system with GPT-4.1, and everything worked normally across 2,000 interactions. I then switched to GPT-4o-mini. Every response under 10,000 tokens was returned correctly, but no response over 10,000 tokens was ever received, even though the billing records show those interactions as successful.
My network environment is admittedly unstable, as I was using unlimited mobile data for testing.
Has anyone encountered a similar issue?
Is there any way to investigate the root cause of this?
Would it be possible to split large responses to ensure successful reception?
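To make that last question concrete, one idea I am considering is reading the response as a stream instead of waiting for the whole body, so that a dropped connection at least leaves me with the part that already arrived. Below is a minimal sketch, assuming the client can hand me the text as an iterable of chunks (with the OpenAI Python SDK I believe this means passing `stream=True` and reading `chunk.choices[0].delta.content`). `collect_stream` is a hypothetical helper name of my own, not part of any library, and I have not confirmed that this addresses the root cause.

```python
from typing import Iterable, Tuple


def collect_stream(chunks: Iterable[str]) -> Tuple[str, bool]:
    """Accumulate text chunks and return (text_so_far, completed).

    `chunks` is assumed to be any iterable of text pieces, for example the
    delta contents of a streamed completion. This helper is hypothetical.
    """
    parts = []
    try:
        for piece in chunks:
            parts.append(piece)
    except (ConnectionError, TimeoutError):
        # The connection dropped mid-response: keep what already arrived
        # and report that the response is incomplete.
        return "".join(parts), False
    return "".join(parts), True
```

If `completed` comes back `False`, I could log how much text arrived before the failure, which might also help narrow down whether the problem is on the network side.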
I’d greatly appreciate any help analyzing this issue.
Thank you!