I am submitting up to about 18 questions through the API, one after another. Each question is about 7k tokens all in. The first 10 calls work great, but after about the 10th call I get a timeout on my HTTP request. I have my timeout maxed out at 120,000 ms, so I can't increase it. If I take the same question and run it in Postman, it works fine and pretty fast, usually under 20 seconds. If I rearrange the order of the questions, it still always times out on the 10th or so call. Has anyone seen this and know how to fix it?
Each question uses about 7k tokens, so it starts timing out at around the 70k-token mark.
Our tokens-per-minute rate limit for gpt-4 just went to 300,000 today.
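
For reference, here is a quick sanity check of the numbers above. It is only a rough sketch: the constant and function names are made up for illustration, and it assumes every call uses exactly 7k tokens and that all calls land inside the same minute, which would be the worst case for a per-minute limit.

```python
# Figures taken from the post; the names are hypothetical, for illustration only.
TOKENS_PER_CALL = 7_000   # "about 7k tokens all in"
TPM_LIMIT = 300_000       # tokens-per-minute limit for gpt-4
TOTAL_CALLS = 18          # "up to about 18 questions"


def cumulative_tokens(calls: int, tokens_per_call: int = TOKENS_PER_CALL) -> int:
    """Total tokens consumed after `calls` sequential requests."""
    return calls * tokens_per_call


def calls_within_limit(limit: int = TPM_LIMIT,
                       tokens_per_call: int = TOKENS_PER_CALL) -> int:
    """How many whole calls fit under the per-minute limit (worst case)."""
    return limit // tokens_per_call


if __name__ == "__main__":
    print("tokens after 10 calls:", cumulative_tokens(10))
    print("tokens after all calls:", cumulative_tokens(TOTAL_CALLS))
    print("calls that fit in one minute:", calls_within_limit())
```

By this estimate, the whole batch adds up to 126k tokens, and the failures start at around 70k, both of which are well under 300k per minute even if everything ran within a single minute.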