Rate limits in middle of stream

onk · May 24, 2023, 6:34pm

I’m seeing issues with streaming /chat/completions. I’m not using the Python package; we’ve written our client in Go, and it works great most of the time. We’re confident it’s not an issue with our setup. Sometimes, however, we seem to get nothing streamed back.

The request is legitimate and we get a 200 response back. The first 2 or 3 tokens will be sent, and then nothing. We set an 80 second timeout on the request, and what we’re seeing is a ~2 second delay till first token, then another token, perhaps a third token, and then nothing.

Our leading theory is that we’re being rate limited in the middle of the stream. I’d think not, since we got a 200 initially, but that’s our only lead right now.

Foxalabs · May 24, 2023, 6:40pm

I think I’d be tempted to spin up a piece of Python test code that replicates your request using the prebuilt openai lib. If you get the same issues, then you know it’s down to a busy server or network errors that need to be handled. If the issue goes away with the boilerplate library, you know there is some unhandled state in your code.

BrianLovesAI · May 24, 2023, 7:34pm

This topic is really good for me as I am facing a similar issue with GPT-4 on my company account. Sometimes, the streaming starts to print only one or two words, and then it completely gets stuck until the three-minute timeout is triggered by our backend server. I am not using Python or Go. Is there anyone else who has experienced this? Has anyone found a solution?

onk · May 24, 2023, 8:39pm

I really don’t think it’s an issue with how we’re handling it, but we can try that to be sure.

Foxalabs · May 24, 2023, 9:40pm

I suspect it’s just everyday data transport issues and perhaps some server side problems, but it’s wise to build up some sanity checks for a baseline.

BrianLovesAI · May 31, 2023, 12:56am

Just in case someone is facing the same issue:

For now, and maybe permanently, I have added two options with CURL.

CURLOPT_NOPROGRESS and CURLOPT_PROGRESSFUNCTION are set so I can check whether the stream is still making progress. If no data arrives for more than 5 seconds, I kill that connection.

Foxalabs · June 7, 2023, 4:15pm

Nice, and have you found that works reliably for you? I think I’d be tempted to up that 5 seconds to something more internet realistic, 30 seconds?

BrianLovesAI · June 8, 2023, 5:40am

Yeah, it is now pretty much stable. And nah, 5 seconds is enough. That is just for one word. One word taking 30 seconds is not acceptable.

CURLOPT_TIMEOUT - 180 seconds
CURLOPT_CONNECTTIMEOUT - 5 seconds
CURLOPT_NOPROGRESS / CURLOPT_PROGRESSFUNCTION - 5 seconds

Easy peasy
