Hello,
I am developing an application that calls the GPT-4 Chat Completions API. I have found that once 2 calls are in flight simultaneously, the OpenAI server appears to freeze: any further call waits until one of them completes.
Is there a way around this limitation?
A specific pro subscription, maybe?
I’d like to know as well. The limit reported by my account dashboard is still fairly small at 500 RPM (requests per minute), but that is far more than two concurrent calls. Also, I seem to be charged for calls that never complete.
I can do up to 20 parallel requests of significant size on Azure with GPT-4 32k.
Finally found a use for that too.
There are some enterprise-level options as well. You need to contact sales for those.
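For anyone who wants to check from the client side how many requests are actually running in parallel, here is a minimal sketch in Python. It caps concurrency with an `asyncio.Semaphore` and records the peak number of in-flight calls. Note the assumptions: `fake_chat_completion` and `run_bounded` are hypothetical names made up for this example, and the fake call just sleeps instead of hitting any real API. You would swap in your own async request function to test against a real endpoint.

```python
import asyncio


async def fake_chat_completion(prompt: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for a real chat completion request.

    It only sleeps and echoes the prompt; replace it with your own
    async API call when testing against a real service.
    """
    await asyncio.sleep(delay)
    return f"echo: {prompt}"


async def run_bounded(prompts, max_concurrency=20, call=fake_chat_completion):
    """Run `call` on every prompt with at most `max_concurrency` in flight.

    Returns (results, peak), where peak is the highest number of calls
    that were running at the same time.
    """
    semaphore = asyncio.Semaphore(max_concurrency)
    in_flight = 0
    peak = 0

    async def worker(prompt):
        nonlocal in_flight, peak
        async with semaphore:  # blocks while max_concurrency calls are active
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await call(prompt)
            finally:
                in_flight -= 1

    # gather preserves the order of the input prompts in its results
    results = await asyncio.gather(*(worker(p) for p in prompts))
    return results, peak


if __name__ == "__main__":
    prompts = [f"question {i}" for i in range(50)]
    results, peak = asyncio.run(run_bounded(prompts, max_concurrency=20))
    print(len(results), "responses, peak concurrency:", peak)
```

If total time with a real request function grows as though only two calls run at once, whatever `max_concurrency` is set to, the bottleneck is probably on the server or account side rather than in your client code.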