I am developing an application that calls the GPT-4 Chat Completions API. I have found that once 2 calls are in flight simultaneously, any further call appears to hang until one of them completes.
Is there a way around this limitation?
A specific pro subscription, maybe?
I’d like to know as well. The rate limit reported by my account dashboard is still fairly small at 500 RPM, but that is significantly more than two concurrent calls. Also, I seem to be charged for calls that never complete.
I can run up to 20 parallel requests of significant size on Azure with GPT-4 32k.
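For anyone who wants to cap parallelism on the client side, here is a minimal sketch of how I would bound the number of in-flight requests with an asyncio semaphore. Everything named here is an assumption for illustration: `fake_chat_completion` is a hypothetical stand-in for whatever async call your client library exposes, and `MAX_PARALLEL = 20` just mirrors the figure quoted above, not a documented limit.

```python
import asyncio

# Assumption: 20 mirrors the figure mentioned in this thread; tune it to your own limits.
MAX_PARALLEL = 20


async def fake_chat_completion(prompt: str) -> str:
    """Hypothetical stand-in for a real chat completion request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"reply to: {prompt}"


async def run_bounded(prompts, request_fn=fake_chat_completion, max_parallel=MAX_PARALLEL):
    """Run request_fn over all prompts with at most max_parallel calls in flight."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def worker(prompt):
        # Each worker waits here until one of the max_parallel slots is free.
        async with semaphore:
            return await request_fn(prompt)

    # gather returns results in the same order as the input prompts.
    return await asyncio.gather(*(worker(p) for p in prompts))


if __name__ == "__main__":
    replies = asyncio.run(run_bounded([f"question {i}" for i in range(50)]))
    print(len(replies))
```

To use it for real, swap `fake_chat_completion` for your own async request function. This only limits concurrency on your side; it does not raise whatever limit the server enforces.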
Finally found a use for that too.
There are some enterprise-level options as well. You need to contact sales.