I run a SaaS built on the Assistant API, and we have been seeing a lot of failures since yesterday.
Calls to the Assistant API fail with these messages:
You’ve exceeded the rate limit, please slow down and try again later.
The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID req_xxx in your email.)
A 429 status code (You’ve exceeded […]) is sometimes used to indicate that the server itself is globally limited, and may not necessarily reflect your personal limits.
I can’t say for certain, but judging by the forum posts there have been a lot of recent issues with Assistants.
Best bet is just to be patient. It’d be interesting to know where your calling host geographically resides.
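If the 429s and server errors really are transient, one way to "be patient" in code is to retry with exponential backoff and jitter. Here is a rough sketch, not an official client feature: `TransientError` and `retry_with_backoff` are hypothetical names I made up, and you would need to raise `TransientError` yourself wherever your own API wrapper sees a 429 or a 5xx.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a 429 or 5xx response from the API."""


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Run call() until it succeeds, waiting longer after each transient failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Exponential backoff capped at max_delay, with full jitter so that
            # many clients do not all retry at the same moment.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

You would wrap whatever function makes the request, e.g. `retry_with_backoff(lambda: my_api_call())`, where `my_api_call` is your own code. The defaults (5 attempts, 1 s base, 30 s cap) are arbitrary starting points, not recommended values.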
I run Assistants in a Canadian location (Fly.io) and haven’t experienced any slowdowns. So it may be worthwhile to set up another instance in a different location?
We have had a lot of issues and downtime with the Assistant/VectorStore APIs.
I guess we will have to do all the RAG stuff (chunking, embedding, vector DB) manually/locally after all and only use OpenAI for the final inference until they get the Assistant API out of Beta and actually working reliably…
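For anyone weighing the same move, the local part (chunking, embedding, similarity search) can be sketched in a few lines. This is only an illustration under loud assumptions: `toy_embed` is a bag-of-words placeholder I made up so the example runs without any service, and a real setup would swap in an actual embedding model and a vector database. All function names here are hypothetical.

```python
import math
from collections import Counter


def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping character windows of at most `size` chars."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def toy_embed(text):
    """Toy word-count 'embedding' (assumption: replace with a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_k(query, chunks, embed=toy_embed, k=3):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

The chunks returned by `top_k` would then be pasted into the prompt for the final inference call, which is the only step that still needs OpenAI.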