Assistants response time 6-10s

Hi there, we have been using OpenAssistants for two months now, and the response time has been painfully slow. We’re not streaming, and each response is about 100-300 tokens.

We tested by calling the Assistant from a Lambda function and then directly from the frontend. In both cases, the response took 6-7 seconds.
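For anyone who wants to compare numbers, here is a minimal sketch of how this kind of end-to-end timing could be taken. It is not the exact test described above: `fake_assistant_call` is a hypothetical stand-in that just sleeps, and it would need to be replaced with the real blocking request to the Assistant.

```python
import statistics
import time
from typing import Callable, List


def measure_latency(call: Callable[[], object], runs: int = 5) -> List[float]:
    """Return the wall-clock seconds taken by each of `runs` invocations of `call`."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        call()  # blocks until the full (non-streamed) response is back
        timings.append(time.perf_counter() - start)
    return timings


def fake_assistant_call() -> str:
    # Hypothetical stand-in: replace with the real request to the Assistant.
    time.sleep(0.01)
    return "response"


if __name__ == "__main__":
    timings = measure_latency(fake_assistant_call, runs=5)
    print(
        f"min={min(timings):.3f}s "
        f"median={statistics.median(timings):.3f}s "
        f"max={max(timings):.3f}s"
    )
```

Running the same helper from both the Lambda and the frontend environment would make the two measurements directly comparable, and reporting the median over several runs avoids judging from a single slow call.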

Our prompts are not that big, about 400 tokens.

In other use cases, we have seen responses take up to 15 seconds! For folks who are using Assistants, what are your typical response times?

Is there anything you have done that lowered response time?