The Assistants API has been really helpful for getting our development started quickly. We would not have been able to get an MVP out this fast without it. Unfortunately, the time to first token is just too slow. We are seeing around 1500 to 2500 ms of added latency compared to equivalent requests to the chat API. I’m really hoping this improves by the time Assistants comes out of beta. As it stands, we’ll have to switch to the chat API and handle persistent threads and code interpreter ourselves.
Just wanted to give my honest feedback. It’s a great API. It just needs to be faster.