I am using Assistants v2, and the run takes too long to complete. I want to stream messages to the end user, ChatGPT-style, but it is currently so slow that users often wait more than 30 seconds for a response.
Experiencing similar behavior. With an assistant in the Playground, performance is solid, but via the API every call is very slow: we see about 50-60% fewer chunks per second than an identical call to the Chat Completions endpoint.
This is a bare-bones assistant with no tools, no files, and the same GPT-4o model.