Assistants API too slow for realtime/production?

I am not experiencing the same kind of delays that you are - completions start streaming (streaming support was a big addition) with latency largely indistinguishable from a direct call to the API.

A much bigger issue for me has been the growth of the prompt context, where the assistant would consume up to the model's max tokens as its context window - the new API update seems to provide some coarse-grained tools to help with that, so I am now testing with those in place to see if I can bring the costs in line.
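
For anyone looking for the context-limiting tools mentioned above, here is a minimal sketch of per-run limits. It assumes the v2 run parameters `max_prompt_tokens`, `max_completion_tokens`, and `truncation_strategy`; the helper name `build_run_limits` and the default numbers are hypothetical, chosen only for illustration, so check the current API reference before relying on them.

```python
def build_run_limits(max_prompt_tokens=4000, max_completion_tokens=1000, last_messages=10):
    """Return keyword arguments that cap token usage for a single run.

    Hypothetical helper: the parameter names follow the Assistants API v2
    run options as I understand them; the default values are arbitrary.
    """
    if last_messages < 1:
        raise ValueError("last_messages must be at least 1")
    return {
        # Upper bound on prompt (context) tokens used across the run.
        "max_prompt_tokens": max_prompt_tokens,
        # Upper bound on tokens the model may generate across the run.
        "max_completion_tokens": max_completion_tokens,
        # Only send the N most recent thread messages as context.
        "truncation_strategy": {
            "type": "last_messages",
            "last_messages": last_messages,
        },
    }


# Usage (assumes an `openai` client plus existing thread and assistant IDs):
# run = client.beta.threads.runs.create(
#     thread_id=thread_id,
#     assistant_id=assistant_id,
#     **build_run_limits(max_prompt_tokens=2000, last_messages=5),
# )
```

Capping the prompt tokens and truncating to the last few messages trades some conversational memory for lower, more predictable cost per run.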

I can confirm. Same problem here.

I am very curious about the traffic here. For most of the morning and afternoon (IST), response times from the Assistants API (with function calling) average 4 to 8 seconds. But when I send the same queries in the evening (IST), response times shoot up to 40-60 seconds on average (sometimes even without a function call).

Not sure why. Maybe people in other time zones are starting their day and requests keep piling up?
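
One way to check the time-of-day pattern is to log each request's latency together with the IST hour it was sent and then compare hourly averages. The sketch below is only that, a sketch: `timed_call` and `mean_latency_by_hour` are hypothetical helper names, and `fn` stands in for whatever code sends the request and waits for the response.

```python
import time
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean

# India Standard Time is a fixed UTC+05:30 offset (no daylight saving).
IST = timezone(timedelta(hours=5, minutes=30))


def timed_call(fn, log, now=None):
    """Run fn(), append (IST hour, elapsed seconds) to log, return fn's result.

    `now` lets a caller supply a timezone-aware timestamp; by default the
    current time is used.
    """
    started_at = now if now is not None else datetime.now(IST)
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    log.append((started_at.astimezone(IST).hour, elapsed))
    return result


def mean_latency_by_hour(log):
    """Group logged latencies by IST hour and return the mean for each hour."""
    buckets = defaultdict(list)
    for hour, seconds in log:
        buckets[hour].append(seconds)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Usage: wrap the real request in a function and collect samples over a day.
# log = []
# timed_call(lambda: run_assistant_query("same test query"), log)  # hypothetical
# print(mean_latency_by_hour(log))
```

Sending the same query at fixed intervals for a few days would show whether the evening slowdown is consistent or just occasional spikes.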