Assistants API too slow for realtime/production?

ncyoung · April 17, 2024, 2:05pm

I am not seeing the kind of delays you describe. Completions start streaming (streaming support was a big addition) with latency largely indistinguishable from a direct call to the API.

A much bigger issue for me has been the growth of the prompt context: the assistant would consume up to the model's maximum tokens as its context window. The new API update seems to provide some coarse-grained tools to help with that, so I am now testing with those in place to see if I can bring the costs in line.
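For anyone wanting to try the same thing, here is a rough sketch of how those run-level limits could be assembled. The parameter names (`max_prompt_tokens`, `max_completion_tokens`, `truncation_strategy`) are what I understand the updated Assistants API to accept when creating a run; treat them as assumptions and check the current API reference. The helper `build_run_limits` and the numbers are hypothetical, purely for illustration.

```python
from typing import Any, Dict, Optional


def build_run_limits(
    max_prompt_tokens: Optional[int] = None,
    max_completion_tokens: Optional[int] = None,
    last_messages: Optional[int] = None,
) -> Dict[str, Any]:
    """Collect optional run-level limits into keyword arguments.

    Hypothetical helper: the key names are assumed to match the run-creation
    parameters of the Assistants API; verify them against the API reference.
    """
    kwargs: Dict[str, Any] = {}
    if max_prompt_tokens is not None:
        # Assumed: caps the prompt (context) tokens used across the run.
        kwargs["max_prompt_tokens"] = max_prompt_tokens
    if max_completion_tokens is not None:
        # Assumed: caps the completion tokens generated across the run.
        kwargs["max_completion_tokens"] = max_completion_tokens
    if last_messages is not None:
        # Assumed: keep only the N most recent thread messages in context.
        kwargs["truncation_strategy"] = {
            "type": "last_messages",
            "last_messages": last_messages,
        }
    return kwargs


# Illustrative values only; tune them to your own cost and quality targets.
limits = build_run_limits(
    max_prompt_tokens=4000,
    max_completion_tokens=500,
    last_messages=6,
)

# Assumed usage with the openai Python SDK (not executed here):
# run = client.beta.threads.runs.create(
#     thread_id=thread.id, assistant_id=assistant.id, **limits
# )
```

Keeping the limits in a plain dict makes it easy to compare costs with and without them by passing `**limits` or nothing at all.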

CS1234293429 · August 27, 2024, 6:51am

I can confirm. Same problem here.

mano1 · September 16, 2024, 1:32pm

I am very curious about the traffic here. During most of the morning and afternoon (IST), response times for the Assistants API (with function calling) average 4 to 8 seconds. But when I send the same queries in the evening (IST), response times shoot up to 40-60 seconds on average (sometimes even without a function call).

Not sure why. Do requests pile up as people in other time zones start their day?
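To check whether the slowdown really tracks time of day, one option is to log each request's start time and duration and then average by hour. Below is a minimal sketch using only the standard library. The names `timed_call` and `mean_latency_by_hour` are hypothetical helpers, and `fn` stands for whatever function you use to send a query and wait for the run to finish.

```python
import time
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean

# IST is UTC+05:30; a fixed offset avoids depending on a tz database.
IST = timezone(timedelta(hours=5, minutes=30))


def timed_call(fn, *args, **kwargs):
    """Call fn and return (result, start time in IST, elapsed seconds)."""
    started_at = datetime.now(IST)
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, started_at, elapsed


def mean_latency_by_hour(samples):
    """Average latency per IST hour of day.

    samples: iterable of (timezone-aware datetime, seconds) pairs.
    Returns a dict mapping hour (0-23) to mean latency in seconds.
    """
    buckets = defaultdict(list)
    for started_at, seconds in samples:
        buckets[started_at.astimezone(IST).hour].append(seconds)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}
```

Collect the `(started_at, elapsed)` pairs from `timed_call` over a few days, pass them to `mean_latency_by_hour`, and compare the morning and evening buckets.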
