Alternative to Assistant or how to reduce response time?

jblake · July 15, 2024, 5:43pm

Thank you for the suggestion, but the problem is the time from the request to the first delta delivery when streaming, since the assistant sends all of the history with every request.

For example, a new thread will return a message in 0.5 seconds, but a thread with 10 messages and 8k of text will take 20+ seconds, even if the current question is only a sentence.
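For anyone who wants to reproduce the measurement, here is a rough sketch of timing the gap between sending the request and receiving the first delta. It is not tied to any SDK: `time_to_first_delta` and `fake_stream` are hypothetical names made up for this post, and `fake_stream` only simulates a delayed stream. In real use you would pass in whatever iterator of text deltas your streaming client gives you.

```python
import time
from typing import Callable, Iterable, Iterator, List, Optional, Tuple


def time_to_first_delta(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], str]:
    """Consume a stream of text deltas.

    Returns (seconds until the first delta arrived, full concatenated text).
    The first element is None if the stream produced no deltas at all.
    """
    start = clock()
    first_delta_at: Optional[float] = None
    parts: List[str] = []
    for delta in stream:
        if first_delta_at is None:
            # Record latency only once, on the very first delta.
            first_delta_at = clock() - start
        parts.append(delta)
    return first_delta_at, "".join(parts)


def fake_stream(delay: float, deltas: List[str]) -> Iterator[str]:
    """Hypothetical stand-in for a streaming run: waits, then yields deltas."""
    time.sleep(delay)  # simulates the server-side wait before the first token
    yield from deltas


if __name__ == "__main__":
    latency, text = time_to_first_delta(fake_stream(0.2, ["Hel", "lo"]))
    print(f"first delta after {latency:.2f}s, text={text!r}")
```

Running the same timing against a fresh thread and a long thread should make the difference described above easy to compare.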

https://platform.openai.com/docs/api-reference/runs
