Is the Assistants API really suited to building efficient chatbots?

Has anyone created a chatbot with this? Because it is really VERY slow. Even for a simple message with an Assistant without any configuration, testing from the playground, it can take up to 10 s to reply. I’ve read in some forum posts that many people recommend using Completions instead, and I think that’s what I’m going to go for. Or is there any way to improve the response time with the Assistants API?


Hi, and welcome to the community!

The Assistants API has never been the fastest, but many prefer it for its ease of use in managing conversation history, file uploads, a knowledge base, and tools like the code interpreter.

That’s the main advantage of using the Assistants API—it enables quick and easy prototyping. If your use case isn’t time-sensitive, it may be a sufficiently good option.
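To see why each turn is slow, it helps to look at the shape of an Assistants API call. The sketch below assumes the openai Python SDK v1.x with its beta endpoints and is not run here; the networked steps are shown for structure only, and `ask_assistant` / `latest_assistant_text` are hypothetical helper names. Each numbered step is a separate HTTP round-trip, which is one reason latency accumulates.

```python
import time


def ask_assistant(client, assistant_id: str, question: str):
    """One question/answer turn against the Assistants API (sketch only,
    assumes the openai Python SDK v1.x beta endpoints; not executed here)."""
    thread = client.beta.threads.create()                       # 1. new thread
    client.beta.threads.messages.create(                        # 2. add message
        thread_id=thread.id, role="user", content=question)
    run = client.beta.threads.runs.create(                      # 3. start run
        thread_id=thread.id, assistant_id=assistant_id)
    while run.status in ("queued", "in_progress"):              # 4. poll
        time.sleep(0.5)
        run = client.beta.threads.runs.retrieve(
            thread_id=thread.id, run_id=run.id)
    return client.beta.threads.messages.list(thread_id=thread.id)


def latest_assistant_text(messages) -> str:
    """Pick out the newest assistant reply from a newest-first list of
    message dicts (a simplified plain-dict shape, for illustration)."""
    for m in messages:
        if m["role"] == "assistant":
            return m["content"]
    return ""
```

The upside is that the thread object stores the conversation for you; the downside is the create/poll round-trips shown above happen on every turn.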

On the other hand, using the Completions API requires you to build all the necessary functionality yourself. As you’ve already discovered, it typically has lower latency and offers more flexibility to optimize overall system performance.
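"Building the functionality yourself" mostly means managing conversation state. A minimal sketch, assuming the openai Python SDK and an assumed model name; `trim_history` and `chat` are hypothetical helpers, and the API call is shown for shape only, not executed here:

```python
def trim_history(messages, max_turns=10):
    """Keep the system message plus only the most recent turns, so the
    prompt (and cost/latency) does not grow without bound."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_turns:]


def chat(client, messages, user_text):
    """One chatbot turn against the Chat Completions API (sketch only)."""
    messages.append({"role": "user", "content": user_text})
    resp = client.chat.completions.create(
        model="gpt-4o-mini",              # assumed model name
        messages=trim_history(messages),
    )
    reply = resp.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply
```

Because you own the history list, you can trim, summarise, or cache it however you like, which is exactly the flexibility the Assistants API trades away.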

I also want to mention that OpenAI is still actively developing the Assistants API. Personally, I recommend using the Completions API, but that’s just my opinion.


Vb has already answered your question.

The Assistants API is in beta and primarily designed for retrieval-augmented generation (RAG), which involves multiple steps such as query optimisation, retrieval, reranking, and response generation, plus tool/function calls, leading to an average latency of 5-10 seconds.
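The stages above can be sketched as a toy, self-contained pipeline. This is a deliberately simplified stand-in (keyword overlap instead of embeddings, and the generation step stubbed out); all function names are hypothetical. The point is that each stage is a separate step, and in a real system most of them are network or model calls, so their latencies add up.

```python
def optimise_query(query: str) -> list[str]:
    # Stage 1: "query optimisation" reduced here to keyword extraction.
    stopwords = {"the", "a", "an", "is", "how", "do", "i"}
    return [w for w in query.lower().split() if w not in stopwords]


def retrieve(keywords, docs, k=3):
    # Stage 2: retrieval by keyword-overlap score (stand-in for vector search).
    scored = [(sum(w in d.lower() for w in keywords), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def rerank(keywords, candidates):
    # Stage 3: rerank the candidates (trivially, by the same score again;
    # a real system would call a cross-encoder or reranking model here).
    return sorted(candidates, key=lambda d: -sum(w in d.lower() for w in keywords))


def answer(query, docs):
    # Stage 4: generation is stubbed; a real pipeline would call the LLM
    # here, adding one more round-trip on top of the stages above.
    keywords = optimise_query(query)
    context = rerank(keywords, retrieve(keywords, docs))
    return context[0] if context else "no match"
```

Swap any stage for a model call and you can see where the 5-10 seconds come from.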

If speed is the priority, use the Chat Completions API with a fast model instead of an Assistant: limit intermediate steps, enable streaming, keep the persona prompt concise, and cap the output tokens.
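Streaming does not make generation faster, but it makes the wait feel shorter, and the metric worth watching is time to first token. A small sketch of a timing wrapper (a hypothetical helper, works over any iterator) and, in a comment, the kind of streaming call it would wrap, assuming the openai Python SDK:

```python
import time


def time_to_first_chunk(stream):
    """Consume any iterator of chunks, returning (seconds until the first
    chunk arrived, list of all chunks). Returns (None, []) if the stream
    was empty. Useful for measuring perceived streaming latency."""
    start = time.monotonic()
    first = None
    chunks = []
    for chunk in stream:
        if first is None:
            first = time.monotonic() - start
        chunks.append(chunk)
    return first, chunks


# Example of the stream this would wrap (not executed here):
# stream = client.chat.completions.create(
#     model="gpt-4o-mini",          # assumed model name
#     messages=[{"role": "user", "content": "hi"}],
#     stream=True,
#     max_tokens=256,               # capping output also bounds total time
# )
# ttft, chunks = time_to_first_chunk(stream)
```

If time to first token is low, streaming will keep the chat feeling responsive even when the full reply takes several seconds.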

However, the Assistants API simplifies performing RAG and tools/functions, making it useful for quick prototyping.

If your use case isn’t time-sensitive, it can still be a decent option.


Agreed. I’m timing requests, and it takes 2-4 seconds just to queue up the run. Unacceptable. Streaming at least makes it nearly bearable, but any chatbot built on it crawls along until the user quits out of boredom.

This topic was automatically closed 2 days after the last reply. New replies are no longer allowed.