Error while using Assistants API

Basically, yes. Unlike ChatGPT, where the conversation is clipped so aggressively that people complain it doesn’t remember anything, assistants will keep growing the conversation up to the model’s maximum context as you continue to chat: 128k tokens.
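If you manage the chat history yourself, you can clip it before each call. Here is a minimal sketch of that idea, not anything the Assistants API does. The names `estimate_tokens` and `clip_history` are made up for illustration, and the "about 4 characters per token" rule is only a rough assumption; a real tokenizer would be more accurate.

```python
def estimate_tokens(text):
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def clip_history(messages, max_tokens):
    """Keep the newest messages whose estimated total fits in max_tokens.

    `messages` is a list of dicts with a "content" string, oldest first.
    """
    kept = []
    used = 0
    # Walk backwards so the most recent turns are kept first.
    for message in reversed(messages):
        cost = estimate_tokens(message["content"])
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore oldest-first order
    return kept
```

With a budget like this, the cost of each call stays bounded no matter how long the chat runs, at the price of the model forgetting the oldest turns.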

Also, when running your own vector database, for example with 1MB of your company’s tech support knowledge base and product offerings, you might have a threshold where only the top 5 chunks are fed to the AI, and only if they meet a semantic similarity threshold. Not the case with assistants - if you ask “how’s your day going”, the AI gets maximum retrieval placed into the context window.
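To make the "top 5 chunks, only above a threshold" idea concrete, here is a rough sketch of what you might write for your own retrieval. It assumes you already have embeddings as plain lists of floats; `select_chunks`, `top_k`, and `min_similarity` are hypothetical names, and the 0.75 cutoff is an arbitrary example value, not a recommendation.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as unrelated
    return dot / (norm_a * norm_b)


def select_chunks(query_vec, chunks, top_k=5, min_similarity=0.75):
    """Return up to top_k chunk texts whose similarity meets the threshold.

    `chunks` is a list of (text, embedding) pairs.
    """
    scored = [(cosine_similarity(query_vec, emb), text) for text, emb in chunks]
    # Drop anything below the threshold before ranking.
    relevant = [pair for pair in scored if pair[0] >= min_similarity]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in relevant[:top_k]]
```

A small-talk query that matches nothing in the knowledge base returns an empty list here, so no retrieval tokens get added to the prompt at all.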

Those are prices and anecdotes taken right from the forum: the AI looping until it hits your API rate limit, so you get no answer, or the AI looping and calling your API over and over with the same query.

Until they offer transparency about billing and real-time per-call token usage, and allow controls over data and iterations similar to what a reasonable person might program themselves, I would have to say “program it yourself”.
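For the kind of iteration control I mean, something like the following sketch would do. It is only an illustration under assumptions: `step` is a hypothetical callable standing in for one model call that you write yourself, returning a dict with `"answer"`, `"tool_query"`, and `"tokens"` keys. None of these names come from any real API.

```python
def run_tool_loop(step, max_iterations=5, max_total_tokens=20_000):
    """Drive a model/tool loop with hard caps on iterations and tokens.

    `step` (hypothetical) returns a dict:
      {"answer": str or None, "tool_query": str or None, "tokens": int}
    """
    seen_queries = set()
    total_tokens = 0
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        result = step()
        total_tokens += result["tokens"]  # per-call usage, tracked by you
        if result["answer"] is not None:
            status = "done"
            break
        if total_tokens >= max_total_tokens:
            status = "token_budget_exceeded"
            break
        query = result["tool_query"]
        if query in seen_queries:
            # Same query again: stop instead of looping into the rate limit.
            status = "repeated_query"
            break
        seen_queries.add(query)
    else:
        status = "max_iterations"
    return {
        "status": status,
        "answer": result["answer"] if status == "done" else None,
        "iterations": iteration,
        "tokens": total_tokens,
    }
```

The point is that every exit path reports how many calls and tokens were spent, so a runaway loop costs you a handful of calls instead of your whole rate limit.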
