I’m increasingly concerned about the future (or lack thereof) for the assistants API. However, I’ve been building a platform based on it. At the center of the platform is a variety of uniquely tailored (personality and knowledge as defined in their instructions) assistants.
If you had to recommend and alternative to Assistants API what would it be? I fear I may need to begin working on a substitute as we near soft launch readiness.
Thanks, @javidd . I’m leaning that way. We’ve already belt a platform that does much of the wrapper functionality we need. Have you seen a good summary of the gaps between Completions and Assistants API?
With assistants - you can’t manage context usage (for memory). After 4-5 messages you will see that it uses between 9000-12000 tokens on each answer
Api works slowly if you want to save messages in your own database. And, in the end you have save them in your db. Otherwise you have to do api call to get threads (which are chats), messages etc. it takes time. Users don’t like waiting.
May be you want to use other models in the future, for example claude, grok, gemini etc. Then you will not be able to use them with openai assistants
Good part - you can send files to assistant and read them.
With completions:
You can manage context usage
It will not be slow,
You will be able to use other AI models. You will just need to connect their API
For file reading - you will need to implement it yourself. For example in our project we can read pdf, doc, txt, excel, csv, pptx documents. We have even added functionality sending audio messages to gpt.
So by using completions api you have more “freedom”. It just takes a little more time rather than using assistants API.
Thanks for the excellent summary. I feel foolish but what’s the benefit to Assistants then? I went down that path and now I’m regretting it. With completions you give the instructions to which can differentiate your “personalities” and knowledge right?
Of course, you can give instructions as system prompt. Using Assistants you may not need to implement saving chats, messages in your database. because implementing that functionality to save them, to save separate chats etc. takes some time. some message types are text, some image etc.
Just from speaking from my experience – I had built my own setup prior to Assistants API – and there really is no comparison. Not because my previous implementation was somehow hampered, but because the Assistants API is that good.
You’ll regret going with something else when next month they release the next version of Assistants that fixes everything and probably has some insanely good RAG/CMS thing attached to it.
Yes, there is a “black box” that eventually gets the token context up to 90k-128k tokens but the end result is so much better. My users are extremely happy with the “memory” aspect of my app, often comparing it to Claude or Chat, and saying that the others don’t come close to “remembering” their story–which just means that its insanely good at drawing up context from the entire conversation.
Storing the messages in the DB is inconsequential if you make the call after the response has streamed. It takes a split second and you can do it while they’re reading the previous response, or starting to type the next and then just store that.
Just know that no matter which direction you go, everything will change in less than two months and then you’ll have to start all over again.
I noticed that during 12 days of OpenAI there were no updates to the Assistants API.
Personally, I grew really frustrated by the limitations of the Assistants API, of which there are many. The final straw for me was when I reached out to OpenAI Support for a major problem and the AI-based responses were full of hallucinations suggesting changes to options for features that did not exist in the Assistants API.
If you have not run into limitations or unresolved problems of the Assistants API, you’re lucky but you may eventually. An OpenAI vector store supports a maximum of 10,000 files, only supports a single vector store per assistant, and has very limited model support. Compound all of that with the black-box nature of the service and the lack of support from OpenAI makes for a poor experience.
So I reimplemented OpenAI Assistants API wire-compatibility in my own service. You get a much more scalable solution, can query multiple vector stores in parallel, use nearly every model under the sun with the existing Assistants SDK and you will get direct support from engineering.
It works with the existing OpenAI SDKs, supports function calling, streaming responses, annotations, choice of embedding models, cosine/eucliean/dot-product search options, just to name a few.
The service is called Ragwalla. Feel free to ping me if I can help.
Can’t run a business on “just wait two more weeks bro”. Assistants API is obviously not a high priority item for OpenAI. Not sure what happened with your implementation, but I’ve posted benchmarks in other threads comparing the Assistant API to Completions. Even with a fresh thread that has 0 messages, it runs at half the speed of a completion at best, and usually about 30% as fast.
Also would like to know if assistant API is ever leaving beta? It’s great that o3 is accessible on it and it feels like it getting somewhat more stable lately but for five-six months now it’s been a mess and it makes me wonder if the assistant API is a future priority for OpenAI or a forgotten project? Would be good with some guidance and clarity so we know whether it’s worth even considering as an alternative or if we should go with other options.
I also believe the Assistants API is a great idea and OpenAI has the potential to develop managed services for memory, RAG, and code sandbox better than I could on my own—not to mention the opportunity cost involved. I just wish there had been more progress.
Hi Guys.. This thread is old .. sorry to ask something!!.. but what happens with GPT-5 and above.. OpenAI does not support assistant API for GPT-5 and above.. it is being abolished. any idea?
Hey @ashkansamimi . @javidd is right - assistants is going away. It simplified things for many uses, but at this point, you need to make the change. Here’s what we did in nutshell:
Moved to completions
Track and manage sessions for consistency
Put the platform-level instructions at the System instructions level
Put what was the Assistant Instructions at the Developer level instructions
User requests / input go to the user level instructions
Save chat history locally and upload it if needed and / or if session id is missing
Hi @javidd.. No .. you cannot use assistance API with GPT-5 and above. You need to send the entire change history as prompt each time, if you want keep it in a thread. Assistant API “is” not perfect (is in quotes, because it still works in GPT-4x) .. but at least you could keep track of the context and get back to it every time..
Thanks @blichtenwalner .. how technically it would work without assistant API is clear to me.. The question here is , why openai would ditch something working rather good (not perfect!!) and force us to come back with individual solutions!!?? god knows, what the next thing would be that would force you to change your solution. we need consistent and overall solutions, not custom made.. My opinion by the way
actually assistants api is not good and is not “agentic“. but responses api is agentic. there is no meaning to keep it. so they have to deprecate it.
I did not use assistants api to be honest. Instead I implemented many things myself