Has anyone switched from the Assistants API to Chat Completions and can share their experience here?
I am considering making the switch, but it seems I would need to reimplement threads, runs, context-window management, and possibly many other mechanisms myself…
Any insights or recommendations on how to deal with this?
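For context, here is roughly what I think I would have to rebuild myself - a minimal sketch in Python, where `history`, `MAX_HISTORY_MESSAGES`, and the trimming strategy are just placeholders I made up to stand in for threads and context-window handling:

```python
from openai import OpenAI

client = OpenAI()

MAX_HISTORY_MESSAGES = 20  # naive context-window control: keep only the last N turns
history = [{"role": "system", "content": "You are a helpful assistant."}]

def ask(user_message: str) -> str:
    # What a "thread" becomes: a message list I maintain and trim myself
    history.append({"role": "user", "content": user_message})
    trimmed = [history[0]] + history[1:][-MAX_HISTORY_MESSAGES:]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=trimmed,
        temperature=0.7,
    )
    answer = response.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer
```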
I like the simplicity of the Assistants API, but it is still in beta and there are several drawbacks making me consider switching to Chat Completions or other LLM solutions until it improves:
- Too slow for chat apps (where users expect an instant answer)
- Instability, including critical bugs that take several hours to resolve, leaving my app completely unusable
- Not being able to use all the params available with Chat Completions (frequency penalty, max tokens, presence penalty, response format, temperature, …) - see the sketch after this list
- Not being able to use fine-tuned models in the Assistants API, which makes my assistant answer like every other non-fine-tuned assistant, always “delving deeper into the tapestry of” something, even after experimenting with a lot of different prompt instructions
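To make the last two points concrete, this is the kind of per-request control I mean - a rough sketch using standard Chat Completions parameters (the fine-tuned model id in the comment is a made-up example):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    # or a fine-tuned model, e.g. "ft:gpt-3.5-turbo-0125:my-org::abc123" (made-up id)
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this support ticket as JSON: ..."}],
    temperature=0.4,
    max_tokens=500,
    frequency_penalty=0.3,
    presence_penalty=0.2,
    response_format={"type": "json_object"},  # JSON mode; the prompt must mention JSON
)
print(response.choices[0].message.content)
```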
Insights, feedback, and recommendations appreciated!