Chatbot Assistant Implementation Feedback

I would love some input on my chatbot solution.

Context: web application for ecommerce (React/SpringBoot/Hibernate)

I currently have a working assistant with Retrieval that receives a file upon creation. The file holds the content of around 500 FAQ questions. I have tried using embeddings successfully, and I am aware that I can optimize this a lot, but I would prefer to use Assistants if at all possible.

The assistant implementation will probably be put inside a Docker container and run indefinitely. My backend will route all calls through this container.

I am aware that there is currently a 60 req/min limit, but I hope this shouldn't be a big issue, since I am not expecting many concurrent users (10 at most). However, I am unsure of what happens when the limit is reached.

Is the current beta state of the Assistant unsuited for my purposes?

Regarding pricing, the docs say I will be charged for every new assistant instance (Retrieval is priced at $0.20/GB per assistant per day). So I am hoping that keeping one instance running per 24-hour period, rather than creating a new assistant per request, will keep my costs low, but I cannot verify whether this is exactly the case or whether I am missing something.


Welcome to the community!

I’m thinking of it like this:

  1. Use a custom GPT to prototype a basic idea.
  2. Use Assistants if you want to grant access to non-ChatGPT-Plus subscribers.
  3. Use completions and embeddings for a larger production app where you want to control costs.

If an outage or deprecation of the Assistants API isn't really a showstopper for you, I would say it's fine. And given that Azure has Assistants now, you can always keep that as a backup for business continuity.

You'd get a 429 error. It's not the worst idea to put all requests into a queue and retry with exponential backoff if you run into a 429. If you wanna be super fancy, you can try to parse the error message to figure out how long you need to wait, but there's no guarantee of that format staying consistent.
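A minimal sketch of that backoff idea, in plain Python with no OpenAI client. `RateLimitError` here is a stand-in for whatever 429 exception your HTTP client or SDK actually raises; swap it for the real one in your stack:

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for the exception your client raises on HTTP 429."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on rate-limit errors with exponential backoff.

    Waits base_delay, 2*base_delay, 4*base_delay, ... plus random jitter,
    so queued requests don't all retry in lockstep. Re-raises after the
    final attempt fails.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

With only ~10 concurrent users, a single worker draining a queue through something like this should keep you comfortably under 60 req/min.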

You will also be charged for context tokens and generated tokens. A lot of people are surprised by how many context tokens Assistants eat.
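A back-of-the-envelope sketch of why that matters. Every number below except the $0.20/GB/day retrieval rate quoted above is a placeholder assumption (file size, traffic, token prices) for you to replace with your real figures:

```python
# All of these are assumptions except RETRIEVAL_PER_GB_PER_DAY, which is
# the rate quoted in the thread. Check the current pricing page for real
# token prices before relying on this.
RETRIEVAL_PER_GB_PER_DAY = 0.20   # per assistant, from the docs quoted above
FILE_GB = 0.001                   # assumed: a 500-FAQ file of roughly 1 MB
ASSISTANTS = 1                    # one long-lived assistant, not one per user

INPUT_PER_1K = 0.01               # assumed input-token price, $/1K tokens
OUTPUT_PER_1K = 0.03              # assumed output-token price, $/1K tokens

requests_per_day = 500            # assumed traffic
context_tokens = 4000             # assumed: retrieval can inject a lot of context
output_tokens = 300               # assumed average reply length

retrieval_monthly = RETRIEVAL_PER_GB_PER_DAY * FILE_GB * ASSISTANTS * 30
token_monthly = requests_per_day * 30 * (
    context_tokens / 1000 * INPUT_PER_1K + output_tokens / 1000 * OUTPUT_PER_1K
)

print(f"retrieval: ${retrieval_monthly:.2f}/mo, tokens: ${token_monthly:.2f}/mo")
```

Under these assumptions the retrieval storage fee is fractions of a cent per month; the token bill dwarfs it, which is why the context tokens are what usually surprise people.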

Overall I'd say this: Assistants can save you a lot of development time, but they do generate significant operating costs. The answer to whether that's worth it is always "it depends" lol :laughing:


Thanks for the thorough response! I will have a discussion about the pricing at work, and for now keep working on optimizing a more conventional approach (and perhaps wait until Assistants is out of beta).
