Chatbot Assistant Implementation Feedback

I would love some input on my chatbot solution.

Context: web application for ecommerce (React/SpringBoot/Hibernate)

I currently have a working assistant with Retrieval that receives a file upon creation. The file holds the content of around 500 FAQ questions. I have tried using embeddings successfully, and I am aware that I can optimize this a lot, but I would prefer to use Assistants if at all possible.

The assistant implementation will probably be put inside a Docker container and run indefinitely. My backend will route all calls through this container.

I am aware that there is currently a 60 req/min limit, but I hope this shouldn't be a big issue, since I am not expecting many concurrent users (10 at most). However, I am unsure of what happens when the limit is reached.

Is the current beta state of the Assistant unsuited for my purposes?

Regarding pricing, the docs say I will be charged for every new assistant instance (Retrieval is priced at $0.20/GB per assistant per day). So I am hoping that keeping one instance running per 24-hour period, rather than creating a new assistant per request, will keep my costs low, but I cannot verify whether this is exactly the case or whether I am missing something.


Welcome to the community!

I’m thinking of it like this:

  1. Use a custom GPT to prototype a basic idea.
  2. Use Assistants if you want to grant access to non-ChatGPT-Plus subscribers.
  3. Use completions and embeddings for a larger production app where you want to control costs.

If an outage or deprecation of the Assistants API isn't really a showstopper for you, I would say it's fine. And given that Azure has Assistants now, you can always keep that as a backup for business continuity.

You'd get a 429 error. It's not the worst idea to put all requests into a queue and retry with exponential backoff if you run into a 429. If you wanna be super fancy, you can try to parse the error message to figure out how long you need to wait, but there's no guarantee of that format staying consistent.
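A minimal sketch of that backoff idea, in plain Python with no OpenAI client. `RateLimitError` here is a stand-in for whatever 429 exception your HTTP client or SDK actually raises; swap it for the real one in your stack:

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for the exception your client raises on HTTP 429."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on rate-limit errors with exponential backoff.

    Waits base_delay, 2*base_delay, 4*base_delay, ... plus random jitter,
    so queued requests don't all retry in lockstep. Re-raises after the
    final attempt fails.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

With only ~10 concurrent users, a single worker draining a queue through something like this should keep you comfortably under 60 req/min.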

You will also be charged for context tokens and generated tokens. A lot of people are surprised by how many context tokens Assistants eat.
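A back-of-the-envelope sketch of why that matters. Every number below except the $0.20/GB/day retrieval rate quoted above is a placeholder assumption (file size, traffic, token prices) for you to replace with your real figures:

```python
# All of these are assumptions except RETRIEVAL_PER_GB_PER_DAY, which is
# the rate quoted in the thread. Check the current pricing page for real
# token prices before relying on this.
RETRIEVAL_PER_GB_PER_DAY = 0.20   # per assistant, from the docs quoted above
FILE_GB = 0.001                   # assumed: a 500-FAQ file of roughly 1 MB
ASSISTANTS = 1                    # one long-lived assistant, not one per user

INPUT_PER_1K = 0.01               # assumed input-token price, $/1K tokens
OUTPUT_PER_1K = 0.03              # assumed output-token price, $/1K tokens

requests_per_day = 500            # assumed traffic
context_tokens = 4000             # assumed: retrieval can inject a lot of context
output_tokens = 300               # assumed average reply length

retrieval_monthly = RETRIEVAL_PER_GB_PER_DAY * FILE_GB * ASSISTANTS * 30
token_monthly = requests_per_day * 30 * (
    context_tokens / 1000 * INPUT_PER_1K + output_tokens / 1000 * OUTPUT_PER_1K
)

print(f"retrieval: ${retrieval_monthly:.2f}/mo, tokens: ${token_monthly:.2f}/mo")
```

Under these assumptions the retrieval storage fee is fractions of a cent per month; the token bill dwarfs it, which is why the context tokens are what usually surprise people.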

Overall I'd say this: Assistants can save you a lot of development time, but they do generate significant operating costs. The answer to whether that's worth it is always "it depends" lol :laughing:


Thanks for the thorough response! I will have a discussion about the pricing at work, and for now keep working on optimizing a more conventional approach (and perhaps wait until Assistants is out of beta).
