Having a unique assistant for all my users

Hey folks,

I’ve already searched the posts here and saw lots of people experiencing the same issue.

What I’m trying to do

  • I have 100 users.
  • Each of my users has 1 MB of data on the platform (unique to them).
  • I want to let them chat about their own data.

What confuses me is the pricing and feasibility of using the Assistants API for that purpose.

Since each user only has 1 MB, what will the pricing be? The official retrieval pricing is $0.20 / GB / assistant / day.

So, if my calculation is correct, I’ll pay: 0.20 (price per GB) / 1024 (to scale down to 1 MB) * 100 (users/assistants) * dayCount, plus chat tokens.
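As a quick back-of-the-envelope sketch of that formula (same numbers as above; chat tokens left out as a separate line item):

```python
# Rough monthly storage estimate from the numbers above:
# $0.20 / GB / assistant / day, 1 MB per user, 100 users.
RETRIEVAL_PRICE_PER_GB_DAY = 0.20
FILE_SIZE_GB = 1 / 1024          # 1 MB expressed in GB
USERS = 100                      # one assistant per user
DAYS = 30

storage_cost = RETRIEVAL_PRICE_PER_GB_DAY * FILE_SIZE_GB * USERS * DAYS
print(f"~${storage_cost:.2f}/month for retrieval storage, plus chat tokens")
# -> ~$0.59/month for retrieval storage, plus chat tokens
```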

Does that sound right?

I’m also not sure about this approach. I saw some people mention attaching file_ids to threads instead of to the Assistant. Do you have any suggestions?

Yep, pretty spot on.

Attaching it to the Assistant would be my experience and recommendation.
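For one assistant per user, a rough sketch of that route (assuming the Assistants API shape with a retrieval tool and file_ids; names and parameters may differ between API versions):

```python
# Hypothetical sketch: one assistant per user, each with that user's ~1 MB
# file attached for retrieval.
from openai import OpenAI

client = OpenAI()

# Upload the user's data file for use with Assistants.
user_file = client.files.create(
    file=open("user_123_data.txt", "rb"),  # placeholder filename
    purpose="assistants",
)

# Create the per-user assistant with the file attached to the retrieval tool.
assistant = client.beta.assistants.create(
    name="Assistant for user 123",
    instructions="Answer questions using only this user's attached data.",
    model="gpt-4-turbo-preview",
    tools=[{"type": "retrieval"}],
    file_ids=[user_file.id],
)
```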


I would assume their internal chunking means the whole megabyte isn’t passed, only the ~1k tokens relevant to the prompt. But I don’t know.

The main point is that you stay in control if you add the message yourself.

Then you can form a much better estimate of what to expect with regard to performance and costs.

https://platform.openai.com/docs/api-reference/messages/createMessage
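A minimal sketch of what that can look like (the openai Python SDK is assumed; pick_relevant_chunk is just a toy stand-in for whatever retrieval you run on your side):

```python
# Do the retrieval yourself and put the chosen chunk into the message,
# so you know exactly how much context (and cost) goes to the model.
from openai import OpenAI

client = OpenAI()

def pick_relevant_chunk(user_data: str, question: str, chunk_size: int = 1000) -> str:
    """Toy stand-in for real retrieval: score fixed-size chunks by word overlap."""
    chunks = [user_data[i:i + chunk_size] for i in range(0, len(user_data), chunk_size)]
    words = set(question.lower().split())
    return max(chunks, key=lambda c: len(words & set(c.lower().split())))

def ask_about_own_data(thread_id: str, question: str, user_data: str) -> None:
    context = pick_relevant_chunk(user_data, question)
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=f"Use this context:\n{context}\n\nQuestion: {question}",
    )
```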


Good advice.

When I upload a file to a Thread once, will it always be available for ChatGPT to rely on?

If uploading a document to a Thread once makes ChatGPT perform like an Assistant (it doesn’t have to be identical), I can work with that.

Edit:
My ultimate goal is cutting costs while having a unique assistant for every user, so paying for chat tokens is OK for me. But what if I use Pinecone or another vector DB to store the embeddings? That would save the storage cost, which would be 0.20 * 100 * 30 = $600/mo.

Does that sound reasonable?

If you are using Pinecone, then you would not want to store the files on the Assistant. You would store the embeddings in Pinecone and, on each request, send the Assistant an updated instruction (or message) containing the chunks retrieved from Pinecone.

You would just pay Pinecone for their base package.
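A hypothetical sketch of that flow (method names from the current openai and pinecone Python SDKs; the index name is assumed and would need to match the embedding model’s dimension):

```python
# One Pinecone namespace per user: chunk + embed their data once, then at chat
# time query their namespace and paste the hits into the message/instructions.
from openai import OpenAI
from pinecone import Pinecone

client = OpenAI()
pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")   # placeholder key
index = pc.Index("user-data")                    # assumed pre-created index

def embed(text: str) -> list[float]:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding

def store_user_chunks(user_id: str, chunks: list[str]) -> None:
    index.upsert(
        vectors=[
            {"id": f"{user_id}-{i}", "values": embed(c), "metadata": {"text": c}}
            for i, c in enumerate(chunks)
        ],
        namespace=user_id,  # keeps each user's data separate
    )

def relevant_context(user_id: str, question: str, top_k: int = 3) -> str:
    hits = index.query(vector=embed(question), top_k=top_k,
                       include_metadata=True, namespace=user_id)
    return "\n".join(match.metadata["text"] for match in hits.matches)
```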
