We have a forum (similar to this one) with about the following stats:
- More than 300 million posts
- A few billion tokens in total
- More than 20 languages
We want to use GPT to answer questions like the following:
- what user X said about subject Y
- summarize the thread Z
- what is the sentiment of thread Z
- identify pain points and solutions of subject Z
I am not sure which approach is best for this. Should we convert everything to embeddings and query those? Or use the Assistants API with one single thread for the whole forum? Which one will give us the most flexibility at the lowest cost?
I would love to read your opinions.
and you have…
- consent at the time of message creation to license user content to submit to another company to produce new works, etc.
Obtaining text-embedding-ada-002 vectors costs $0.10 per megatoken (million tokens), or $100 per gigatoken (billion tokens), so a single embedding run over a forum of a “few billion tokens” costs a “few hundred dollars”.
Then you would need a strategy for individual posts that are longer than the embedding model’s context window, such as truncating them, or splitting them into chunks and averaging the chunk vectors.
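A minimal sketch of the chunk-and-average idea, under some assumptions: `toy_embed` is a hypothetical stand-in for a real embedding call, the token list is already produced by whatever tokenizer you use, and chunks are weighted by their length before the result is renormalized. That weighting is one reasonable choice, not the only one.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def chunk_tokens(tokens: Sequence[str], max_len: int) -> List[Sequence[str]]:
    """Split a token sequence into consecutive chunks of at most max_len tokens."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


def average_embedding(
    tokens: Sequence[str],
    embed: Callable[[Sequence[str]], Vector],
    max_len: int,
) -> Vector:
    """Embed each chunk, then return the length-weighted, unit-normalized mean."""
    if not tokens:
        raise ValueError("cannot embed an empty post")
    chunks = chunk_tokens(tokens, max_len)
    vectors = [embed(chunk) for chunk in chunks]
    weights = [len(chunk) for chunk in chunks]
    total = sum(weights)
    dim = len(vectors[0])
    mean = [
        sum(w * v[d] for w, v in zip(weights, vectors)) / total
        for d in range(dim)
    ]
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm else mean


def toy_embed(chunk: Sequence[str]) -> Vector:
    """Hypothetical 2-d 'embedding' for demonstration only; replace with a real model call."""
    a_count = sum(1 for t in chunk if t == "a")
    return [float(a_count), float(len(chunk) - a_count)]


# A 6-token "post" with a 3-token limit is split into two chunks.
post_vector = average_embedding(["a"] * 3 + ["b"] * 3, toy_embed, max_len=3)
print(post_vector)
```

Truncation is the simpler alternative: embed only `tokens[:max_len]` and accept that the tail of a long post is not represented.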
That just gives you a semantic similarity database. You’d be able to add your own metadata, like “is first post of thread” or “is reply to post x”. Or simply store the vector alongside every forum post for later use.
Embedding is the cheapest thing you can do. It could power a slow search function, or you could measure how similar posts are to a set of “happy posts” or “angry posts”, which works to some degree and is something you could experiment with. You could make only the last year searchable as a start.
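As a rough illustration of the “happy posts” vs. “angry posts” idea, here is a sketch that scores a post by its cosine similarity to the centroid of each reference set. The 2-d vectors are made-up toy data, and `sentiment_score` is a hypothetical helper, not an established method. Real embeddings would have many more dimensions and the signal would be noisier.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty set of vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def sentiment_score(post: Vector, happy: Sequence[Vector], angry: Sequence[Vector]) -> float:
    """Positive when the post is closer to the happy set, negative when closer to the angry set."""
    return cosine(post, centroid(happy)) - cosine(post, centroid(angry))


# Toy reference sets (assumed, hand-picked examples).
happy_posts = [[1.0, 0.0], [0.9, 0.1]]
angry_posts = [[0.0, 1.0], [0.1, 0.9]]

print(sentiment_score([0.8, 0.2], happy_posts, angry_posts))  # positive
print(sentiment_score([0.2, 0.8], happy_posts, angry_posts))  # negative
```

The same `cosine` function is what a brute-force search over stored post vectors would use, which is why such a search is cheap to build but slow at this scale without an index.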
Language model inference? Raise that to $10,000 per gigatoken for GPT-4-turbo input alone, plus days of processing because of both rate limits and generation time, depending on what you want.
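To put the two price points side by side, here is a back-of-the-envelope calculation. The per-token prices are the ones quoted in this post; the 3 billion token figure is only an assumed value for “a few billion”, so substitute your own count.

```python
def cost_usd(total_tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of processing total_tokens at a given price per million tokens."""
    return total_tokens / 1_000_000 * usd_per_million_tokens


FORUM_TOKENS = 3_000_000_000            # assumption: "a few billion" taken as 3 billion
EMBEDDING_USD_PER_MTOK = 0.10           # text-embedding-ada-002, as quoted above
GPT4_TURBO_INPUT_USD_PER_MTOK = 10.00   # $10,000 per gigatoken, as quoted above

embedding_cost = cost_usd(FORUM_TOKENS, EMBEDDING_USD_PER_MTOK)
gpt4_input_cost = cost_usd(FORUM_TOKENS, GPT4_TURBO_INPUT_USD_PER_MTOK)

print(f"Embedding run:      ${embedding_cost:,.0f}")
print(f"GPT-4-turbo input:  ${gpt4_input_cost:,.0f}")
print(f"Ratio:              {gpt4_input_cost / embedding_cost:.0f}x")
```

The factor of 100 between the two is why running every post through the language model is rarely the first step.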
Consider the tool you actually envision. “Summarize on demand” has a cost that grows with the number of sub-summaries needed on gpt-3.5-turbo-1106 (16k context), or costs about a dollar per button push for a 90k-token thread summary on GPT-4.
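To see how the number of sub-summaries grows, here is a sketch that counts the calls for a hierarchical summary: summarize each chunk, then summarize the summaries until one remains. The 12,000-token chunk budget (leaving headroom in a 16k window) and the 500-token summary size are assumptions for illustration, and `summary_plan` is a hypothetical helper.

```python
import math
from typing import Tuple


def summary_plan(thread_tokens: int, chunk_budget: int, summary_tokens: int) -> Tuple[int, int]:
    """Return (number of model calls, total input tokens) for a hierarchical summary.

    Each round splits the remaining text into chunks of at most chunk_budget
    tokens, and each chunk is assumed to yield a summary of summary_tokens tokens.
    """
    if thread_tokens <= 0:
        raise ValueError("thread_tokens must be positive")
    if summary_tokens * 2 > chunk_budget:
        # Guarantees every round at least halves the number of chunks.
        raise ValueError("summary_tokens must be at most half of chunk_budget")
    calls = 0
    input_tokens = 0
    remaining = thread_tokens
    while True:
        n_chunks = math.ceil(remaining / chunk_budget)
        calls += n_chunks
        input_tokens += remaining
        if n_chunks == 1:
            return calls, input_tokens
        remaining = n_chunks * summary_tokens


# A 90k-token thread split for a 16k-context model (assumed budgets).
calls, tokens_in = summary_plan(90_000, chunk_budget=12_000, summary_tokens=500)
print(calls, tokens_in)

# Single-pass alternative on a large-context model at the $10 per million
# input tokens quoted above: roughly the "dollar a button push".
single_pass_usd = 90_000 / 1_000_000 * 10.00
print(f"${single_pass_usd:.2f}")
```

Output tokens are not counted here, so the real cost per request would be somewhat higher than the input figure alone.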
That’s about the limit of the considerations and “opinions” I can offer before this becomes consulting.