Hi all,
I’m building a tool that lets users analyze (often quite large) texts. They may have different questions/needs regarding the same text at different points in time.
I’m worried that sending this text over and over to ChatGPT (queries include “explain X in this text”, “what are the most important points in it”, etc.) might burn my credits faster than I’d like. Please note that I’m still new to understanding tokens and cost-effectiveness in general.
Maybe there’s a way to optimize this process?
To me this seems related to maintaining “sessions” between a given user and ChatGPT via my server. Are there established practices for not having to send the entire text and chat history over and over, and instead doing something with it server-side first? For example, could I utilize embeddings in some way?
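Here’s a rough sketch of what I’m imagining, using the OpenAI Python SDK: embed the document once, keep the chunks server-side, and only send the few most relevant chunks with each question. The model names, chunk size, and top_k are placeholders I picked, not anything I’ve validated:

```python
# Sketch: embed the document once, store chunks server-side, and per question
# send only the most relevant chunks instead of the whole text/history.
import numpy as np
from openai import OpenAI

client = OpenAI()

def chunk_text(text: str, chunk_size: int = 1000) -> list[str]:
    """Naive fixed-size chunking; real splitting would respect paragraphs."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# Done once per document, then stored server-side (DB / vector store) per user.
document = "...the user's large text..."
chunks = chunk_text(document)
chunk_vectors = embed(chunks)

def answer(question: str, top_k: int = 3) -> str:
    # Embed the question and pick the top_k most similar chunks (cosine similarity).
    q_vec = embed([question])[0]
    sims = chunk_vectors @ q_vec / (
        np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    context = "\n\n".join(chunks[i] for i in sims.argsort()[-top_k:][::-1])

    # Only the retrieved excerpts plus the question get sent, not the full text.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided excerpts."},
            {"role": "user", "content": f"Excerpts:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content
```

Is something along these lines what people usually do, or is there a better established pattern?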
Any help would be much appreciated.