Optimization of large requests to GPT

Hi all,

I’m building a tool that lets users analyze (often quite large) texts. They may have different questions/needs regarding the same text at different points in time.

I’m worried that sending this text to the ChatGPT API over and over (queries include “explain X in this text”, “what are the most important points in it”, etc.) might burn through my credits faster than I’d like. Please note, I’m still new to understanding tokens and cost-effectiveness in general.

Maybe there’s a way to optimize this process?

To me this seems related to maintaining “sessions” between a given user and ChatGPT via my server. Are there established practices for not having to send the entire chat history to ChatGPT on every request, perhaps by doing something with it server-side first? For example, could I utilize embeddings in some way?
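One common server-side pattern for this is retrieval: embed the document's chunks once, store the vectors, and for each new question send the model only the most relevant chunks instead of the whole text. Below is a minimal, self-contained sketch of that flow. The `embed` function here is a hypothetical stand-in (a crude bag-of-words vector) purely to show the pipeline; in a real system you would call an embeddings API and cache the resulting vectors server-side.

```python
import math

def embed(text):
    # Hypothetical stand-in for a real embeddings API call:
    # builds a bag-of-words count vector so the retrieval flow is runnable.
    vec = {}
    for word in text.lower().split():
        word = word.strip(".,?!;:")
        if word:
            vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a, b):
    # Cosine similarity between two sparse dict vectors.
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def chunk(text, size=8):
    # Split the document into fixed-size word chunks (embed these once).
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def top_chunks(chunks, chunk_vecs, query, k=2):
    # Rank stored chunks against the query and return the k best.
    qv = embed(query)
    ranked = sorted(zip(chunks, chunk_vecs),
                    key=lambda cv: cosine(qv, cv[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

# Embed the document once, store server-side; per query, send only the
# top-ranked chunks to the model instead of the full text.
doc = ("Pricing and refunds are described in section two. "
       "The shipping policy covers international orders. "
       "Support is available by email on weekdays.")
chunks = chunk(doc, size=8)
vecs = [embed(c) for c in chunks]
relevant = top_chunks(chunks, vecs, "what does it say about pricing?", k=1)
```

The key point is that the expensive per-query payload shrinks from the whole document to a handful of chunks, while the embedding cost is paid once and amortized across all of a user's later questions.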

Any help would be much appreciated.


Hey @teamdrella,

The first idea that comes to mind is putting your prompt in a .docx or .pdf document, then uploading it to GPT-4 and saying “Attached is the prompt”. Let me know if that works?

Context window limits (as of late November 2023), which cap the combined input and output tokens:
GPT-3.5 Turbo: 4,096 tokens (roughly 3,000 words); the 16k variant allows 16,385 tokens
GPT-4: 8,192 tokens (roughly 6,000 words); GPT-4 32k allows 32,768 and GPT-4 Turbo allows 128,000
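For budgeting purposes, a widely used rule of thumb for English text is roughly 4 characters (or about 0.75 words) per token. The `estimate_tokens` helper below is just a hypothetical sketch of that heuristic, not an official API; for exact counts you would use a real tokenizer.

```python
def estimate_tokens(text):
    # Rough estimate using the ~4 characters-per-token heuristic for
    # English. Good enough for cost budgeting, not for exact limits.
    return max(1, round(len(text) / 4))

# A ~3,000-word document at ~5 characters per word (including spaces)
# is ~15,000 characters, i.e. roughly 3,750 tokens by this estimate.
doc_estimate = estimate_tokens("x" * 15000)
```

An estimate like this lets you check whether a document will fit in a model's context window, or whether you need to chunk it, before paying for a request.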

I also wrote a guide here related to the word limit per prompt, per GPT model.