Managing longer conversations with GPT API

Hello everyone,

I have a question regarding the development of my application using the GPTAPI. I’m aiming to allow users to have extended conversations with the chatbot. However, I understand that the API has a token limit that includes not only the latest user and bot messages, but the entire conversation history, as we need to send previous messages to the API every time.

So, what should I do when the conversation becomes too long? Should I omit the initial messages? If so, how can I do this effectively?

Additionally, when I use the chatGPT on the OpenAI website, I’ve noticed that even in long conversations that surely exceed the 4000 token limit, the chatbot can still reference topics from the beginning of the conversation. I’m struggling to understand how this aligns with the context window limit.

Any insights or advice would be greatly appreciated. Thank you!

A common trick is to just send the last 10 messages, or whatever fits under the token limit. There’s probably a bit of art around getting the right window size.

Another technique is to have the AI parse the last message into a variable which you can store locally. You can save certain important snippets of a conversation that way, like user name. And then recombine the data into some heading in the next prompt.

It’s a bit complex for now; I think much of the AI R&D we see will be things like this and tools like Langchain which help to manage all this.

Maybe you could give langchain a look, it’s a framework to help work with llms, i has a lot o memory options to choose from including the saving in external vector database and semantic searching what you need.