Differnet ways to Summarize the user Chat History

What are the Differnet ways to Summarize the usser Chat History , i have a bot which is alredy used token a lot, per user wise ,and it is affecting Bot answer also

If i am summarizing chat history using another call openai, then there are two API
calls , it is incresing cost

Is there any other ways other than langchain memory or making another call to openai telling to summarize


With the API, it’s not about the total calls but the number of tokens based on which the costs are calculated, so one possible suggestion would be the periodically summerize the chat the user has already had and then used the summary plus the newer chat in the next call to build on top of it.

Based on how you prompt the summary generation in the first place and level of detail you want in the summary, the overall token usage should be lower compared to making multiple calls

1 Like

I made an app that summarized a chatlog so that the future conversations had context (it was to give an NPC a “memory”). I used chat completions api with 3.5turbo to summarize the previous run of the apps chat log into text files, then fed this summary to assistants api with 3.5 turbo. With it using assistants api, the conversation kept the context without having to keep feeding it a summary every prompt, it just needed a summary only once, at the start of the app. Worked out not too expensive with it using 3.5turbo. I tried it with 4 and it is better but gets expensive quickly.

1 Like

Check the forum for posts/work by @stevenic in this area…

Lots of advances recently…

1 Like

Both GPT-4 and GPT-3.5 are capable of performing more than one task in a single call. What that means is that you can easily get the model to return a JSON object like this:

“ResponseText”: “{response to send user}”,
“ConversationSummary”: “{summary of the conversation}”

This will keep a running summary of the current conversation in a single model call. The thing to keep in mind though is that this summary is going to be lossy which may lead to two problems:

  1. Important details could get dropped. So if your chatting about an up coming trip the summery might drop the name of the city your traveling to or who you’re going with.
  2. Conversational features like co-references and anaphora may stop working. For example, if the user says “make that a large” the model may not be able to resolve what “that” is referring too because of the summarization process.