Implement conversation summary buffer

hexarrior · February 27, 2024, 5:21pm

Our product currently uses the pruning method, which just trims old messages the way the Assistants API does. I'm not sure which method ChatGPT uses.
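For readers unfamiliar with the pruning approach, here is a minimal sketch of it. It is not how the Assistants API or ChatGPT actually implement truncation. The `count_tokens` helper is hypothetical and uses a whitespace word count as a stand-in for a real tokenizer.

```python
from typing import Dict, List

Message = Dict[str, str]


def count_tokens(message: Message) -> int:
    # Assumption: a whitespace word count stands in for a real tokenizer.
    return len(message["content"].split())


def prune(messages: List[Message], max_tokens: int) -> List[Message]:
    """Keep the newest messages whose combined size fits in max_tokens."""
    kept: List[Message] = []
    total = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(messages):
        cost = count_tokens(message)
        if total + cost > max_tokens:
            break
        kept.append(message)
        total += cost
    kept.reverse()  # restore chronological order
    return kept
```

Everything older than the cut-off is simply dropped, which is why pruning loses early context that a summary or retrieval step could have kept.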

The problem is that summarization is performed dynamically to keep the token count within the limit. Do you have any ideas about how to design the database schema, or which other technologies to use, so that the conversation history can be persisted while keeping writes and retrieval efficient? For example, storing the whole conversation history in MySQL or PostgreSQL as JSON involves marshalling and unmarshalling. Will that be a performance concern in a high-concurrency environment?
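To make the question concrete, one possible layout (a sketch, not a recommendation from anyone in this thread) is an append-only row per message plus a per-conversation running summary with a pointer to the last message folded into it. That avoids re-marshalling the whole history on every turn. The sketch uses Python's built-in `sqlite3` as a stand-in for MySQL or PostgreSQL. The table names, the `summarize` callback (which would be an LLM call in practice), and the use of a message count instead of a token count as the trigger are all assumptions.

```python
import sqlite3
from typing import Callable, List, Tuple

SCHEMA = """
CREATE TABLE IF NOT EXISTS conversations (
    id INTEGER PRIMARY KEY,
    summary TEXT NOT NULL DEFAULT '',
    summarized_up_to INTEGER NOT NULL DEFAULT 0  -- id of last message folded into summary
);
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    conversation_id INTEGER NOT NULL REFERENCES conversations(id),
    role TEXT NOT NULL,
    content TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_messages_conv ON messages (conversation_id, id);
"""

Row = Tuple[int, str, str]  # (message id, role, content)


def add_message(db: sqlite3.Connection, conv_id: int, role: str, content: str) -> None:
    # Appending a turn is a single small INSERT; the history is never rewritten.
    db.execute("INSERT OR IGNORE INTO conversations (id) VALUES (?)", (conv_id,))
    db.execute(
        "INSERT INTO messages (conversation_id, role, content) VALUES (?, ?, ?)",
        (conv_id, role, content),
    )
    db.commit()


def load_context(db: sqlite3.Connection, conv_id: int) -> Tuple[str, List[Row]]:
    # The prompt needs only the summary plus the messages not yet summarized.
    row = db.execute(
        "SELECT summary, summarized_up_to FROM conversations WHERE id = ?", (conv_id,)
    ).fetchone()
    summary, up_to = row if row else ("", 0)
    recent = db.execute(
        "SELECT id, role, content FROM messages "
        "WHERE conversation_id = ? AND id > ? ORDER BY id",
        (conv_id, up_to),
    ).fetchall()
    return summary, recent


def compact(
    db: sqlite3.Connection,
    conv_id: int,
    keep_last: int,
    summarize: Callable[[str, List[Tuple[str, str]]], str],
) -> None:
    # Fold everything except the newest keep_last messages into the summary.
    summary, recent = load_context(db, conv_id)
    cutoff = len(recent) - keep_last
    if cutoff <= 0:
        return
    old = recent[:cutoff]
    new_summary = summarize(summary, [(role, content) for _, role, content in old])
    db.execute(
        "UPDATE conversations SET summary = ?, summarized_up_to = ? WHERE id = ?",
        (new_summary, old[-1][0], conv_id),
    )
    db.commit()
```

With this layout the raw messages stay in the table for auditing or later re-summarization, and `compact` can run in a background job rather than on the request path.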

Macha · February 27, 2024, 11:33pm

Hey there!

So, vector databases do quite well in these scenarios.

Something I picked up along my own journey when working with high concurrency is simply this: optimize the read-only actions while significantly reducing the number of write actions.

Oftentimes, writing to a DB can easily end up being a FIFO situation per task. This can create bottlenecks when you’re frequently writing to the database. Read actions, though, are commonly designed to be high-concurrency. Meaning, it’s fine when all kinds of functions want to read the database at the same time, but it’s not fine when a bunch of functions are trying to write to it at the same time.
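One way to act on "fewer writes" is to buffer messages in memory and flush them in a single transaction. The sketch below is only an illustration of that idea: the `BufferedMessageWriter` class and the `message_log` table are hypothetical, `sqlite3` stands in for whatever database is actually used, and the writer is assumed to be used from a single thread.

```python
import sqlite3
from typing import List, Tuple


class BufferedMessageWriter:
    """Collects messages in memory and writes them in batches."""

    def __init__(self, db: sqlite3.Connection, batch_size: int = 50) -> None:
        self.db = db
        self.batch_size = batch_size
        self.pending: List[Tuple[int, str, str]] = []
        self.flush_count = 0  # number of batch writes actually issued
        db.execute(
            "CREATE TABLE IF NOT EXISTS message_log ("
            "conversation_id INTEGER NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
        )

    def add(self, conv_id: int, role: str, content: str) -> None:
        self.pending.append((conv_id, role, content))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        # One transaction and one executemany call for the whole batch.
        with self.db:
            self.db.executemany(
                "INSERT INTO message_log VALUES (?, ?, ?)", self.pending
            )
        self.pending.clear()
        self.flush_count += 1
```

The trade-off is durability: anything still in `pending` is lost if the process crashes before the next `flush`, so the batch size (or a timer-based flush) has to match how much loss is acceptable.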

hexarrior · February 28, 2024, 6:02am

Hey, thanks for the reply. Are you suggesting that one solution is to store the pruned conversation history in a vector DB, retrieve only the relevant pieces, and insert them into the prompt? In other words, the prompt would not need a summary of the whole pruned conversation history.

Macha · February 29, 2024, 12:16am

Correct! That is the major benefit of RAG; you can “prune” anything to whatever you want, and embed them so you can retrieve the relevant chunks as represented by the embedding.
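A toy version of that retrieval step might look like the following. It is only a sketch of the idea: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `ConversationMemory` class stands in for a vector database, and both names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    # Assumption: bag-of-words counts stand in for a real embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ConversationMemory:
    """In-memory stand-in for a vector database of pruned messages."""

    def __init__(self) -> None:
        self.chunks: List[Tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        # Embed once at write time so reads only need a similarity scan.
        self.chunks.append((text, embed(text)))

    def retrieve(self, query: str, k: int = 2) -> List[str]:
        # Return the k stored chunks most similar to the query.
        q = embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(q, c[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

Messages that fall out of the prompt window would go through `add`, and each new user turn would call `retrieve` to pull back only the chunks worth re-inserting into the prompt.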
