As the history grows, so does the amount of data you have to go through. I have had some success here with a separate “synopsis” function. It takes the synopsis of the current chat logs as input (to know what the current conversation is about) and generates keywords from it, which are then used to search for relevant old chat logs. The same “synopsis” method then summarizes those old chat logs, effectively compressing an arbitrary amount of past data into a few lines of text.
Put another way:
- Current chat log is taken as input and summarized
- Keywords are extracted from the current chat summary
- Old chat logs are pulled from the DB/SOLR using those keywords
- Old chat logs are also summarized, based on relevance to the current chat log
You do lose some data, of course, but we have relatively narrow constraints.
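To make the four steps listed above concrete, here is a rough Python sketch. It is only an illustration under some loud assumptions: `synopsis` is a crude stand-in that picks sentences by keyword overlap (the real one would presumably call a language model), `search_old_logs` scans an in-memory list instead of querying a DB or SOLR, and every function name, the stopword list, and the parameters are made up for this example.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is", "it",
             "my", "are", "how", "should", "i", "you", "we", "was", "that"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def synopsis(text, focus=None, max_sentences=2):
    """Placeholder summarizer (assumption: stands in for a model call).

    Without `focus`, keeps the first sentences. With `focus`, keeps the
    sentences that share the most words with the focus keywords.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not focus:
        return " ".join(sentences[:max_sentences])
    focus_set = set(focus)
    # Rank sentence indices by keyword overlap; the sort is stable, so ties
    # keep their original order.
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: -len(focus_set & set(tokenize(sentences[i]))),
    )
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


def extract_keywords(summary, top_n=5):
    """Pick the most frequent non-stopword tokens as search keywords."""
    counts = Counter(
        t for t in tokenize(summary) if t not in STOPWORDS and len(t) > 2
    )
    return [word for word, _ in counts.most_common(top_n)]


def search_old_logs(archive, keywords):
    """Stand-in for the DB/SOLR lookup: logs mentioning any keyword."""
    wanted = set(keywords)
    return [log for log in archive if wanted & set(tokenize(log))]


def build_context(current_log, archive):
    """Run the four steps and return (current summary, keywords, old summary)."""
    current_summary = synopsis(current_log)                      # step 1
    keywords = extract_keywords(current_summary)                 # step 2
    old_logs = search_old_logs(archive, keywords)                # step 3
    old_summary = synopsis(" ".join(old_logs), focus=keywords)   # step 4
    return current_summary, keywords, old_summary


if __name__ == "__main__":
    archive = [
        "We planted tomatoes in spring. The weather was mild that week.",
        "The trip to Norway was long. The fjords were cold.",
    ]
    current = "My tomatoes are dying. How often should I water tomatoes in summer?"
    print(build_context(current, archive))
```

The point of the sketch is the shape of the pipeline: the same `synopsis` call is reused for both the current and the old logs, and only the keyword list ties the two together. Swapping in a real summarizer or a real search backend should not change that structure.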