Summarizable Threads: Reduce token counts in subsequent conversations

The example at openairetro/examples/intermediate/summarizable (main branch of icdev2dev/openairetro on GitHub) introduces SummarizableThreads through the betaassi framework.

Essentially, calling summarize() on a thread creates another thread containing a summary of the conversation so far between the user and the assistant. The intent is to let the conversation carry on with fewer tokens.

The power of metadata is exploited here. The summarization is done with gpt-3.5 to reduce cost further. However, getting a good summary is an intricate dance between the prompt and the model.
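To make the idea concrete, here is a minimal in-memory sketch of the pattern. It is not the betaassi or openairetro API: the `Thread` and `Message` classes, the `summary_of` metadata key, `rough_tokens`, and `truncate_summarizer` are all hypothetical names made up for illustration. In particular, how the real example uses metadata is an assumption here; the sketch simply records which thread the summary came from. The summarizer is passed in as a plain function, so a call to a cheaper model such as gpt-3.5 could be swapped in where the stand-in is used.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List
import uuid


@dataclass
class Message:
    role: str  # "user" or "assistant"
    text: str


@dataclass
class Thread:
    # Hypothetical stand-in for an assistant thread; not the real API object.
    messages: List[Message] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)
    id: str = field(default_factory=lambda: f"thread_{uuid.uuid4().hex[:8]}")


def summarize(thread: Thread, summarizer: Callable[[str], str]) -> Thread:
    """Return a NEW thread seeded with a summary of `thread`.

    The original thread is left untouched. The new thread's metadata
    records where the summary came from (assumed key: "summary_of").
    """
    transcript = "\n".join(f"{m.role}: {m.text}" for m in thread.messages)
    summary = summarizer(transcript)
    return Thread(
        messages=[Message("user", f"Summary of the conversation so far: {summary}")],
        metadata={"summary_of": thread.id},
    )


def rough_tokens(thread: Thread) -> int:
    # Crude proxy for token count: whitespace-separated words.
    return sum(len(m.text.split()) for m in thread.messages)


def truncate_summarizer(transcript: str) -> str:
    # Placeholder for a model call: keeps only the first 12 words.
    return " ".join(transcript.split()[:12])


if __name__ == "__main__":
    original = Thread(messages=[
        Message("user", "Can you explain how summarizable threads reduce the "
                        "number of tokens sent on later turns of a chat?"),
        Message("assistant", "Instead of resending the whole history, a short "
                             "summary is placed in a fresh thread and the "
                             "conversation continues from there."),
    ])
    shorter = summarize(original, truncate_summarizer)
    print(shorter.metadata, rough_tokens(original), rough_tokens(shorter))
```

Passing the summarizer in as a function keeps the prompt and model choice separate from the thread bookkeeping, which is where most of the tuning effort tends to go.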

Currently only text-to-text summarization is enabled.