Summarizable Threads: Reduce Token counts on subsequent conversations

icdev2dev · March 18, 2024, 12:54am

The example at openairetro/examples/intermediate/summarizable (main branch of icdev2dev/openairetro on GitHub) introduces SummarizableThreads through the betaassi framework.

Essentially, calling summarize() on a thread creates another thread that holds a summary of the conversation so far between the user and the assistant. The intent is to let the conversation carry on in the new thread with fewer tokens.

The power of metadata is exploited here. The summarization is carried out with gpt-3.5 to reduce cost further. However, it is an intricate dance between the prompt and the model.
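To make the idea concrete, here is a minimal sketch of the pattern in plain Python. It is not the betaassi API: the `Thread` and `Message` classes, the `summarize()` signature, the `summarized_from` metadata key, and `toy_summarizer` are all hypothetical names chosen for illustration. In real use the summarizer would be a call to a cheaper model such as gpt-3.5; here a stand-in function is injected so the sketch runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List
import uuid


@dataclass
class Message:
    role: str     # "user" or "assistant"
    content: str  # text only, matching the text-to-text limitation


@dataclass
class Thread:
    messages: List[Message] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)
    id: str = field(default_factory=lambda: "thread_" + uuid.uuid4().hex[:8])


def summarize(thread: Thread, summarizer: Callable[[str], str]) -> Thread:
    """Return a NEW thread seeded with a summary of `thread`.

    The original thread is left untouched. The new thread records where it
    came from in its metadata (an assumed convention, not betaassi's).
    """
    transcript = "\n".join(f"{m.role}: {m.content}" for m in thread.messages)
    summary = summarizer(transcript)
    return Thread(
        messages=[Message("user", "Summary of the conversation so far: " + summary)],
        metadata={"summarized_from": thread.id},
    )


def toy_summarizer(transcript: str) -> str:
    # Stand-in for a model call: simply keeps the first 60 characters.
    return transcript[:60]


original = Thread(messages=[
    Message("user", "How do I reduce token usage in long conversations? " * 5),
    Message("assistant", "One option is to summarize earlier turns. " * 5),
])
short = summarize(original, toy_summarizer)
```

The conversation would then continue on `short` rather than `original`, so each later request carries the one summary message instead of the full history. How good that summary is depends on the prompt and the model, which is the "intricate dance" mentioned above.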

Currently only text-to-text summarization is enabled.
