Thread Truncation Strategy

martin.otero · June 6, 2024, 3:58pm

Hi, I have implemented my own version of threads. Aiming for efficiency, I condense the whole conversation into a compact summary, dropping all unimportant and redundant tokens (I just ask GPT to write a detailed summary). I then resume the conversation using the summary plus the newest interactions, and after the reply is generated I summarize again and replace the previous summary with an updated one. This keeps the number of tokens used per run from growing with the length of the conversation, so it stays relatively flat from run to run. I define the summary truncation strategy with a summarization prompt that tells the system what is important to keep in the context of the specific agent. Is that strategy something you could consider adding to the standard threads object?
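
To make the loop concrete, here is a minimal Python sketch of the rolling-summary idea, not my actual implementation. It assumes a generic `llm` callable that takes a prompt string and returns text; `RollingSummaryThread`, `summary_instructions`, `prompt_sizes`, and `fake_llm` are hypothetical names for illustration and are not part of the Assistants API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical type: any function that sends a prompt to a model and returns its text.
LLM = Callable[[str], str]


@dataclass
class RollingSummaryThread:
    llm: LLM
    summary_instructions: str  # agent-specific: what must survive summarization
    summary: str = ""          # compact stand-in for the full message history
    prompt_sizes: List[int] = field(default_factory=list)  # characters sent per run

    def run(self, user_message: str) -> str:
        # Build the prompt from the current summary plus only the newest message.
        prompt = (
            f"Conversation summary so far:\n{self.summary or '(none)'}\n\n"
            f"User: {user_message}\nAssistant:"
        )
        self.prompt_sizes.append(len(prompt))
        reply = self.llm(prompt)

        # After the reply is generated, replace the old summary with an updated one.
        self.summary = self.llm(
            f"{self.summary_instructions}\n\n"
            f"Previous summary:\n{self.summary or '(none)'}\n\n"
            f"New exchange:\nUser: {user_message}\nAssistant: {reply}\n\n"
            "Updated summary:"
        )
        return reply


def fake_llm(prompt: str) -> str:
    """Offline stand-in for a real model call (assumption, for testing only)."""
    if prompt.endswith("Updated summary:"):
        return "User greeted the assistant; nothing else to keep."
    return "ok"
```

With a real model in place of `fake_llm`, the prompt for each run contains only the summary and the latest message, so its size depends on how long the summary is allowed to get rather than on how many turns have passed. The trade-off is one extra model call per run to refresh the summary.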
