Passing in large amounts of repeated tokens

Is there some way to mitigate passing in large amounts of repeated tokens for long conversations? When a conversation gets long, passing in something like 8k tokens per message is a bit much. Is there a feature in the API to help with this, or is it just how it has to be?

Short answer: when you use chat completions and keep your own history, managing what you send to the AI, including how long the conversation grows, is entirely up to you.
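To make "you keep your own history" concrete, here is a minimal sketch of a sliding-window trimmer. All names are illustrative, and the token estimate is a crude stand-in for a real tokenizer, not an official API:

```python
# Sketch: with chat completions you hold the history yourself, so you decide
# what to resend each turn. Hypothetical helper names; the token count below
# is a rough character-based estimate, not a real tokenizer.

def estimate_tokens(message: dict) -> int:
    # Crude stand-in for a real tokenizer (roughly 4 characters per token).
    return max(1, len(message["content"]) // 4)

def trim_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep the system prompt plus the newest messages that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    kept: list[dict] = []
    used = sum(estimate_tokens(m) for m in system)
    for m in reversed(rest):  # walk newest-first, stop when the budget is hit
        cost = estimate_tokens(m)
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    return system + list(reversed(kept))
```

You would then pass `trim_history(history, 3000)` as the `messages` parameter instead of the full transcript, so each request stays under a token budget you choose.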

With the assistants endpoint, there is a thread that keeps the chat history for you, but it offers no controls or limitations.

So you get to choose: budget vs. quality. If you go for budget, you get symptoms like "ChatGPT forgot what I was just talking about."

There are various techniques to extend the illusion of memory without sending everything. You can expire assistant responses earlier than user messages, since user input tends to matter more for context. You can summarize the oldest part of the chat with another AI call every few turns, or ask a cheaper model for individual summaries when the AI writes at length, swapping the shorter version in after a while. Or you can use a database that recalls old questions when a topic comes up again (semantic search). With chat completions you are in control of the messages, so you can use your imagination.
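The "summarize the oldest part of the chat" idea can be sketched as follows. This is only an illustration under my own assumptions: the summarizer here is a placeholder string-builder, where a real version would call a cheaper model; none of these names come from the API itself:

```python
# Sketch of rolling summarization: once the transcript grows past a limit,
# collapse the oldest turns into one short system note. The default
# `summarize` is a placeholder; in practice it would be another (cheaper)
# model call. All names are illustrative, not an official API.

def compact_history(messages, max_messages=8, summarize=None):
    """Collapse the oldest overflow into a single summary message.

    `messages` is a chat-completions-style list of dicts; the first entry
    is assumed to be the system prompt and is always kept.
    """
    if len(messages) <= max_messages:
        return messages
    if summarize is None:
        # Placeholder summarizer; a real one would be a cheap model call.
        def summarize(msgs):
            return ("Earlier discussion covered: "
                    + "; ".join(m["content"][:40] for m in msgs))
    system, rest = messages[0], messages[1:]
    overflow = len(messages) - max_messages + 1  # +1 frees a slot for the summary
    old, recent = rest[:overflow], rest[overflow:]
    note = {"role": "system", "content": summarize(old)}
    return [system, note] + recent
```

Running this every few turns keeps the request size bounded while the summary note preserves some long-range context; the semantic-search approach is the same idea, except the old turns go into a store you query instead of a summary you resend.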

Thank you, this was very helpful.