Hi, this question may be related to my own: how to force the model to continue a truncated completion?
There are some techniques that allow longer-term memory than the token limit permits.
Some approaches I've seen implemented by many (e.g. see the @daveshapautomator YouTube channel):
- treat the chat prompt as a sliding window: keep only the last N turns in the prompt, deleting the oldest turns;
- when you are near the token limit, summarize the earlier conversation turns into the initial prompt;
- when the content is a huge amount of data, use a semantic-search approach: split your data/documents into slices/files, retrieve the relevant ones with semantic search (using embeddings; see the OpenAI cookbook for examples), and insert the summarized data into the initial resubmitted prompt.
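The first bullet (sliding window) can be sketched in a few lines. This is just an illustration, not anyone's production code; the message format follows the chat-completions style of `{"role": ..., "content": ...}` dicts, and `max_turns` is an assumed parameter:

```python
def sliding_window(messages, max_turns=10):
    """Keep the system prompt plus only the most recent max_turns messages.

    messages: list of {"role": ..., "content": ...} dicts,
    where messages[0] is assumed to be the system prompt.
    """
    system, rest = messages[0], messages[1:]
    # Drop the oldest turns; keep the last max_turns.
    return [system] + rest[-max_turns:]
```

A real implementation would trim by token count rather than turn count, but the idea is the same.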
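For the summarization bullet, the logic is roughly: when the running token count approaches the limit, collapse older turns into a single summary message and keep only the most recent exchange. A minimal sketch, where `summarize` is a placeholder for whatever you use to produce the summary (e.g. a separate completion call), and the 0.75 threshold is an arbitrary choice:

```python
def maybe_summarize(messages, token_count, token_limit, summarize):
    """If near the token limit, replace old turns with a summary turn.

    summarize: callable taking a list of messages and returning a summary string
    (a placeholder here; in practice this could be another completion call).
    """
    if token_count > 0.75 * token_limit:
        # Summarize everything except the system prompt and the last exchange.
        summary = summarize(messages[1:-2])
        summary_msg = {
            "role": "system",
            "content": f"Summary of earlier conversation: {summary}",
        }
        return [messages[0], summary_msg] + messages[-2:]
    return messages
```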
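The semantic-search bullet boils down to: embed your document slices, embed the query, rank slices by cosine similarity, and paste the top hits into the prompt. A self-contained sketch of the ranking step (embeddings here are just plain lists of floats; in practice you'd get them from an embeddings API, as in the OpenAI cookbook):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_emb, doc_embs, k=2):
    """Return indices of the k document slices most similar to the query."""
    ranked = sorted(
        range(len(doc_embs)),
        key=lambda i: cosine(query_emb, doc_embs[i]),
        reverse=True,
    )
    return ranked[:k]
```

The retrieved slices (or their summaries) are then inserted into the resubmitted prompt.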