How to manage chat history effectively?

I’m using the OpenAI API to build a custom chat system and have a few ideas for handling conversation history:

  • Send the entire message history to OpenAI and rely on prompt caching for optimization.

  • Truncate the middle of the conversation, keeping only the first two and last two messages.

    • Update user expectations based on the latest response.
    • Use a mini RAG system to manage the context of the truncated middle messages.
  • Any ideas for this …

Would these approaches be effective, or are there better ways to handle context efficiently?


Prompt caching has a limited lifetime on the server, roughly 5–60 minutes between queries, and it only applies when requests share an identical prefix in the messages list that is long enough to qualify for caching.

So:

Shorten the conversation history more proactively when it has grown long, AND when a chat session is re-initiated after an hour or more of inactivity, since at that point the cache has expired and you are no longer going to get a discount anyway.
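That truncation idea (keep the first and last few messages) can be sketched like this. The helper name and the keep-first/keep-last counts are my own assumptions, not anything from the API — a minimal sketch:

```python
def truncate_history(messages, keep_first=2, keep_last=2):
    """Drop the middle of a long conversation, preserving any leading
    system message plus the first/last few non-system messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if len(rest) <= keep_first + keep_last:
        return system + rest  # short enough: nothing to drop
    return system + rest[:keep_first] + rest[-keep_last:]
```

Keeping the system message and the first messages intact also preserves a stable prefix, which is exactly what prompt caching keys on.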


Thanks for the information.

Summarization sounds like a good and simpler approach.
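A summarization pass could replace the truncated middle with a single summary message. This is only a sketch under my own assumptions: `summarize` stands in for whatever you use to produce the summary (in practice it would wrap a chat completion call), and the message layout is illustrative:

```python
def summarize_middle(messages, summarize):
    """Collapse the middle of a long history into one summary message.

    `summarize` is a callable taking a transcript string and returning
    a summary string -- in practice a wrapper around an API call.
    """
    if len(messages) <= 4:
        return messages  # too short to bother summarizing
    middle = messages[2:-2]
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in middle)
    summary = {
        "role": "system",
        "content": "Summary of earlier messages: " + summarize(transcript),
    }
    return messages[:2] + [summary] + messages[-2:]
```

The summary message lands after the first two messages, so the cached prefix stays identical between requests until the summary itself is regenerated.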