Welcome to the Forum!
You can have a look at these posts to get some ideas on how to approach this using embeddings:
-
Use embeddings to retrieve relevant context for AI assistant: This tutorial focuses on using embeddings to retrieve relevant context for an AI assistant, building upon a simple chat assistant tutorial with the Chat Completions API.
-
Embedding past conversation data for context memory & retrieval: This post discusses different approaches for embedding past conversation data into a vector database for semantic query purposes, including fine-tuning or training embeddings models.
-
Specialized Chatbot with GPT-3: This post breaks down the use of system prompts and embeddings-based retrieval to give a chatbot memory and the ability to provide contextually relevant responses.
-
Infinity Memory implementation: This post discusses the use of embeddings and/or a vector database to retrieve relevant conversations and manage the indices of messages to remove from the message list to conserve tokens.
Feel free to follow up here if a question is not addressed by these posts.