Feature Suggestion: Contextual memory on local storage

AMDphreak · November 22, 2024, 11:14pm

Hello,

I am working on an AI assistant to help with teaching users what to do on PC, particularly in the context of programming. I am running into the problem where ChatGPT forgets features requested at the beginning of the conversation, when asking for additional features to be integrated later on.

As explained in the docs, this is due to a cap on contextual memory, where the AI truncates or generalizes memories of older messages and guesses incorrectly what details are important.

Some suggestions for the team at OpenAI:

Store personal memories locally to avoid costs

Allow assistants to store personal memories on the user’s local machine. This will offload memory and computation to the user’s machine, while providing privacy. This may require some architectural changes to the AI model. In order to process local data more quickly, I suggest that there needs to be a base model that runs locally on the user’s machine that is not all-knowing but possesses a base set of knowledge enough to converse on non-specific information, and a way to identify gaps in its own knowledge. The model can identify lack of knowledge by comparing node connectivity in its own model to that of the complete online model. When large disparities are detected, download domain-specific general knowledge. This would mean a modularized approach to models would be necessary. We already do this by selecting a model that can handle something a certain way. It may be helpful to have an AI that learns the strengths and weaknesses of other AI that are domain experts, and have that AI perform routing of requests. This distributed, cooperative AI architecture would allow AI to cover each other’s weaknesses and permit organically growing a user’s local AI by incrementally downloading “patches” of knowledge that extend a more general language base.

The model should prioritize specific information from the local storage. In order to do so, the user’s specific information can be used to train a local model or some other logical representation of information, which can be updated permanently as the user interacts with it. Logical information does not behave statistically, the way an AI model does, so this might need some research to understand how to make the two work together. This area of research obviously involves continuous learning. This article might be worthwhile to consult. Three types of incremental learning | Nature Machine Intelligence

Kruel.ai kind of does this already, maybe in a better way.

I noticed that @darcschnider has produced a very nice solution to this problem, Kruel.ai, by storing personal information in a Graph structure on the user’s machine for personal information, providing a lossless experience with detailed data. However, the disconnected nature of this solution makes me think a more integrated implementation will provide performance benefits.
https://community.openai.com/t/kruel-ai-v7-0-api-companion-with-full-understanding-with-persistent-memory/6

Allow 3rd party storage providers

Provide users/coders with an option for choosing a storage provider for personal data and shared learned data for use in businesses or teams. Neo4J Aura, for example. or Google Drive (not sure about the transaction limits).

Separate data into 3 tiers:

Personal - local machine, optionally online
Group - local machine cache, online storage service
General - local machine cache, ChatGPT online

Update 11/25/2024

It seems Vector databases are a common way to provide long term memories, and some can be used with OpenAI. Vector databases can also modularize knowledge and be combined. The term used to represent the vector data passed to OpenAI is “embeddings”. The term has been pulled out of context by enough people enough times to the point of being more confusing than helpful, so beware. Hope this small description helps.
Source: https://www.datacamp.com/blog/the-top-5-vector-databases

"While regular databases search for exact data matches, vector databases look for the closest match using specific measures of similarity.

Vector databases use special search techniques known as Approximate Nearest Neighbor (ANN) search, which includes methods like hashing and graph-based searches."

darcschnider · November 22, 2024, 11:24pm

In graph I created a memory structure that dynamic links memories to entities over time , than using algorithms to trace user requests through memory to build the context. It works good but you need a memory chunker to build summarized data to ensure it stays relevant based on all context.

New version I took the LLM approach and built a Memory Augmented learning system that only uses algorithms no relationship data needed. Tracks related data using gap memory concept to find missing understanding in it’s Chain of thought.

Graph memory is really good especially if used with organizated data where you want to control the paths the AI pulls from. Where v7 MANs system acts more like an LLM but with out training into the model but using database to learn with updates, corrections and even its own chain of thought. It performs so much faster because it’s just math.

crazzy.grasmann · November 29, 2024, 1:17am

Try workarounds with OpenAi api in ex. Phidata. Building Rag and agents easily and offers varios memory and tools

Topic		Replies	Views
Proposing a Long-term Memory Mechanism for GPT Models Community chatgpt	3	2608	October 6, 2023
A smarter chatbot with memory (idea) API	1	1520	December 17, 2023
Will Memory capabilities come to the API? Feedback memory	12	2308	January 30, 2025
Feedback on Chat Memory Storage Limitations in ChatGPT API chatgpt , user-experience , memory-issues	0	236	November 14, 2024
Sub-Threads within Custom PGTs API chatgpt	2	56	December 15, 2024