You’re right, it’s definitely a complex issue with many layers. Money and tech alone aren’t the magic bullet either…it’s like buying a fancy calculator for a problem that needs a whole new way to think about it
While brilliant people are great, I actually think AI success comes from strong teams with diverse skills. (Your comment about future geniuses in primary school made me smile lol)
AI still has a long way to go, especially with things like context and truly understanding conversations… but the progress we’re making is still encouraging
You’re absolutely right that developers can build memory features locally and many are doing great work with RAG systems and custom implementations.
But I think we’re looking at different scales here. Sure… I could build something basic for personal use. But creating a robust scalable system that can intelligently preserve knowledge while handling millions of interactions? That’s where the big platforms could really make a difference. Money Money Money/Compute Compute Compute
Not about waiting for OpenAI specifically…more about recognizing that solving this at scale needs serious infrastructure
I think the basic problem is that Transformers don’t natively support this. So every solution is a kind of tacked-on workaround that has pluses and minuses and needs to be optimised for the specific use case.
There’s no magic one-size-fits-all solution afaia.
You are basically asking for fundamental scientific progress which hasn’t yet happened and can’t necessarily be forced. It’s not like you can just throw money at this problem and be guaranteed a solution.
I’ve built up a massive collection of LLM conversations …everything from deep brainstorming to quick Q&As. There are brilliant ideas, meta prompts, and valuable context buried in there, but going back through it all manually would be a nightmare.
What I’m thinking: create a master AI agent that analyzes these chats and identifies patterns like “you keep asking about project management” or “you’re consistently exploring these research angles.” Then have it spawn specialized sub agents focused on each need. Basically build a dynamic AI “team” that evolves based on actual usage patterns rather than assumptions.
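As a rough sketch of that master-agent idea (all names and keyword lists here are hypothetical, not an actual implementation): scan exported chats for recurring topics, then emit a config for a specialised sub-agent per topic.

```python
from collections import Counter
import re

# Hypothetical topic lexicon; a real system would learn these from the chats.
TOPIC_KEYWORDS = {
    "project_management": {"roadmap", "sprint", "deadline", "backlog"},
    "research": {"paper", "hypothesis", "experiment", "citation"},
}

def detect_topics(chats, min_hits=2):
    """Count how many chats touch each topic; keep recurring ones."""
    counts = Counter()
    for chat in chats:
        words = set(re.findall(r"[a-z]+", chat.lower()))
        for topic, kws in TOPIC_KEYWORDS.items():
            if words & kws:
                counts[topic] += 1
    return {t: n for t, n in counts.items() if n >= min_hits}

def spawn_subagents(topic_counts):
    """Turn usage patterns into sub-agent 'job descriptions'."""
    return [
        {"name": f"{topic}_agent",
         "system_prompt": f"You specialise in {topic.replace('_', ' ')}.",
         "evidence": hits}
        for topic, hits in sorted(topic_counts.items(), key=lambda kv: -kv[1])
    ]
```

The point is the shape: analysis first, then sub-agents derived from actual usage rather than assumptions.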
Exactly… or define a few knowledge graphs you are interested in and let a specialized network of agents sort information from each chat into them and rate it.
let’s say you have a Python development graph and you use it as a GraphRAG source that extracts the important knowledge (hence the rating) for specific tasks like Python development (a decider selects the best subgraph)…
you can use another chat where you talk about results and explain what it did wrong, which changes the rating…
I call it intelligent interns and not agents…
You train them effectively by creating a prompt machine that fills the context of each prompt.
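A minimal sketch of that rated-graph idea (class and method names are mine, purely illustrative): facts carry a rating, a “decider” picks the best-rated subgraph for a task, and feedback from a later chat adjusts the ratings.

```python
class RatedGraph:
    """Toy rated knowledge graph: facts grouped by topic with scores."""

    def __init__(self):
        self.facts = {}  # fact_id -> [topic, text, rating]

    def add(self, fact_id, topic, text, rating=1.0):
        self.facts[fact_id] = [topic, text, rating]

    def feedback(self, fact_id, delta):
        """A follow-up chat said the fact helped (+) or misled (-)."""
        self.facts[fact_id][2] += delta

    def best_subgraph(self, topic, k=2):
        """Decider: return the top-k facts for a topic, by rating."""
        hits = [(f[2], f[1]) for f in self.facts.values() if f[0] == topic]
        return [text for _, text in sorted(hits, reverse=True)[:k]]
```

The “intern training” then happens entirely through `feedback`: no weight updates, just ratings that shape what gets selected into future prompts.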
This one is already here. It’s called RAG, as you already know! It’s becoming more long-term in ChatGPT, and it uses more tokens as context, but it’s not a new technology. OpenAI seems willing to add more history to ChatGPT (via brute force or RAG) to add value, so I think this will be given. But it does involve more infrastructure, and obviously more input tokens, which cost more to run. So RAG/history infrastructure costs plus the additional compute costs (the lower the better) will determine how fast this gets adopted.
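For anyone unfamiliar, the retrieve-then-prompt shape of RAG fits in a few lines. This toy version uses bag-of-words cosine similarity instead of learned embeddings and a vector DB (so it’s an illustration of the pattern, not a production recipe):

```python
import math
from collections import Counter

def vectorise(text):
    """Bag-of-words vector; a real system would use learned embeddings."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, history, k=1):
    """Rank past chat snippets by similarity to the query."""
    qv = vectorise(query)
    ranked = sorted(history, key=lambda h: cosine(qv, vectorise(h)), reverse=True)
    return ranked[:k]

def build_prompt(query, history):
    """Prepend the most relevant past snippets as context."""
    context = "\n".join(retrieve(query, history))
    return f"Relevant past chats:\n{context}\n\nUser: {query}"
```

The cost trade-off in the post shows up directly here: every retrieved snippet becomes extra input tokens on every request.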
I think there was another mention or hint of a non-RAG way of doing it, that just involves compute, without a bunch of DB stuff, and that would be another front-end “preference model” tuned to each user. These small models could be trained regularly to adapt to the user preferences, past histories, and it can even form “memories” of past interactions that would influence the discussion.
What’s cool about this is that you could export the weights of this model to another vendor, another model, or another system, and resume your preferences and memories across other models. This assumes such models get standardized and become easily portable. All without big DB transfers, which would require some sort of ETL unique to each DB, plus embedding costs and overhead, since nobody uses the same embedding model.
If anything, you should start training your own preference model, and using it to compactly store information about you over time. Then use this in conjunction with RAG to “oversee” the entire generation that the LLM is creating.
Definitely it will be a team that makes that happen. But I believe it will be a surprisingly small team of brilliant people, because smaller teams of “geniuses” with multiple domains in one brain each have a competitive advantage over bigger teams: faster and more efficient communication. And in my opinion, better inter-domain communication will be the key to unlocking this result.
Yeah… There is a lot to this question about “conversational memory”… I get that having an enormous amount of data available is resource intensive and costly… but still, sometimes I think there are some basics that the LLM should have access to… A very, very simple example: I always work on a Mac… so why is it always giving me PC-keyboard-oriented suggested actions? It seems like it should know the basics about what I am working with… I should think a simple database of basic stored things would do it… not scary privacy-related things, just basic parameters that help the LLM during an interaction. Part of it might be just having a limited history of past conversations to help form some assumptions. I shouldn’t have to keep reminding ChatGPT of the architecture I am working in.
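That “basic parameters” idea doesn’t need anything fancy. A sketch (the profile keys and wrapper function are hypothetical): keep a tiny local dict of non-sensitive environment facts and prepend it to every request, so the model stops suggesting PC shortcuts to a Mac user.

```python
# Non-sensitive environment facts the assistant should always know.
PROFILE = {
    "os": "macOS",
    "keyboard": "Mac (Cmd, not Ctrl)",
    "primary_language": "Python",
}

def with_profile(user_message, profile=PROFILE):
    """Prepend the stored basics to the outgoing message."""
    facts = "; ".join(f"{k}: {v}" for k, v in profile.items())
    return f"[User environment: {facts}]\n{user_message}"
```

This is basically a degenerate, zero-retrieval form of RAG: the “database” is small enough to ship in full with every prompt.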
I’m sorry, I don’t mean to be rude, but what is an alpha tester and how do you become one? I have been a beta tester from the beginning. And I’d love to try what you’ve got. Thank you for the reply.
I will say that I just let my Google Gemini Advanced subscription lapse because the Gemini 2.0 family of models is disappointing in my opinion. But right before it ended, I did see where they had added this feature, and I just kept thinking how useful a feature like that would be with ChatGPT, even if it had to somehow treat past chats like a RAG system.
I think the answer here is using holographic principles and mathematics to govern the storage and retrieval of information. Mostly, I think njwestburg is right that the infra is there; we just need better architecture. Using advanced convolutions and weighting algorithms, we could implement near-infinite context length.
I’m just going to say it, I don’t care about copyright anymore, but you can use local files for persistent memory. If I could build an extension as a hobby, then these companies should be able to do it easily.
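The hobby-extension version of local-file memory really is this small. A sketch (file format and function names are my own choices, assuming an append-only JSONL log): write notes during a session, reload them at the start of the next one.

```python
import json
import os

def remember(path, note):
    """Append one memory note to a local JSONL file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"note": note}) + "\n")

def recall(path):
    """Load all saved notes, oldest first; empty if no file yet."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [json.loads(line)["note"] for line in f if line.strip()]
```

Append-only JSONL keeps it crash-safe and trivially diffable; the hard part the thread keeps circling back to is deciding *what* to remember, not storing it.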
You have a brain for persistent memory? Besides, personally I don’t really encounter these issues while using GPT. GPT just… resonates, that’s all you need!