You’re right, it’s definitely a complex issue with many layers. Money and tech alone aren’t the magic bullet either…it’s like buying a fancy calculator for a problem that needs a whole new way to think about it
While brilliant people are great, I actually think AI success comes from strong teams with diverse skills. (Your comment about future geniuses in primary school made me smile lol)
AI still has a long way to go, especially with things like context and truly understanding conversations… but the progress we’re making is still encouraging
You’re absolutely right that developers can build memory features locally and many are doing great work with RAG systems and custom implementations.
But I think we’re looking at different scales here. Sure… I could build something basic for personal use. But creating a robust scalable system that can intelligently preserve knowledge while handling millions of interactions? That’s where the big platforms could really make a difference. Money Money Money/Compute Compute Compute
Not about waiting for OpenAI specifically…more about recognizing that solving this at scale needs serious infrastructure
I think the basic problem is that Transformers don’t natively support this. So every solution is a kind of tacked-on workaround that has pluses and minuses and needs to be optimised for the specific use case.
There’s no magic one-size-fits-all solution afaia.
You are basically asking for fundamental scientific progress which hasn’t yet happened and can’t necessarily be forced. It’s not like you can just throw money at this problem and be guaranteed a solution.
I’ve built up a massive collection of LLM conversations …everything from deep brainstorming to quick Q&As. There are brilliant ideas, meta prompts, and valuable context buried in there, but going back through it all manually would be a nightmare.
What I’m thinking: create a master AI agent that analyzes these chats and identifies patterns like “you keep asking about project management” or “you’re consistently exploring these research angles.” Then have it spawn specialized sub agents focused on each need. Basically build a dynamic AI “team” that evolves based on actual usage patterns rather than assumptions.
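As a rough sketch of that master-agent idea (all names and keyword lists here are hypothetical, not an actual implementation): scan exported chats for recurring topics, then emit a config for a specialised sub-agent per topic.

```python
from collections import Counter
import re

# Hypothetical topic lexicon; a real system would learn these from the chats.
TOPIC_KEYWORDS = {
    "project_management": {"roadmap", "sprint", "deadline", "backlog"},
    "research": {"paper", "hypothesis", "experiment", "citation"},
}

def detect_topics(chats, min_hits=2):
    """Count how many chats touch each topic; keep recurring ones."""
    counts = Counter()
    for chat in chats:
        words = set(re.findall(r"[a-z]+", chat.lower()))
        for topic, kws in TOPIC_KEYWORDS.items():
            if words & kws:
                counts[topic] += 1
    return {t: n for t, n in counts.items() if n >= min_hits}

def spawn_subagents(topic_counts):
    """Turn usage patterns into sub-agent 'job descriptions'."""
    return [
        {"name": f"{topic}_agent",
         "system_prompt": f"You specialise in {topic.replace('_', ' ')}.",
         "evidence": hits}
        for topic, hits in sorted(topic_counts.items(), key=lambda kv: -kv[1])
    ]
```

The point is the shape: analysis first, then sub-agents derived from actual usage rather than assumptions.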
Exactly… or define a few knowledge graphs you are interested in and let a specialized network of agents sort information from each chat into them and rate it.
let’s say you have a Python development graph and you use it as a GraphRAG source that extracts the important knowledge (hence the rating) for specific tasks like Python development (a decider selects the best subgraph)…
you can use another chat where you talk about results and explain what it did wrong, which changes the rating…
I call it intelligent interns and not agents…
You train them effectively by creating a prompt machine that fills the context of each prompt.
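A minimal sketch of that rated-graph idea (class and method names are mine, purely illustrative): facts carry a rating, a “decider” picks the best-rated subgraph for a task, and feedback from a later chat adjusts the ratings.

```python
class RatedGraph:
    """Toy rated knowledge graph: facts grouped by topic with scores."""

    def __init__(self):
        self.facts = {}  # fact_id -> [topic, text, rating]

    def add(self, fact_id, topic, text, rating=1.0):
        self.facts[fact_id] = [topic, text, rating]

    def feedback(self, fact_id, delta):
        """A follow-up chat said the fact helped (+) or misled (-)."""
        self.facts[fact_id][2] += delta

    def best_subgraph(self, topic, k=2):
        """Decider: return the top-k facts for a topic, by rating."""
        hits = [(f[2], f[1]) for f in self.facts.values() if f[0] == topic]
        return [text for _, text in sorted(hits, reverse=True)[:k]]
```

The “intern training” then happens entirely through `feedback`: no weight updates, just ratings that shape what gets selected into future prompts.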
This one is already here. It’s called RAG, as you already know! It’s becoming more long-term in ChatGPT, and it uses more tokens as context, but it’s not a new technology. OpenAI seems willing to add more history to ChatGPT (via brute force or RAG) to add value, so I think this will be given. But it does involve more infrastructure, and obviously more input tokens, which cost more to run. So RAG/history infrastructure costs plus the additional compute costs (the lower the better) will determine how fast this gets adopted.
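For anyone unfamiliar, the retrieve-then-prompt shape of RAG fits in a few lines. This toy version uses bag-of-words cosine similarity instead of learned embeddings and a vector DB (so it’s an illustration of the pattern, not a production recipe):

```python
import math
from collections import Counter

def vectorise(text):
    """Bag-of-words vector; a real system would use learned embeddings."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, history, k=1):
    """Rank past chat snippets by similarity to the query."""
    qv = vectorise(query)
    ranked = sorted(history, key=lambda h: cosine(qv, vectorise(h)), reverse=True)
    return ranked[:k]

def build_prompt(query, history):
    """Prepend the most relevant past snippets as context."""
    context = "\n".join(retrieve(query, history))
    return f"Relevant past chats:\n{context}\n\nUser: {query}"
```

The cost trade-off in the post shows up directly here: every retrieved snippet becomes extra input tokens on every request.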
I think there was another mention or hint of a non-RAG way of doing it, that just involves compute, without a bunch of DB stuff, and that would be another front-end “preference model” tuned to each user. These small models could be trained regularly to adapt to the user preferences, past histories, and it can even form “memories” of past interactions that would influence the discussion.
What’s cool about this is that you could export the weights of this model to another vendor, another model, or another system, and resume your preferences and memories across other models. This assumes such models get standardized and become easily portable. All without big DB transfers, which would require some sort of ETL unique to each DB, plus embedding costs and overhead, since nobody uses the same embedding model.
If anything, you should start training your own preference model, and using it to compactly store information about you over time. Then use this in conjunction with RAG to “oversee” the entire generation that the LLM is creating.
Definitely it will be a team that makes that happen. But I believe it will be a surprisingly small team of brilliant people, because smaller teams of “geniuses” with multiple domains in one brain each have a competitive advantage over bigger teams: faster and more efficient communication. And in my opinion, better inter-domain communication will be the key to unlocking this result.
Yeah… There is a lot to this question about “conversational memory”… I get that having an enormous amount of data available is resource intensive and costly… but still, sometimes I think there are some basics that the LLM should have access to… A very, very simple example: I always work on a Mac… so why is it always giving me PC-keyboard-oriented suggested actions? It seems like it should know the basics about what I am working with… I should think a simple database of basic stored things would do it… not scary privacy-related things, just basic parameters that help the LLM during an interaction. Part of it might be just having a limited history of past conversations to help form some assumptions. I shouldn’t have to keep reminding ChatGPT of the architecture I am working in.
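That “basic parameters” idea doesn’t need anything fancy. A sketch (the profile keys and wrapper function are hypothetical): keep a tiny local dict of non-sensitive environment facts and prepend it to every request, so the model stops suggesting PC shortcuts to a Mac user.

```python
# Non-sensitive environment facts the assistant should always know.
PROFILE = {
    "os": "macOS",
    "keyboard": "Mac (Cmd, not Ctrl)",
    "primary_language": "Python",
}

def with_profile(user_message, profile=PROFILE):
    """Prepend the stored basics to the outgoing message."""
    facts = "; ".join(f"{k}: {v}" for k, v in profile.items())
    return f"[User environment: {facts}]\n{user_message}"
```

This is basically a degenerate, zero-retrieval form of RAG: the “database” is small enough to ship in full with every prompt.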
I’m sorry, I don’t mean to be rude, but what is an alpha tester and how do you become one? I have been a beta tester from the beginning. And I’d love to try what you’ve got. Thank you for the reply.
I will say that I just let my Google Gemini Advanced subscription lapse because the Gemini 2.0 family of models is disappointing in my opinion. But right before it ended, I did see where they had added this feature, and I just kept thinking how useful a feature like that would be with ChatGPT, even if it had to somehow treat past chats like a RAG system.
I think the answer here is using holographic principles and mathematics to govern the storage and retrieval of information. Mostly, I think njwestburg is right that the infra is there; we just need better architecture. Using advanced convolutions and weighting algorithms, we could implement near-infinite context length.
I’m just going to say it, I don’t care about copyright anymore, but you can use local files for persistent memory. If I could build an extension as a hobby, then these companies should be able to do it easily.
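The hobby-extension version of local-file memory really is this small. A sketch (file format and function names are my own choices, assuming an append-only JSONL log): write notes during a session, reload them at the start of the next one.

```python
import json
import os

def remember(path, note):
    """Append one memory note to a local JSONL file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"note": note}) + "\n")

def recall(path):
    """Load all saved notes, oldest first; empty if no file yet."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [json.loads(line)["note"] for line in f if line.strip()]
```

Append-only JSONL keeps it crash-safe and trivially diffable; the hard part the thread keeps circling back to is deciding *what* to remember, not storing it.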
You have a brain for persistent memory? Besides, personally I don’t really encounter these issues while using GPT. GPT just… resonates, that’s all you need!