As I work on fine-tuning to bring my RAVEN model into full functionality, I’m approaching the point of revisiting memory. In humans, episodic and declarative memory are separate systems. If evolution chose to keep them separate, there must be a reason: some advantage or limitation.
While they are both memory, perhaps their actual operation is very different. Certainly, the way we accumulate them differs, as does the way we use them. Episodic memory builds a mental model of self and others and constructs a narrative history of our existence. Declarative knowledge, on the other hand, is auxiliary: merely a tool to support our journey through life.
It’s far more important for me to remember that yesterday I went into the office for the first time since the pandemic began. It’s important for me to remember the conversations I had. It’s much less important for me to recall the date of the eruption of Krakatoa, 1883, and yet I can do both. Since we can employ both episodic and declarative memory at the same time, and often with equal ease, I figured they could be implemented as the same system in AGI.
However, I’m recalling some experiments I did. In one, I used GPT-3 to just spit out random facts. It’s great at that, and it would win any game of trivia. But it can also confabulate, and it’s this tendency to confabulate that worries me. Perhaps the episodic system should use low temperature so it simply reads and regurgitates memories from the database. In another experiment, I attempted to use GPT-3 to discern episodic from declarative memory, and it utterly failed. I created a little scenario and asked it “Is Dave on fire?” and it answered “Yes, because I can see the flames.” Perhaps with fine-tuning it could handle such tasks better. But now I’m wondering whether GPT-3 should be involved in memory at all. Why not just search the database and transcribe memories verbatim? That would be faster and cheaper, after all. I’ve also hypothesized that AGI episodic memories should be stored in a blockchain so that they cannot be tampered with. (You may recall that tampering with AGI memory was a major plot point in Westworld.) It would make sense to store episodic memories in a blockchain, but perhaps not declarative knowledge.
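To make the two ideas above concrete, here’s a minimal sketch of an episodic store that does both things at once: retrieval is verbatim (no language model, just search-and-return), and each entry embeds the hash of the previous entry, giving a lightweight blockchain-style tamper check. All class and field names here are hypothetical, not part of any existing system.

```python
import hashlib
import json
import time


class EpisodicLog:
    """Append-only episodic memory store with a hash chain for integrity."""

    def __init__(self):
        self.entries = []

    def record(self, text):
        # Each new entry references the hash of the previous one,
        # so editing any past entry breaks every hash after it.
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        entry = {"time": time.time(), "text": text, "prev": prev_hash}
        payload = json.dumps(
            {"time": entry["time"], "text": entry["text"], "prev": entry["prev"]},
            sort_keys=True,
        )
        entry["hash"] = hashlib.sha256(payload.encode()).hexdigest()
        self.entries.append(entry)

    def recall(self, keyword):
        # Verbatim retrieval: no GPT-3 involved, so nothing can be confabulated.
        return [e["text"] for e in self.entries
                if keyword.lower() in e["text"].lower()]

    def verify(self):
        # Walk the chain and recompute every hash; any tampering shows up here.
        prev = "0" * 64
        for e in self.entries:
            payload = json.dumps(
                {"time": e["time"], "text": e["text"], "prev": e["prev"]},
                sort_keys=True,
            )
            expected = hashlib.sha256(payload.encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True
```

A full blockchain adds distributed consensus on top of this, but for a single agent the hash chain alone already makes silent edits detectable.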
However, the biggest problem with declarative knowledge is the question “What’s actually true, and how do you know it’s true?” Fortunately, I don’t think this is actually a problem, so long as you handle it correctly. For instance, if you record metadata in your knowledge database, you can keep track of who said what and when. You might record a fact as having come from Wikipedia, for example. You can then look up the reliability of Wikipedia, as well as cross-reference multiple sources. Finally, by giving GPT-3 all of this information, you can ask it how reliable the information is; it can handle ambiguity and questions of epistemology. Perhaps this underscores the main functional difference between episodic memory and declarative knowledge: episodic memory is taken as true without question, even if you can reinterpret it later, while declarative knowledge is never taken as true and must always be cross-referenced and interpreted with a level of doubt.
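The metadata idea above can be sketched in a few lines: every fact is stored with its source and timestamp, and assessing a claim means gathering all the records of it and combining the reliability of their sources. The reliability scores below are made-up placeholders, and a real system might hand the assembled evidence to GPT-3 for judgment rather than use this naive max.

```python
from dataclasses import dataclass

# Hypothetical per-source reliability scores; real values would come from
# research or accumulated experience, not be hardcoded like this.
SOURCE_RELIABILITY = {"Wikipedia": 0.8, "random forum": 0.3}


@dataclass
class Fact:
    claim: str
    source: str
    timestamp: str


class KnowledgeBase:
    """Declarative knowledge store that tracks who said what and when."""

    def __init__(self):
        self.facts = []

    def add(self, claim, source, timestamp):
        self.facts.append(Fact(claim, source, timestamp))

    def assess(self, claim):
        # Cross-reference: find every record of this claim and score it by
        # the best-known source; unknown sources default to a neutral 0.5.
        matches = [f for f in self.facts if f.claim == claim]
        if not matches:
            return 0.0
        return max(SOURCE_RELIABILITY.get(f.source, 0.5) for f in matches)
```

The point isn’t the scoring formula, which is deliberately crude, but the shape of the data: declarative facts carry provenance, so doubt can always be quantified later.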
This is a tough nut to crack, and there are even harder nuts out there, such as cognitive control. How do you know when to stop talking and listen during an interruption? How do we make split-second judgments to avoid danger and harm? These interruptions are something I have yet to figure out with NLCA, but it’s apparent that our brains are always prioritizing our attention. Once I get to the point of integrating cognitive control, I might have to read On Task again. For an example of cognitive control, imagine you’re unloading your dishwasher and a knife slips from your hand. It’s heading straight for your toes. You instinctively stop everything and yank your foot out of the way; you don’t just continue and ignore the falling knife. In another example, you start to speak at the same time as someone else, but you make a split-second decision to stop and listen. (It should be noted that this last task is particularly difficult for people with ADHD, which is a deficit of cognitive control, among other things.) Anyway, this all clearly underscores the need for an interrupt system within an AGI, but that’s a problem for future me. First, gotta figure out memory.
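For what it’s worth, the interrupt system could start as something as simple as a priority queue: events carry an urgency level, and the agent preempts its current task whenever something more urgent arrives. This is only a sketch of the control flow, with hypothetical names and priority values, not a claim about how NLCA should do it.

```python
import heapq


class Attention:
    """Toy interrupt system: lower priority number = more urgent."""

    def __init__(self):
        self.queue = []  # min-heap of (priority, event)

    def post(self, priority, event):
        heapq.heappush(self.queue, (priority, event))

    def next_action(self, current_priority, current_task):
        # Preempt the current task only if something more urgent is waiting,
        # like dropping the dishes when a knife is falling toward your foot.
        if self.queue and self.queue[0][0] < current_priority:
            return heapq.heappop(self.queue)[1]
        return current_task
```

Real cognitive control is obviously far richer than this (urgency itself has to be appraised somehow), but a preemption check like this is the minimal skeleton an interrupt system would hang off of.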
This is set to “looking for teammate” because I am! I need help with some of these implementations. Let me know if you’re interested.