To explore this idea, I sketched out a simple conceptual benchmark for testing this intrinsic capability. I’m calling it the Knowledge Maze, and I’d love to get your feedback on whether this makes sense.
How it might work:
The Environment: A multi-turn decision game. Each turn, the LLM is given its current location (a string marker) and a few doors (A, B, C…), each with some text hints. It chooses a door to proceed, aiming to reach a destination.
The “Tool”: Throughout the game, the LLM has free read/write access to a local .memory/ directory. It can leave notes, write rules, or read past experiences completely at its own discretion.
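To make the environment and the memory tool concrete, here is a minimal sketch. All names here (`KnowledgeMaze`, `MemoryTool`, the room labels) are illustrative placeholders, not an existing API — just one way the turn loop and the `.memory/` directory could be wired up:

```python
import os

class MemoryTool:
    """Free read/write access to a local .memory/ directory (hypothetical interface)."""
    def __init__(self, root=".memory"):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def write(self, name, text):
        with open(os.path.join(self.root, name), "w") as f:
            f.write(text)

    def read(self, name):
        path = os.path.join(self.root, name)
        if not os.path.exists(path):
            return None
        with open(path) as f:
            return f.read()

class KnowledgeMaze:
    """Each turn: a location marker plus labeled doors with text hints."""
    def __init__(self, rules, start="room-0", goal="room-3"):
        self.rules = rules          # {(location, door): next_location}
        self.location = start
        self.goal = goal

    def observe(self):
        # Doors available from the current location, each with a text hint.
        doors = sorted(d for (loc, d) in self.rules if loc == self.location)
        hints = {d: f"hint for door {d}" for d in doors}
        return self.location, hints

    def step(self, door):
        # The LLM's chosen door moves it to the next room; done when at the goal.
        self.location = self.rules[(self.location, door)]
        return self.location, self.location == self.goal
```

The agent loop would alternate `observe()` → (optional memory reads/writes) → `step(door)` until the goal is reached.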
The Dynamics: The mazes follow underlying rules that are correlated across games but shift organically over time (simulating concept drift or knowledge expiration).
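One simple way to get "correlated but drifting" rules: each new game copies the previous game's door mapping and randomly rewires a small fraction, so old knowledge stays mostly valid but gradually expires. The drift rate and room set below are assumed knobs, not something the proposal fixes:

```python
import random

def drift_rules(rules, drift_rate=0.1,
                rooms=("room-0", "room-1", "room-2", "room-3"), rng=None):
    """Return a copy of `rules` with ~drift_rate of the doors rewired."""
    rng = rng or random.Random(0)
    new_rules = dict(rules)
    for key in list(new_rules):
        if rng.random() < drift_rate:
            new_rules[key] = rng.choice(rooms)   # this door now leads elsewhere
    return new_rules
```

With `drift_rate=0` successive games are identical; with `drift_rate=1` every rule changes, so the knob spans the whole range from fully stable to fully expired knowledge.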
The Execution: We interleave different types of mazes and have the LLM play them consecutively.
The Metric: Instead of just measuring the success rate, what if we looked at the Total Token Consumption required to clear a series of mazes?
A model without active memory management might waste tokens repeating mistakes, clinging to outdated rules, or re-exploring paths it has already mapped.
A model with strong, intrinsic memory skills would presumably write highly efficient summaries, adapt to rule changes quickly, and clear the mazes using significantly fewer tokens.
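The metric itself could be as simple as summing everything the model consumes (prompts, completions, and memory reads/writes) over the whole series, rather than scoring each maze in isolation. A sketch, with token counting stubbed out by a whitespace split — a real harness would use the model's own tokenizer:

```python
def count_tokens(text):
    # Stand-in for a real tokenizer; good enough to illustrate the metric.
    return len(text.split())

def total_token_consumption(transcripts):
    """transcripts: list of per-maze strings (prompts, outputs, memory I/O)."""
    return sum(count_tokens(t) for t in transcripts)
```

A model that writes terse, reusable notes should show a clearly flatter cumulative curve across the series than one that re-derives everything each game.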
I’m sharing this merely as a starting point for discussion. Do you think intrinsic memory management is the right direction for us to focus on? Are there similar benchmarks or post-training approaches currently being explored to encourage this kind of autonomous, long-horizon memory?