After several months of using ChatGPT and Codex on the same software project, I think I’ve found the biggest missing feature for long-term development work.
Not a smarter model.
Not a larger context window.
A project memory.
My project has been going on for months. At this point, the bottleneck is rarely code generation. The bottleneck is remembering decisions.
Examples:
- Why did we introduce LOE tasks?
- Why was feature #17 postponed?
- Why does reset planning exist separately from full reset?
- Which bug was fixed in v0.1.1 and which one was only discussed?
Humans solve this with documentation:
- Backlog
- Release notes
- Architecture decisions
- Known issues
- Design rationale
The interesting part is that the AI often helps create those documents, but it cannot reliably use them as its own long-term memory.
So the workflow becomes:
“We already discussed this.”
“Yes, but where?”
Then we spend tokens rediscovering a decision that already exists somewhere.
What’s funny is that on my project, the AI is often extremely good at finding bugs, tracing dependencies through dozens of modules, or auditing complex code paths.
But it can still forget that a feature was completed last week.
Not because it isn’t intelligent.
Because intelligence and memory are different things.
If I could choose between:
- 2x smarter models
- a persistent project memory that the model automatically reads before answering
For software projects, I’d take the second option without hesitation.
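To make "project memory" a bit more concrete, here is a rough sketch of the kind of thing I mean, assuming the memory is just an append-only JSON Lines file of decisions that gets prepended to every question. Everything here is hypothetical: the `Decision` fields, the function names, and the `decisions.jsonl` file name are made up for illustration, and this is not how ChatGPT or Codex work internally. The resulting prompt string would be handed to whatever model call you already use.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Decision:
    """One remembered project decision (hypothetical schema)."""
    date: str            # ISO date the decision was made
    topic: str           # short label, e.g. a feature or release name
    status: str          # e.g. "decided", "postponed", "fixed", "discussed-only"
    summary: str         # what was decided
    rationale: str = ""  # why it was decided (optional)


def record_decision(log_path: Path, decision: Decision) -> None:
    """Append one decision to the JSON Lines log."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(decision)) + "\n")


def load_decisions(log_path: Path) -> list[Decision]:
    """Read all decisions back, oldest first; a missing file means no memory yet."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [Decision(**json.loads(line)) for line in f if line.strip()]


def build_prompt(log_path: Path, question: str, max_entries: int = 50) -> str:
    """Prepend the most recent decisions to the question before it goes to a model."""
    decisions = load_decisions(log_path)
    recent = decisions[-max_entries:] if max_entries > 0 else []
    lines = []
    for d in recent:
        line = f"- [{d.date}] {d.topic} ({d.status}): {d.summary}"
        if d.rationale:
            line += f" Why: {d.rationale}"
        lines.append(line)
    memory = "\n".join(lines) if lines else "(no recorded decisions)"
    return f"Project decisions so far:\n{memory}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Demo with placeholder entries; the contents are invented for illustration.
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "decisions.jsonl"
        record_decision(log, Decision("2024-01-01", "feature #17", "postponed",
                                      "placeholder summary", "placeholder reason"))
        print(build_prompt(log, "Is feature #17 done?"))
```

The point of the sketch is that the lookup happens automatically on every question instead of relying on someone remembering where a decision was written down. A real version would need retrieval or summarization once the log outgrows the context window.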
Am I the only one running into this once a project becomes large enough?