The biggest bottleneck in long-term AI coding isn't intelligence. It's memory

After several months of using ChatGPT and Codex on the same software project, I think I’ve found the biggest missing feature for long-term development work.

Not a smarter model.

Not a larger context window.

A project memory.

At this stage of the project, the bottleneck is rarely code generation. The bottleneck is remembering decisions.

Examples:

  • Why did we introduce LOE tasks?

  • Why was feature #17 postponed?

  • Why does reset planning exist separately from full reset?

  • Which bug was fixed in v0.1.1 and which one was only discussed?

Humans solve this with documentation:

  • Backlog

  • Release notes

  • Architecture decisions

  • Known issues

  • Design rationale

The interesting part is that the AI often helps create those documents, but it cannot reliably use them as its own long-term memory.

So the workflow becomes:

“We already discussed this.”

“Yes, but where?”

Then we spend tokens rediscovering a decision that already exists somewhere.

What’s funny is that on my project, the AI is often extremely good at finding bugs, tracing dependencies through dozens of modules, or auditing complex code paths.

But it can still forget that a feature was completed last week.

Not because it isn’t intelligent.

Because intelligence and memory are different things.

If I had to choose between:

  • a model that is twice as smart

  • a persistent project memory that the model automatically reads before answering

then for software projects, I’d take the second option without hesitation.
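
To make the second option concrete, here is a minimal sketch of the idea. It is not how ChatGPT or Codex work today. It assumes a hypothetical `memory/` folder in the repository with one subfolder per document type (for example `decisions/`, `releases/`, `known-issues/`) and one Markdown file per entry whose first line is the title. The names `MemoryEntry`, `load_memory`, and `build_prompt` are made up for illustration.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class MemoryEntry:
    """One remembered fact: a decision, a release note, a known issue."""
    kind: str   # name of the subfolder, e.g. "decisions" (hypothetical layout)
    title: str  # first line of the file, without a leading "#"
    body: str   # remaining text, collapsed to a single line


def load_memory(root: Path) -> list[MemoryEntry]:
    """Read every Markdown file found one level below root."""
    entries = []
    for path in sorted(root.glob("*/*.md")):
        text = path.read_text(encoding="utf-8").strip()
        title, _, body = text.partition("\n")
        entries.append(
            MemoryEntry(
                kind=path.parent.name,
                title=title.lstrip("# ").strip(),
                body=" ".join(body.split()),
            )
        )
    return entries


def build_prompt(question: str, memory: list[MemoryEntry]) -> str:
    """Put the project memory in front of the question before it reaches the model."""
    lines = ["Project memory (read before answering):"]
    for entry in memory:
        lines.append(f"- [{entry.kind}] {entry.title}: {entry.body}")
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

Calling `build_prompt(question, load_memory(Path("memory")))` would produce a prompt that always starts with the recorded decisions. A real version would need to select only the relevant entries once the memory outgrows the context window. This sketch simply includes everything.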

Am I the only one running into this once a project becomes large enough?

Thanks for sharing your feedback, @T.Maillet.

As projects grow, keeping track of decisions, completed work, and design rationale can become just as important as code generation itself. The examples you shared help illustrate that challenge well.

I've shared this feedback with the team for consideration.