Does OpenAI have seemingly infinite context?

The other day, I opened a long chat that I'd been having with GPT-4 for a few months now, related to a project I was working on, where I brainstormed back and forth with it.

I asked it to write up every question I ever asked, and it did, verbatim.

If it could do that, why does the API have such a short 16k-token limit? That's one thing.

But also, why can't OpenAI offer a Conversation API, where we create a conversation and send only the latest user message, while it handles the chat completion PLUS history storage? That is how I initially expected the API to work, but it didn't, and over time I learned to live with it.
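To make the idea concrete, here's a rough sketch of what such a wrapper might look like if you built it client-side today. The `Conversation` class and `chat_completion` callback are hypothetical names, not any real OpenAI endpoint: the caller sends only the latest user message, and the wrapper stores the full transcript and trims the oldest turns before each request so it fits the context window.

```python
class Conversation:
    """Hypothetical sketch: server-side conversation state, done client-side.

    `chat_completion` stands in for a real API call; `max_history` is a
    crude stand-in for a proper token budget.
    """

    def __init__(self, chat_completion, max_history=20):
        self.chat_completion = chat_completion
        self.max_history = max_history
        self.messages = []  # full stored transcript, never discarded

    def send(self, user_message):
        self.messages.append({"role": "user", "content": user_message})
        # Trim the oldest turns so the request fits the context window;
        # a real implementation would count tokens, not messages.
        window = self.messages[-self.max_history:]
        reply = self.chat_completion(window)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


# Demo with a fake completion function, just to show the flow:
echo = lambda msgs: f"({len(msgs)} msgs in window) you said: {msgs[-1]['content']}"
conv = Conversation(echo, max_history=3)
for q in ["first", "second", "third"]:
    print(conv.send(q))
```

The point is that none of this logic needs to live on my side: a hosted Conversation API could keep `messages` itself and decide how to fit old turns into the window (trimming, summarizing, or whatever ChatGPT apparently does).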

I want to bring this up, though. Shouldn't we expect something like this in the future?