In long-term creative and technical workflows, GPT reprocesses the full message history with every user prompt. This leads to:
- Excessive token usage
- Higher latency
- Faster quota exhaustion
- Loss of context in new sessions
Proposed Feature: Persistent Memory Anchors
Introduce PMAs: reusable, user-defined context blocks that are stored once and referenced by ID, rather than reprocessed with every prompt.
They could include:
- Style guides
- Story timelines
- Character profiles
- Custom tone modules
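The anchor concept above can be approximated client-side today. A minimal sketch of what a PMA record and registry might look like (all names and the registry shape are hypothetical, not an existing API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MemoryAnchor:
    """A reusable, user-defined context block identified by a stable ID."""
    anchor_id: str
    kind: str      # e.g. "style_guide", "timeline", "character", "tone"
    content: str   # the full text the model should treat as background

# Hypothetical registry; in a real system this would live server-side,
# so the content is stored once and resolved by ID on each request.
registry: dict[str, MemoryAnchor] = {}

def register(anchor: MemoryAnchor) -> str:
    """Store an anchor and return its ID for later reference."""
    registry[anchor.anchor_id] = anchor
    return anchor.anchor_id

register(MemoryAnchor("pma-style-01", "style_guide",
                      "Terse, present-tense prose; no passive voice."))
```

The key design point is that the ID is stable across sessions, so any new conversation can attach the same block without retransmitting it.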
Benefits
- Drastically reduced token costs in long projects
- Improved performance and response time
- Consistent voice and character behavior across sessions
- Efficient, scalable workflows for novel writing, technical documentation, etc.
Example Use Case
A writer defines a PMA containing key plot points and narrative tone.
Instead of reloading this context in every session, the system silently references the PMA by ID.
Writing remains coherent and consistent, without extra prompt engineering.
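The flow in this use case can be sketched as follows. Assume a hypothetical server-side anchor store: the full block is transmitted once when the anchor is established, and later turns send only a short reference that the server expands internally (the `[CONTEXT]`/`[REF]` markers are illustrative, not a real protocol):

```python
# Hypothetical anchor store; in practice this would be server-side.
anchors = {
    "pma-plot-01": ("Act 1: the heist fails. Act 2: the betrayal is revealed. "
                    "Tone: noir, first person, present tense."),
}

def build_prompt(user_msg: str, anchor_id: str, first_turn: bool) -> str:
    """Expand the anchor fully on the first turn; reference it by ID after."""
    if first_turn:
        return f"[CONTEXT {anchor_id}]\n{anchors[anchor_id]}\n[/CONTEXT]\n{user_msg}"
    # Later turns: a few tokens instead of the whole block.
    return f"[REF {anchor_id}]\n{user_msg}"

full = build_prompt("Draft the opening scene.", "pma-plot-01", first_turn=True)
ref = build_prompt("Continue the scene.", "pma-plot-01", first_turn=False)
```

Every turn after the first pays only for the reference, which is where the token savings in long projects would come from.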
This aligns with both user-experience and computational-efficiency goals.
Thanks for considering this idea!