Persistent Memory Anchors (PMA) – Token Optimization for Long Projects

In long-term creative and technical workflows, GPT reprocesses the full message history with every user prompt. This leads to:

  • Excessive token usage
  • Higher latency
  • Faster quota exhaustion
  • Loss of context in new sessions

:light_bulb: Proposed Feature: Persistent Memory Anchors

Introduce PMAs: reusable, user-defined context blocks that are stored once and referenced by ID, rather than being resent and reprocessed with every request (a rough sketch of one possible shape follows the list below).

They could include:

  • Style guides
  • Story timelines
  • Character profiles
  • Custom tone modules
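
To make the idea concrete, here is a rough sketch of what defining and referencing an anchor could look like. This is purely illustrative: the `memory_anchors` endpoint and parameter are the proposed feature and do not exist today; the only real pieces are the current OpenAI Python SDK calls they are grafted onto.

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical sketch only -- memory_anchors is the proposed feature, not a real endpoint.
# 1. Define the anchor once; the server stores it for cheap reuse later.
anchor = client.memory_anchors.create(          # proposed endpoint (does not exist today)
    name="noir-novel-style",
    content=open("style_guide.md").read(),      # style guide, timeline, character profiles...
)

# 2. Later requests reference the anchor by ID instead of resending its full text.
response = client.chat.completions.create(      # real endpoint in today's SDK
    model="gpt-4o",
    memory_anchors=[anchor.id],                 # proposed parameter (does not exist today)
    messages=[{"role": "user", "content": "Draft the opening of the next chapter."}],
)
print(response.choices[0].message.content)
```

The key point is that the anchor's full text is transmitted once, while every later request carries only a short ID.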

:white_check_mark: Benefits

  • Drastically reduced token costs in long projects (rough arithmetic after this list)
  • Improved performance and response time
  • Consistent voice and character behavior across sessions
  • Efficient, scalable workflows for novel writing, technical documentation, etc.
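
To put a rough number on the first benefit, here is a back-of-the-envelope comparison. The figures are assumptions chosen purely for illustration, not measurements.

```python
# Illustrative arithmetic only -- the numbers below are assumptions, not measurements.
anchor_tokens = 3_000   # assumed size of a style guide / timeline context block
turns = 50              # assumed number of prompts in one long writing session

resent_every_turn = anchor_tokens * turns   # block resent with every prompt: 150,000 tokens
referenced_by_id = anchor_tokens            # block transmitted once, then referenced by ID

print(f"Resent every turn: {resent_every_turn:,} tokens")
print(f"Referenced by ID:  {referenced_by_id:,} tokens (plus a tiny ID per request)")
```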

:test_tube: Example Use Case

A writer defines a PMA containing key plot points and narrative tone.
Instead of pasting this context back in at the start of every session, they reference the PMA by ID and the system applies it in the background.
Writing remains coherent and consistent, with no extra prompt engineering.

This would serve both user experience and computational efficiency.

Thanks for considering this idea!