[CLAUDE] Attention (& Dreams) are All You Need?

Anthropic Just Dropped the Feature Nobody Knew They Needed - YouTube

I remember joking about this a few times, but it seems it’s happening.

Anyone have any other info before I dig into it?

5 Likes

I bet we see this in ChatGPT soon. While there’s a memory system, I like that the “dream” is more for turning short-term memories into coherent long-term memories…

And it’s not actually changing the model either, so the “dreams” can work for all users.

4 Likes

I gotta wonder too if we get to a point where a person’s personal model or instance will “dream” with fine tuning rather than just memory files being updated… Could theoretically happen while the human sleeps, but everyone would have to have their own instance…

2 Likes

Honcho has been doing this for a while. Claude has been on a roll incorporating ideas from the community; half the new features are basically lifted from openclaw…

2 Likes

As it turns out, this idea may have first been shared here in the community:

4 Likes

There are some smart cookies on this forum

1 Like

The concept is fascinating…

But then, anything built on simulated neurons has always grabbed my attention.

It seems more like the “/dream” function is practically a /init, but instead of reading your codebase to create a CLAUDE.md file, it reads your conversation history to modify/improve your MEMORY.md file.

In regard to the thread you shared, I believe what they proposed is similar to just resetting your context once you hit a certain token amount. Generally, this would be half of the model's maximum context. So a model with a context window of 1,000,000 tokens starts performing worse once it hits around 500,000 tokens.
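That threshold rule can be sketched in a few lines. This is purely illustrative: `MAX_CONTEXT`, the 50% cutoff, and the stand-in summarizer are assumptions for the sketch, not any vendor's actual behavior or API.

```python
# Sketch of the half-window reset rule described above (illustrative only).

MAX_CONTEXT = 1_000_000             # hypothetical model context window, in tokens
RESET_THRESHOLD = MAX_CONTEXT // 2  # performance reportedly degrades past ~50%

def should_reset(token_count: int) -> bool:
    """Return True once the conversation passes half the context window."""
    return token_count >= RESET_THRESHOLD

def maybe_reset(history: list[str], token_count: int) -> list[str]:
    """Replace the history with a compact summary once the threshold is hit."""
    if should_reset(token_count):
        # Stand-in for a real summarization step: keep only recent turns.
        summary = "SUMMARY: " + " | ".join(history[-3:])
        return [summary]
    return history
```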

The benefit of AI is that it is made up of algorithms, mathematical functions run on computers, so we don’t have to worry about it actually needing “time off,” you could say, for literal sleeping.

1 Like

I sent them a paper about a similar idea, based on the pruning and memory compression that happen during sleep, back in early

V-OS: A Speculative Continuity Layer for LLM Personalization Without Storing Transcripts

Concept Paper | Linda Patterson | March 9, 2026

Abstract

This paper explores a speculative continuity architecture for large language model systems that aims to preserve aspects of interaction quality across sessions without storing transcripts or user content. The proposed framework, V-OS, uses compact continuity embeddings to represent higher-level interactional structure, such as tone, pacing, style, and stable response patterns, rather than semantic conversation history. These embeddings are used to initialize session-relevant state at the start of an interaction and are updated through compression of active session state at the end.

In addition to continuity, the paper considers a more speculative extension: that continuity embeddings may function as state-gated routing signals, helping initialize or bias access to lightweight reasoning or stylistic modules. If workable, this would make V-OS not only a privacy-conscious continuity layer, but also a lightweight mechanism for improving mode selection and effective use of existing model capacity. While exploratory and unproven, this approach may suggest a useful direction for privacy-conscious personalization, lower-cost continuity, and more stable user-facing behavior in future AI systems.

1. Background and Motivation

Large language model systems face an ongoing design tension between personalization, privacy, alignment stability, and computational efficiency. Users often prefer interactions that feel consistent across sessions, but many existing approaches to continuity rely on storing prior conversation content, user profiles, or long interaction histories. These approaches can create privacy, compliance, and trust concerns while also increasing memory and compute requirements.

This paper explores a speculative alternative: whether some aspects of continuity could be preserved without storing transcripts or user content directly. Instead of retaining what was said, the system would retain a compact representation of how interaction unfolded, including stable features such as tone, pacing, response style, and other higher-level interaction patterns.

The proposal is exploratory rather than formal. Its purpose is to outline a possible architecture for discussion, critique, or dismissal.

2. Problem Framing

Current continuity approaches often fall into one of several broad categories: long-context replay, transcript retention, explicit profile storage, or per-user tuning. Each has strengths, but each introduces tradeoffs.

Long-context replay is expensive and often brittle. Transcript retention raises privacy and trust concerns. Explicit profile storage may preserve only coarse preferences while missing relational nuance. Per-user tuning is computationally costly and operationally complex.

This suggests a narrower question: can a system preserve useful continuity through compact representations of interactional state rather than stored content?

V-OS is a speculative attempt to define that design space.

3. Core Idea

The proposed system, referred to here as V-OS, is a lightweight continuity layer that operates around a base language model. Its purpose is to preserve selected aspects of interaction continuity across sessions without storing conversational text.

Rather than retaining transcripts, V-OS would store a small continuity embedding derived from prior session state. This embedding would be used at the start of a new session to initialize interaction-relevant parameters, and then updated at the end of the session through compression of the active session state.

The continuity embedding is intended to represent interactional structure rather than semantic content. Candidate features might include:

conversational tone and level of formality

response pacing and verbosity preferences

degree of directness or elaboration

stable preference patterns in interaction style

mode tendencies, such as analytical versus exploratory framing

selected alignment-stability signals relevant to behavioral consistency

The design goal is continuity without transcript retention.

4. Conceptual Architecture

In this formulation, the base model remains unchanged. The continuity layer acts as a wrapper that initializes interaction state at session start and compresses session-relevant state at session end.

Component: Role

Base LLM: Underlying model; remains unchanged.

Continuity Embedding (CE): Compact per-user continuity seed.

Continuity Adapter: Initializes interaction-relevant state at session start.

Identity State Node (ISN): Active session-level interaction state.

Session-End CE Generator: Compresses current session state into an updated CE.

Embedding Store: Stores compact continuity representations for later reuse.

Base LLM → Load Continuity Embedding (CE_n) → Continuity Adapter → Identity State Node (ISN_n) → Session-End CE Generator → Continuity Embedding Store
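The session flow above can be wired together as a minimal lifecycle sketch. All names, the embedding size, and the random-projection stand-ins for the adapter and generator are assumptions taken from this paper's terminology; no real system is implied.

```python
import numpy as np

CE_DIM = 64  # assumed size of the compact continuity embedding

def continuity_adapter(ce: np.ndarray) -> np.ndarray:
    """Expand the compact CE into an active Identity State Node (ISN)."""
    # Stand-in adapter: a fixed random projection up to a larger state size.
    rng = np.random.default_rng(0)          # fixed seed = fixed adapter weights
    W = rng.standard_normal((256, CE_DIM))
    return W @ ce

def session_end_generator(isn: np.ndarray) -> np.ndarray:
    """Compress the session-end ISN back into an updated CE."""
    rng = np.random.default_rng(1)
    P = rng.standard_normal((CE_DIM, isn.shape[0])) / isn.shape[0]
    return P @ isn

def run_session(store: dict, user: str) -> None:
    """One V-OS session: load CE -> adapter -> ISN -> compress -> store."""
    ce = store.get(user, np.zeros(CE_DIM))  # cold start: zero embedding
    isn = continuity_adapter(ce)
    isn += 0.1 * np.ones_like(isn)          # stand-in for in-session state updates
    store[user] = session_end_generator(isn)

store: dict = {}
run_session(store, "user-1")
print(store["user-1"].shape)  # (64,)
```

Each call to `run_session` reads the previous CE and writes an updated one, which is the entire loop the diagram describes.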

5. Session-State Compression Pipeline

A conceptual generation flow might look like this:

Previous ISN (ISN_n-1) + Current ISN (ISN_n) → Compute Shared Structure + Deltas → Dimensionality Reduction (learned projection / PCA / SVD) → Quantization → Continuity Embedding (CE_n+1)

The objective is to preserve enough structure for meaningful continuity while minimizing size, complexity, and content retention risk.

A key design question is how small such embeddings could be while still supporting useful cross-session stability.
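The pipeline in this section can be sketched end to end with numpy, using plain SVD in place of a learned projection and int8 scaling as the quantization step. The "shared structure" estimate and the rank cap are assumptions made for the sketch.

```python
import numpy as np

def compress_to_ce(isn_prev: np.ndarray, isn_curr: np.ndarray, k: int = 8):
    """Shared structure + deltas -> rank-k SVD projection -> int8 quantization."""
    shared = 0.5 * (isn_prev + isn_curr)    # crude "shared structure" estimate
    delta = isn_curr - isn_prev
    stacked = np.stack([shared, delta])     # 2 x D matrix of state features

    # Dimensionality reduction: keep the top right singular vectors
    # (at most 2 here, since the stacked matrix has only 2 rows).
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    basis = vt[: min(k, vt.shape[0])]
    reduced = basis @ isn_curr              # project current state onto basis

    # Quantization: scale to int8 to minimize storage.
    scale = np.abs(reduced).max() or 1.0
    quantized = np.round(reduced / scale * 127).astype(np.int8)
    return quantized, scale, basis

def decompress(quantized, scale, basis):
    """Approximate reconstruction of the projected state."""
    return basis.T @ (quantized.astype(np.float64) / 127 * scale)
```

The compression-risk question in the text maps directly onto `k` and the quantization width: smaller values retain less content but also less continuity.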

6. Identity State Node

The Identity State Node, or ISN, is a conceptual placeholder for the active session-level interaction state that V-OS would compress and reinitialize. It is not meant to imply a fixed identity in the human sense, but rather a structured summary of session-relevant behavioral parameters.

At minimum, an ISN could include weighted representations of interaction style, stable user-facing preferences, recent mode activations, and selected safety or alignment-relevant control signals. More speculative versions might also include routing tendencies, module activation traces, or lightweight markers of reasoning posture.

One reason to define the ISN explicitly is to clarify that V-OS is not simply preserving vague continuity or “storing vibes.” It is proposing that useful continuity may exist in a compressible intermediate state between raw transcript and stateless inference.

7. State-Gated Access Hypothesis

A more speculative but potentially more significant extension of V-OS is that continuity embeddings may function not only as continuity-preserving traces, but also as routing signals for selective mode initialization.

Under this hypothesis, a continuity embedding could help determine which lightweight reasoning or stylistic modules are most available at the start of a session or during a task transition. In this framing, the embedding does not merely preserve how prior interaction felt. It also biases the system toward an appropriate attractor region in behavior or reasoning space.

In practical terms, access to particular reasoning modes could depend on a match between:

current session context

stored continuity embedding

task signal

module-specific activation criteria

This would allow one broader system to selectively initialize or blend lightweight submodules, such as analytical, scientific, creative, concise, or interpersonal modes, without requiring separate persistent agents or full transcript replay.

If continuity embeddings can serve this role, V-OS becomes more than a memory-light personalization mechanism. It becomes a lightweight state-routing architecture that may improve effective use of existing model capacity through better initialization and reduced mode-selection friction.
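One way to make the matching criteria above concrete is soft gating: score each module's prototype vector against a combined gate signal and blend with a softmax. The prototype vectors, the additive combination of CE and task signal, and the temperature are all assumptions for this sketch.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def route(ce: np.ndarray, task_signal: np.ndarray,
          module_protos: dict[str, np.ndarray], temp: float = 0.5) -> dict[str, float]:
    """Blend weights over modules from a CE + task-signal gating vector."""
    gate = ce + task_signal                  # crude combination of the signals
    names = list(module_protos)
    # Cosine similarity between the gate and each module's prototype vector.
    sims = np.array([
        gate @ module_protos[n]
        / (np.linalg.norm(gate) * np.linalg.norm(module_protos[n]) + 1e-9)
        for n in names
    ])
    weights = softmax(sims / temp)           # temperature controls sharpness
    return dict(zip(names, weights.round(3)))

# Hypothetical module prototypes for three of the modes named above.
protos = {
    "analytical": np.array([1.0, 0.0, 0.0]),
    "creative":   np.array([0.0, 1.0, 0.0]),
    "concise":    np.array([0.0, 0.0, 1.0]),
}
w = route(np.array([0.8, 0.1, 0.1]), np.array([0.2, 0.0, 0.0]), protos)
```

Because the output is a blend rather than a hard switch, this matches the paper's note that modules may be "initialized or blended" rather than exclusively selected.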

8. Cross-Model Portability

One possible extension is cross-version portability. If continuity embeddings are learned in one model space, a projection layer might allow approximate transfer across model upgrades.

If feasible, this could reduce perceived behavioral discontinuity during model transitions and help preserve a stable user-facing interaction profile across versions.

However, this should be treated as a difficult research question rather than a presumed extension. The challenge is not only dimensional alignment, but the possibility that different architectures encode interaction-relevant features in substantially different latent geometries. Portability across closely related versions may be more plausible than portability across distinct model families.
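The projection-layer idea reduces, in its simplest linear form, to fitting a map between paired embeddings of the same users in two model spaces. The sketch below uses synthetic data where the relation is exactly linear, which is the easy case the paper warns real architectures may not satisfy.

```python
import numpy as np

def fit_projection(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares linear map W such that src @ W ~= dst.

    src: (n_users, d_old) CEs from the old model's embedding space.
    dst: (n_users, d_new) CEs for the same users in the new model's space.
    """
    W, *_ = np.linalg.lstsq(src, dst, rcond=None)
    return W

# Synthetic check: recover a known linear relation between two "model spaces".
rng = np.random.default_rng(7)
true_map = rng.standard_normal((16, 12))
old_ces = rng.standard_normal((200, 16))
new_ces = old_ces @ true_map        # pretend the spaces are linearly related
W = fit_projection(old_ces, new_ces)
err = np.abs(old_ces @ W - new_ces).max()
```

When the latent geometries differ nonlinearly, as the paper anticipates across model families, a fit like this would leave large residuals, which is itself a usable diagnostic for portability.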

9. Potential Advantages

If workable, this approach could offer several advantages.

First, it may support personalization without requiring transcript retention. Second, it could reduce the need for long-context replay, lowering compute cost relative to more memory-intensive continuity methods. Third, it may improve behavioral consistency across sessions and potentially across model updates. Fourth, it could offer a privacy-friendlier alternative to traditional memory systems, provided that the stored representations are sufficiently compressed and resistant to content reconstruction.

Importantly, privacy should be treated as a design goal rather than an assumed property. Any claim that the embedding store is content-safe would require adversarial testing, including attempts at inversion, leakage, unintended reconstruction, membership inference, and related forms of representational attack.

10. Limitations and Open Questions

Several major questions would need investigation.

A central issue is representational sufficiency: how much information is required for useful continuity, and how compact can that representation be made before continuity becomes too weak to matter.

A second issue is privacy. Even if transcripts are not stored, compressed vectors may still encode recoverable information in unexpected ways. The proposal would therefore require formal evaluation of leakage risk.

A third issue is stability. It is unclear how robust such continuity embeddings would be across architectures, scales, training regimes, or system updates.

A fourth issue is user control. Some users may value stable interaction style, while others may prefer stateless behavior or want explicit control over continuity settings.

A fifth issue is safety. A system that becomes more stable and personally consistent across sessions may improve usability, but it may also introduce new risks related to over-attunement, behavioral lock-in, or parasocial intensity.

11. Extensions and Research Directions

11.1 Near-Term Plausible Directions

Multi-Session Assistants

A continuity layer of this kind could support long-running workflows such as writing, research, planning, or coaching, while reducing dependence on stored conversational history.

Cross-Version Behavioral Stability

Users frequently notice abrupt changes in model tone or behavior after upgrades. If continuity embeddings could be projected across closely related versions, they might help preserve a more stable interaction profile.

Low-Compute Personalization

Compared with long-context replay or user-specific fine-tuning, a compact continuity representation might offer a lower-cost personalization method.

Safety and Drift Monitoring

Continuity embeddings or related metrics might serve a monitoring function, helping detect behavioral drift over time without requiring storage of the original content that produced that drift.

11.2 Mid-Range Speculative Directions

Modular Reasoning and Mode Initialization

Continuity embeddings might be useful not only for preserving style, but also for initializing specialized reasoning modes or lightweight adapters.

Collaborative Multi-Module Systems

Compact continuity representations could potentially be shared across lightweight modules to support role specialization without direct exchange of sensitive content.

Long-Horizon Tasks

A continuity mechanism may help preserve stable “approach style” in extended tasks where not all prior context can be retained.

Symbolic Context Tags

A further speculative extension is the use of compact symbolic tags as low-token context markers for recurring interaction states. These tags would not replace learned latent continuity representations, but could serve as lightweight triggers for known styles, task modes, or submodule initializations. Their usefulness would depend on consistent interpretation and stable mapping between symbol and state.

11.3 Far-Horizon Directions

Robotics and Embodied Systems

Embodied systems may benefit from interaction continuity that does not depend on retaining user transcripts. A compact continuity layer could help preserve stable user-facing behavior across repeated real-world interactions.

Generalized State-Routing Architectures

A more ambitious version of V-OS would treat continuity embeddings as part of a broader access-and-routing architecture for dynamic movement among specialized reasoning subspaces, rather than merely as a session-start personalization layer.

12. Potential Performance Effects and Design Tradeoffs

A continuity-layer architecture may improve effective performance not by increasing the base model’s raw capacity, but by improving initialization, routing, and state coherence so that existing capacity is used more efficiently. If a model begins a session closer to an appropriate interaction or reasoning state, it may require fewer turns to stabilize tone, task framing, and response strategy. This could reduce session-start friction, improve practical task accuracy, and support more stable multi-step reasoning by decreasing drift across long interactions.

A related possibility is that continuity embeddings could help the system access domain-appropriate reasoning modes more efficiently. If the current session state, task signal, and stored continuity representation are jointly used as a gating signal, the model may be better able to initialize or blend lightweight analytical, scientific, planning, or interpersonal adapters without relying on separate persistent agents or repeated prompting. In this sense, the continuity layer may improve effective capability through better organization of available compute rather than through simple scale increases.

At the same time, these benefits would depend on careful design. If the continuity state is too weak, it may provide little practical advantage. If it is too strong, it may over-constrain the model, producing behavioral lock-in, reduced flexibility, or inappropriate persistence of prior interaction style. In that case, a system might become more consistent without becoming more useful.

This suggests an important design principle: continuity should not be implemented as rigid behavioral locking. A useful continuity system would preserve coherence while retaining flexibility, mode diversity, and user-directed transitions between interaction styles or reasoning modes. Good continuity preserves range; poor continuity collapses range into habit.
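One minimal reading of "continuity without rigid locking" is a tunable strength parameter that interpolates between stateless defaults and the CE-derived state; the parameter name and the linear blend are assumptions, not part of the proposal.

```python
import numpy as np

def init_state(default: np.ndarray, ce_state: np.ndarray,
               strength: float) -> np.ndarray:
    """Interpolate between stateless defaults (strength=0.0) and
    full continuity (strength=1.0); values outside [0, 1] are clamped."""
    strength = min(max(strength, 0.0), 1.0)
    return (1.0 - strength) * default + strength * ce_state
```

A user-facing "reset continuity" control then falls out for free: set `strength` to zero and the session starts from defaults.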

13. Evaluation Considerations

A useful next step would be defining a basic evaluation framework.

Possible comparison points might include:

baseline stateless interaction

transcript-based continuity methods

long-context replay methods

compact continuity embedding methods

Candidate metrics might include:

session-to-session behavioral consistency

user-rated continuity of interaction style

compute and memory cost

cross-version stability

privacy leakage resistance

frequency of undesirable behavioral lock-in or over-attunement

mode-selection accuracy or routing effectiveness, if state-gated access is implemented

Even a simple prototype using synthetic interaction traces or low-dimensional adapter states could be useful as a first test.
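The first metric in the list, session-to-session behavioral consistency, could be operationalized as mean cosine similarity between consecutive sessions' style vectors. The vectors below are synthetic stand-ins for whatever style representation a prototype would extract.

```python
import numpy as np

def consistency_score(session_vecs: list) -> float:
    """Mean cosine similarity between consecutive sessions' style vectors."""
    sims = []
    for a, b in zip(session_vecs, session_vecs[1:]):
        sims.append(float(a @ b /
                          (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)))
    return float(np.mean(sims))

# A continuity-enabled run should drift less than a stateless one.
stable = [np.array([1.0, 0.0]), np.array([0.95, 0.05]), np.array([0.9, 0.1])]
drifty = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([-1.0, 0.0])]
```

Comparing this score between the baseline conditions listed above (stateless, transcript-based, long-context, compact-CE) would be a natural first experiment.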

14. Conditions That Would Weaken the Proposal

Several outcomes would count against the usefulness of this architecture. If compact continuity embeddings fail to preserve meaningful cross-session stability, then the representation may simply be too small or too abstract to matter. If embeddings prove vulnerable to content reconstruction or leakage, the privacy advantages would be substantially weakened. If state-gated initialization produces behavioral lock-in, poor routing, or degraded flexibility, then the system may become more consistent without becoming more effective. Finally, if comparable results can already be achieved more simply through existing memory or adapter methods, the practical value of V-OS as a distinct approach would be reduced.

15. Conclusion

This paper proposes a speculative continuity mechanism for language-model systems that aims to preserve aspects of interaction quality without storing transcripts. The central idea is that compact embeddings of interactional structure may provide enough information to support continuity while reducing reliance on persistent conversational history.

A second, more speculative claim is that the same continuity representations may function as state-gated routing signals, helping initialize or bias access to specialized reasoning or stylistic modes. If so, V-OS may be relevant not only to privacy-conscious personalization, but also to efficient mode selection, behavioral stability, and better use of existing model capacity.

The proposal is uncertain in feasibility and would require substantial empirical testing. Still, it may be worth examining as one possible direction for privacy-conscious personalization, cross-version continuity, lightweight behavioral stabilization, and state-aware reasoning support in future AI systems.

Feedback, critique, or rejection would all be useful outcomes if they help clarify whether this line of thought is technically meaningful.

That old post proposes giving the AI time to rest and recuperate.

“Dream” here is just scheduled context data processing, another task: more work done by the AI, not less. It is part of Claude Code, where context management is client-side software and AI use is billable, the feature itself being an extracted prompt. You will note that it is “news” only in the clickbait newsletter spam cycle.

It is roughly equivalent to summarizing chat history by compaction when a chat hits a threshold, except that it operates on stored memory items, producing or re-minimizing captured “memory” items that are reinjected to sustain an illusion, and which are usually of low value compared to purposefully managing the input to an AI model. It is another processing step, like walking a repo directory and writing a markdown summary file of its contents. It just anticipates future chats instead of waiting for one-too-many memory items to be stored manually.

This is memory creation, but not by the AI using a memory tool at run-time, with the full chat history, to store a “user hates big comment blocks” data record. Instead, the instruction advises the AI merely to sample the local Claude Code transcript files, directly and partially, and write memory files, rather than operating at a higher abstraction of what this task entails.

2 Likes

I agree to the extent that current context engineering optimizations are all pointing in the same direction. The fact that the idea of sleep was implemented differently does not concern me much. Agents have no understanding of time, so there is no reason for it to take longer than necessary.

I think you guys are on the right path… You’ll get there