When you say “why do you let your language dictate the way you think?”, do you mean CCore or English? Taken as a pair, the answer is the reverse. If I said the smallest tile of the fractal comes from the code at compile time, that is not the language doing the thinking; it is the compiler crystallising what you thought the language should be thinking. This propagates through the system, and I said it is in CNP (COSMOS Notational Protocol). The acronym COSMOS comes from “code orthogonally structured in mathematic object space”, because in CCore orthogonality is in the language as well. That means no runtime gymnastics, no mutex, no wait: parallel in theory is parallel at execution. As I said before, CNP encodes the same bytes as a vector (salience) projection, a graph and a DAG, and dovetails through the system. So yes, the language is the starting point of the full systems thinking as well as of the audit chain. But that fractality helps you check or audit whether the architect’s intent meets the thinking in the language. So the language manifests the thinking, and this is ‘dovetailed’ through the full system. If it is the English language you are talking about, then the problem is the inverse: the language encapsulates the thinking that the beholder derives. So it is a proposition of relativity.
English, because of how the logical patterns are too “inside the standard box” of language expression (this often happens during your thinking process when you convert the concepts you operate with into language too early, before they are mature enough to be defined better), which made me think either you are not human or you are heavily leveraging a “grammar book” (LLM) to craft your answers.
When you say “often happens during your thinking process when you convert the concepts you operate with into language too early, before they are mature enough to be defined better”: I tried to give you a conceptual-level paradigm comparison. But what you state above seems to be a newer definition of, or a more elegant attempt at defining, confabulation, which I hope wasn’t there in the language.
Yeah, that’s the thing. An LLM turns vector space into a line of words. We do the same when we turn an idea into a sentence. Confabulation is running that conversion too early, while the idea is still missing pieces. The words come out clean and hide that the concept wasn’t ready.
The machine is built to give you an answer, and that pressure is what pushes it to convert too early, which is where the confabulation comes from. Personally I call it hallucination. The fix is the same for us: when the concept isn’t ready, don’t convert yet. Loop back through it, let it finish, then put it into words.
Or if the conversion already happened, sometimes it’s a good thing to step back and question: was it done too early?
In vibe coding this is exactly the problem when AI starts writing code without having a clear picture of how the whole system should work. And basically that code is based on the average common practice learned by pattern recognition in the training material given to the LLM… (the grammar book on steroids)
I want to step back from the back and forth and place this against where the industry is, because the gap I keep circling only shows up once you give the field credit for what it already got right.
Where the industry is today
Two things came together this past year.
On the building side, spec-driven development mostly won the argument that code is not the source of truth. You write the spec, and the code gets built from it. Microsoft, GitHub, AWS, and most of the coding-agent tools ship a version of this now.
On the running side, agent governance moved just as fast. An agent’s authority has to be granted and provable, not assumed: scoped permissions, a real identity, an audit trail of what it did. The identity vendors say it plainly, and the first national frameworks landed early this year.
https://www.entrust.com/blog/2026/05/ai-agent-authorization-delegation-zero-trust
The field already agrees on a lot of what this thread has been circling. Intent has to be written down. A plausible answer is a long way from a true or authorized one. Drift is real.
The gap I see
Each of these picked something to treat as the thing you can trust, and built everything on top of it. I am not aware of anyone checking whether the thing they picked can actually hold that weight.
Spec-driven development trusts the spec. But a spec is just something a person or a model wrote down, and writing something down does not make it true. The field’s own writers admit specs leave out the reasoning, and that a generated spec can read perfectly while describing a system that does not exist.
Governance trusts identity. But all an agent can ever prove is that it presented a credential. Nothing tells the system who was actually holding that credential at that moment, so “this is agent X” is a guess that falls apart the moment a key is stolen.
And there is a problem sitting above both of them. Most of these tools are trying to catch an agent that grabbed a familiar-looking solution from its training and ran with it before anyone pinned down what the actual problem was. A cleaner spec later cannot undo a wrong turn the agent took before the problem was even stated.
What I think about it
What is missing is a single trustworthy bottom layer that knowledge, authority, and structure can all stand on.
The only things a system can truly know are the things it directly witnessed: that it exists (otherwise it cannot create a record), that something connected to it (connection log record), that some input arrived (request/event log), that some condition was met (code run log).
Those are facts. Everything else should start out as a guess and earn the status of “known” by pointing back to one of those witnessed facts. Identity included: instead of being the foundation, it becomes one more guess resting on what was actually witnessed. Accountability gets easier too, because you can pin responsibility on what the system saw happen instead of on who it assumed was there.
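A minimal sketch of that rule, assuming a tiny in-memory model: `Witnessed`, `Claim`, and `accountable_facts` are hypothetical names for illustration, not part of any tool mentioned in this thread. A claim is "known" only when it points back to something the system logged itself; the claim about who held the credential has nothing to point to, so it stays a guess.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Witnessed:
    """Something the system itself observed and logged."""
    kind: str    # "connection", "input", "condition", ...
    detail: str


@dataclass
class Claim:
    """Anything not directly witnessed, identity included."""
    text: str
    basis: list = field(default_factory=list)  # Witnessed facts it points back to

    @property
    def status(self) -> str:
        # A claim earns "known" only by pointing at witnessed facts.
        return "known" if self.basis else "guess"


def accountable_facts(claim: Claim) -> list:
    """Responsibility attaches to what was seen, not to who was assumed."""
    return [f"{w.kind}: {w.detail}" for w in claim.basis]


# Hypothetical log entry, for illustration only.
seen = Witnessed("input", "request carried a credential issued to agent-X")

presented = Claim("a credential for agent-X was presented", basis=[seen])
holder = Claim("agent-X itself was holding that credential")

print(presented.status, accountable_facts(presented))
print(holder.status)
```

The design choice worth noting is that status is derived from the evidence list rather than stored, so a claim cannot be marked known by assertion alone.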
The other half is that the same pattern repeats at every size. A small task is just a small problem; a whole system is just a big function with smaller ones inside it. So you can describe all of them the same way. You state a problem as the gap between how things are now and how you want them to be, and the solution is the set of steps that close that gap. That same move runs the whole way down, from the top-level problem through the architecture and the individual pieces into the actual code. The catch is that the reasons behind each step only survive the trip down if you write them down as you go. Lose them, and whoever arrives later, human or agent, fills the silence with a guess, and that guess is where things drift.
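One way to picture the repeating shape, as a sketch only: the `Problem` class and the order-entry example are invented for illustration. Every level, from system to code, is the same record of current state, desired state, reason, and smaller problems, so a single recursive walk can find the places where the reason was lost on the way down.

```python
from dataclasses import dataclass, field


@dataclass
class Problem:
    current: str                  # how things are now
    desired: str                  # how they should be
    why: str = ""                 # reason this step exists
    steps: list = field(default_factory=list)  # smaller Problems that close the gap

    def missing_reasons(self, path: str = "root") -> list:
        """Paths of every node whose reason was not written down."""
        found = [] if self.why.strip() else [path]
        for i, step in enumerate(self.steps):
            found += step.missing_reasons(f"{path}.{i}")
        return found


system = Problem(
    current="orders are entered by hand",
    desired="orders arrive through an API",
    why="manual entry causes errors",
    steps=[
        Problem(
            current="no order endpoint",
            desired="POST /orders exists",
            why="the API needs one entry point",
            steps=[Problem(current="no validation", desired="payload is validated")],
        )
    ],
)

print(system.missing_reasons())
```

The node reported here is the one where a later reader, human or agent, would have to guess.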
How I am approaching it right now
Two ends of that chain are working, and there is a piece in the middle I want help thinking through.
At the top, I am building a tool that makes the agent slow down before it writes anything. Before it can propose a solution, it has to state the problem, name the gap between where things are and where they should be, and work through the steps as hypotheses it has to defend, with a separate reasoner keeping it inside the logical rules. In practice it makes the agent think instead of autocompleting from pattern matching (note me calling the LLM a “grammar book” and using “think” here without quotes at the same time, lol; that holds if by thinking you mean reasoning using hard logic rules while operating on information, which is different from “find the most plausible human-like thought chain to prove your current conclusion based on the pattern you’ve spotted”). This one is at the working-prototype stage.
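The gating part of that description could look roughly like the sketch below. It is not the prototype itself: `Draft`, `Hypothesis`, `blockers`, and `may_propose` are hypothetical names, and the separate reasoner that enforces the logical rules is not modelled at all. The sketch only shows the ordering constraint, that a solution cannot be proposed while anything upstream is still blank.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    step: str
    defence: str = ""   # argument for why this step closes part of the gap


@dataclass
class Draft:
    problem: str = ""
    current_state: str = ""
    desired_state: str = ""
    hypotheses: list = field(default_factory=list)


def blockers(draft: Draft) -> list:
    """Everything still missing before a solution may be proposed."""
    issues = []
    if not draft.problem.strip():
        issues.append("problem not stated")
    if not (draft.current_state.strip() and draft.desired_state.strip()):
        issues.append("gap not named")
    if not draft.hypotheses:
        issues.append("no steps proposed as hypotheses")
    issues += [f"undefended step: {h.step}" for h in draft.hypotheses if not h.defence.strip()]
    return issues


def may_propose(draft: Draft) -> bool:
    return not blockers(draft)


draft = Draft(problem="checkout fails under load")
print(blockers(draft))
```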
At the bottom, there is an architecture checker. It watches whether the code still matches the architecture you declared, keeps the reason for each piece written down next to that piece, and is honest about what it cannot check rather than pretending. It runs, and it has already caught real drift in testing that nobody was looking for. Like the other tool, it works but it is not ready to go public yet.
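As an illustration of the idea, and not of the actual checker, here is a small sketch that compares a module's imports against a declared layering using Python's standard `ast` module. The `DECLARED` table, its layer names, and the `check` function are assumptions made up for this example. It keeps the reason next to each rule and reports dynamic imports as something it cannot check instead of silently passing them.

```python
import ast

# Hypothetical declared architecture: layer -> allowed dependencies and the reason.
DECLARED = {
    "ui": {"allowed": {"services"}, "why": "UI goes through services so rules live in one place"},
    "services": {"allowed": {"storage"}, "why": "services own all persistence calls"},
    "storage": {"allowed": set(), "why": "storage must not know about its callers"},
}
DYNAMIC = {"__import__", "import_module"}


def scan_imports(source: str):
    """Return (statically visible top-level imports, count of dynamic imports)."""
    static, dynamic = set(), 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            static.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            static.add(node.module.split(".")[0])
        elif isinstance(node, ast.Call):
            name = getattr(node.func, "id", None) or getattr(node.func, "attr", None)
            if name in DYNAMIC:
                dynamic += 1
    return static, dynamic


def check(layer: str, source: str) -> dict:
    rule = DECLARED[layer]
    static, dynamic = scan_imports(source)
    judged = (static & set(DECLARED)) - {layer}   # only imports of declared layers
    return {
        "drift": sorted(judged - rule["allowed"]),
        "why": rule["why"],
        "cannot_check": f"{dynamic} dynamic import(s)" if dynamic else "",
    }


sample = "import storage\nfrom services import orders\nplugin = __import__('plugins')\n"
print(check("ui", sample))
```

Here the `ui` layer reaching straight into `storage` is reported as drift together with the recorded reason, and the `__import__` call is listed as unverifiable.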
The middle: memory
Memory is the part I am least sure about, so it is where I most want other people’s thinking.
The usual approach keeps a big store of past context and tracks how fresh each piece is. That helps, but it still hands back whatever it retrieved as if pulling something up made it true.
I want memory held to the same rule as everything else. Something you recall is not automatically true. It comes back with a status attached, so it can be a guess, accepted, contradicted, stale, or replaced by something newer, the same way any other claim would. When the system briefs an agent, it says plainly what it actually believes, what is just background, where each piece came from, and whether it is recent enough to rely on for the work at hand. Ranking can suggest what to surface, but it does not get to decide what is true. And forgetting means something stops coming up first, not that it gets deleted, so nothing is ever actually lost.
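A rough sketch of that memory rule, under the assumption of a simple in-memory list: `Memory`, `Status`, `forget`, and `brief` are illustrative names, and the freshness test is a plain age threshold chosen for brevity. Rank decides what gets surfaced, status and freshness decide what is reported as believed, and forgetting only lowers rank.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    GUESS = "guess"
    ACCEPTED = "accepted"
    CONTRADICTED = "contradicted"
    STALE = "stale"
    SUPERSEDED = "superseded"


@dataclass
class Memory:
    text: str
    source: str
    status: Status
    age_days: int
    rank: float = 1.0   # retrieval priority only; says nothing about truth


def forget(item: Memory, factor: float = 0.1) -> None:
    """Forgetting demotes an item; nothing is deleted."""
    item.rank *= factor


def brief(store: list, max_age_days: int, limit: int = 3) -> dict:
    """Surface the top-ranked items, split into beliefs and background."""
    surfaced = sorted(store, key=lambda m: m.rank, reverse=True)[:limit]
    out = {"believed": [], "background": []}
    for m in surfaced:
        fresh = m.age_days <= max_age_days
        entry = {"text": m.text, "source": m.source, "status": m.status.value, "fresh": fresh}
        out["believed" if m.status is Status.ACCEPTED and fresh else "background"].append(entry)
    return out


store = [
    Memory("API uses token auth", "design doc", Status.ACCEPTED, age_days=2, rank=0.6),
    Memory("cache layer is Redis", "chat recall", Status.GUESS, age_days=1, rank=0.9),
    Memory("deploys run nightly", "old runbook", Status.ACCEPTED, age_days=400, rank=0.5),
]
forget(store[2])
print(brief(store, max_age_days=30))
```

The highest-ranked item in this example is still only a guess, so it is briefed as background even though it surfaces first.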
What I am still “stuck” on is how fast to let material drop out of easy reach, and how to keep each recalled item’s status honest without making every lookup slow and expensive.
Closing
I am not aware of anyone else building one trustworthy bottom layer that the spec, the permissions, and the structure all sit on, with memory held to that same rule. If you have seen something like it, point me there, I would rather compare notes than guess.
And if any part of this lands, or sounds wrong, I want to hear it: the bottom layer, the repeating problem-to-code shape, the slow-down tool, the memory model. Tell me how you handle the same things in what you are building, what worked, and what broke. That is what I am here for.
P.S. links are just some samples, not the “final truth”
I think I should reply with a more detailed explanation, as brevity is leading to assumptions of confabulation (condensing a lot is causing information loss). Serge, what you have built is great and does a very good job of letting a disciplined developer like you author intent and architecture and validate that they persisted. I got interested because fractality, and fractalising the issue as the approach, is also one of the principles in my system, although that is probably where the aims and the similarity end.
I am an orthopaedic surgeon and I have no computer science or math training. I rely on AI to actuate my concepts. I spend time on the design phase and converge on several ideas that are in markdown format. Then I convert them into coherent ‘Runbooks’ that achieve stepwise goals towards an end shape or resulting app/component. The Runbooks are generally organised into phased, gated development plans (steps) with a DoD (definition of done) for each gate and a predefined validation matrix to check at each gate, along with an AI introspection audit of completed work using ATAM and the Six Thinking Hats (de Bono) framework, so that we ensure the Pareto optimum for that component has been achieved. If it hasn’t, and that is due to an overlooked item, it auto-triggers an RCA (root cause analysis) loop. If resolution is achieved the Runbook continues; if not, it stops and I get to check it.
This means I actually keep long-horizon Runbooks running while operating (I mainly do shoulder, hip and knee surgeries) and just check between cases or patients. The system supports AI metacognition and under no circumstance should curtail the AI’s intelligence, because the AI will have seen patterns (as part of its training) that I never knew existed, since I am not domain trained. I don’t want to deal with code or the mathematical or algebraic flow: I just want to be the architect. The rest of the system should be automated and give me formal-methods-approximation accuracy without a theorem prover in the loop (that is actually the main product, which I have nearly completed now). Drift detection is a necessary component of my assurance arc and is built in (built in doesn’t mean perfectly built in, but rather approximately so). But my drift system has a different mechanism, and the rest of my assurance system has tiered pressure testing, including quasi-Monte Carlo (Sobol for practicality). Git indiscipline and git anxiety (the agent boxes itself out of a safe merge because of boundary definition issues and syntactic tree cheap visibility issues) is a significant issue. Git, the testing and the rest of the assurance arc are tied to drift, and I will explain why I feel so (again, I am not a software engineer; everything I am saying is based on difficulties faced in my workflow, and my ignorance will definitely manifest):
Drift is the delta between what we thought would exist at a future point in time and the ground reality at that point (it is a point-in-time delta; the ‘why’ of that delta exists in the past and accumulates like microplastics). It can be topological or functional, and often both. If you are a software engineer and your codebase is manageable, I guess you would be able to pinpoint drift easily. But for me, where I don’t even intend to inspect the codebase, the drift exists because:
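The gate loop described above can be sketched as follows. This is an assumed shape, not the actual Runbook engine: `Gate`, `run_runbook`, and the `rca` callback are hypothetical, the validation matrix is reduced to named yes/no checks, and the ATAM and Six Thinking Hats audit is not modelled.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Gate:
    name: str
    checks: dict   # validation matrix: check name -> zero-argument predicate


def run_runbook(gates: list, rca: Callable, max_rca_rounds: int = 1) -> list:
    """Run gates in order; a failed gate triggers RCA, an unresolved one stops the run."""
    log = []
    for gate in gates:
        rounds = 0
        while True:
            failed = [name for name, check in gate.checks.items() if not check()]
            if not failed:
                log.append((gate.name, "passed"))
                break
            if rounds >= max_rca_rounds or not rca(gate, failed):
                log.append((gate.name, "stopped for architect: " + ", ".join(failed)))
                return log
            rounds += 1   # RCA claimed a fix, so the gate is re-checked
    return log


state = {"schema_ok": False, "tests_ok": True}


def fix_schema(gate, failed):
    """Stand-in RCA loop that repairs the one known issue."""
    state["schema_ok"] = True
    return True


gates = [
    Gate("design", {"schema defined": lambda: state["schema_ok"]}),
    Gate("build", {"tests pass": lambda: state["tests_ok"]}),
]
print(run_runbook(gates, rca=fix_schema))
```

Note that an RCA that reports success is not trusted on its word: the gate's checks run again before the Runbook moves on.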
- Observable drift: what I defined didn’t get coded the way I envisioned (multifactorial in origin, but capturable).
  - This is not always negative, because the AI may have found a fix for an issue you didn’t encode in your design, but it got silently shipped as that was the best at that time.
  - Negative drift can also be a practical compromise that was achieved because
- Unobservable drift: exists because I didn’t define what shouldn’t exist (practically hard to capture because it borders on unknowns, but my testing system SENTRY aims to reduce the blind spot by using a different limb of the same fractal chain and a testing philosophy based on avoiding what I call the ‘sea of green’ effect: circular assurance created by the oracle and the author being the same).
  - Defining invariants and bans can mitigate this, and in CCore as a language you have to write bans (what it shouldn’t do) and invariants.
  - The drift capability resides in the blind spots of your assurance arc (testing system, runtime simulator, safe package manager, which is another less considered vulnerability spot).
  - Tautological assurance loops.
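The distinction in the list above can be made concrete with a small sketch. It treats intent, reality, and bans as plain sets of feature names, which is a simplifying assumption for illustration; `drift_report` and the feature names are hypothetical and say nothing about how CCore or SENTRY represent them.

```python
def drift_report(intended: set, actual: set, bans: set) -> dict:
    """Point-in-time delta between what was intended and what exists."""
    return {
        # Defined but not built: observable negative drift.
        "never_materialized": sorted(intended - actual),
        # Built although explicitly banned: only catchable because the ban was written.
        "ban_violations": sorted(actual & bans),
        # Built, never defined, never banned: may be positive drift, needs a recorded why.
        "unplanned": sorted(actual - intended - bans),
    }


intended = {"login", "export_csv", "audit_log"}
bans = {"plaintext_passwords", "network_calls_in_tests"}
actual = {"login", "audit_log", "retry_with_backoff", "plaintext_passwords"}

report = drift_report(intended, actual, bans)
print(report)
```

Without the `bans` set, `plaintext_passwords` would land in the same bucket as `retry_with_backoff`, which is the blind spot that writing bans is meant to shrink.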
That is some essential background, because brevity is not helping here and is leading to accusations of confabulation. With that workflow of mine in mind, let me explain why I said my fractalisation starts and proceeds in a different way:
- My initial design (intent) is a projection of my belief about the Pareto optimum that will meet my intent, based on known knowns. My knowledge is limited by my research, and my vision is tunnelled to within my conceptual boundaries at that stage.
- This intent has a geometric shape and is not scalar or flat. It has salience (vector) and graph (major nodes and visible edges), and optimally should help point to the next legal route and eliminate the need to retraverse (a DAG in the sense that looping is not the plan, although it can be recursive and is sometimes connected into a loop where needed, so it is not strictly acyclic; what is more important than naming it a DAG is understanding that it holds next-optimal-route knowledge).
- As the Runbooks progress, unanticipated realities pop up. Some are auto-sorted by the AI, some get sorted when I have to get involved, and drift from the canonical intent documents is happening. The deltas are small.
- But drift is not necessarily negative in this context. It is the resolution of an issue, either through a compromise or through detection of a better solution. The solution may appear regressive, but multifactorial consideration at that point may have forced a Pareto optimum on perturbation testing and practicality considerations. It would have triggered a think cycle, a root cause analysis, and even a sub-Runbook or design document authoring.
- The lack of apparent drift doesn’t necessarily ensure no drift; it only shows non-deviation from the codified parameters. The minutiae of the design definition of those codified parameters determine the floor.
- Documents and artefacts are now sprawling. The attention mechanics of the AI will get overwhelmed unless zoom ability or drill ability exists for the evidence chain. This zoom ability has to be built so that the more serious issues and drift float to the right place in the scheme of priorities.
- The why of this drift is more important than its distance, which is just one of multiple factors to consider in analysing the repo state. The attention mechanics of the AI are better helped if the repo state itself is geometrically projected (a vector of priorities and tested states; a graph of components, functionalities, their states and dependencies; and a DAG-like view of the next sequential directions) and at different zoom ladders.
- In summary, intent is a point-in-time slice of your conceptualisation, which has trajectory and temporality (it is expected to drift) and geometry (vector/graph/DAG).
- An AI’s certification of adherence to design does not guarantee internal algebraic conformance to deterministic design principles; it ascertains that a test with those known knowns passed (tautological reassurance).
- Your testing system should also be able to show the unknown unknowns by offering perturbation, and quasi-Monte Carlo is only feasible if the delta of the known knowns is well defined. Fractality actually makes this possible. The reason is that I am not a domain expert. Circular testing that provides a ‘sea of green ticks’, because all the knowns were circularly verified, is in my opinion one of the main reasons bugs ship in AI code and drift persists: the delta of that drift was never defined.
- The fractal tile should be under the same governance and be a component of the ecosystem itself, with lineage traceable to the compiler’s emitted outputs as well, so it is not a parallel system but one singular artefactual and policy family that co-exists with the code and is passively produced as much as possible, not actively authored by the human architect.
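To illustrate the ‘sea of green’ point and the role of bounded perturbation, here is a small stdlib-only sketch. Two things are swapped in and should be read as assumptions: a Halton sequence stands in for Sobol (both are low-discrepancy sequences; a real setup might use a library Sobol generator), and `normalise`, `within_unit`, and `perturb` are invented for the example. The author's own example test passes, while sampling across the declared input bounds against an independently stated invariant finds inputs that break it.

```python
PRIMES = (2, 3, 5, 7)


def van_der_corput(n: int, base: int) -> float:
    """n-th element of the van der Corput sequence in the given base."""
    q, denom = 0.0, 1.0
    while n:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q


def halton(count: int, dims: int = 2) -> list:
    """Low-discrepancy points in the unit cube (stand-in for Sobol)."""
    return [tuple(van_der_corput(i, PRIMES[d]) for d in range(dims))
            for i in range(1, count + 1)]


def perturb(fn, invariant, bounds, count: int = 64) -> list:
    """Sample inputs across the declared bounds; return those that break the invariant."""
    failures = []
    for point in halton(count, dims=len(bounds)):
        args = tuple(lo + u * (hi - lo) for u, (lo, hi) in zip(point, bounds))
        if not invariant(fn(*args)):
            failures.append(args)
    return failures


def normalise(value, ceiling):
    """Code under test; the author only ever tried value <= ceiling."""
    return value / ceiling


def within_unit(result):
    """Invariant stated by the architect, independent of the implementation."""
    return 0.0 <= result <= 1.0


assert normalise(5.0, 10.0) == 0.5   # the author's own test: green
failures = perturb(normalise, within_unit, bounds=[(0.0, 10.0), (1.0, 10.0)])
print(len(failures), "of 64 sampled inputs break the invariant")
```

The sampling is only meaningful because the bounds, the delta of the known knowns, were written down first.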
I hope this shows the fractal I was talking about.
It is for someone like me who has a gross idea of what the project design start point is but accepts that it is just the start point. I want vibe coding, but with formal methods, certainly, and auditability; intent tracking is one component of it. But we maintain fluidity and flexibility in our understanding of drift, and make sure that drift tracking does not become a straitjacket that constrains the intelligence of an AI, because it has seen and correlated several patterns from prior training that I have never thought of. Heuristics should only provide gentle guardrails and frameworks without constraining polyhedral branch ability. Also, in all the experiments I ran, I found that semantic loss occurs at the boundaries. In other words, the same artefact propagating and mutating through the same task epoch is more semantically compressible without information loss, so I aim to propagate the same fractal tile for as long as possible.
So in summary, what you designed works for you. What I designed works for me. I see that you have done a good job with your design. Mine exists on a different principle and aims to capture drift passively, because I am too lazy to define it completely but demand that the AI do the job for me with heuristic assistance, in a way that the heuristics don’t straitjacket the AI from drifting in a positive way that will reveal solutions I didn’t think about. The AI assists in maintaining the ledger of why, rather than the lazy me annotating on the go, because software architecting work is my hobby, not my source of income:
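The vector/graph/DAG geometry from the list above can be sketched with one set of records read three ways. This is only an illustration of the idea and not CNP's actual encoding: the `tiles` dictionary, its field names, and the component names are assumptions, and the DAG reading uses Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# One record per component: salience, current state, and dependencies (hypothetical data).
tiles = {
    "parser":   {"salience": 0.9, "state": "satisfied",  "needs": []},
    "compiler": {"salience": 0.8, "state": "violated",   "needs": ["parser"]},
    "cli":      {"salience": 0.3, "state": "unresolved", "needs": ["compiler"]},
}


def as_vector(tiles: dict) -> list:
    """Vector view: components ordered by salience."""
    return sorted(tiles, key=lambda name: tiles[name]["salience"], reverse=True)


def as_graph(tiles: dict) -> list:
    """Graph view: (dependency, dependent) edges."""
    return [(dep, name) for name, t in tiles.items() for dep in t["needs"]]


def next_legal(tiles: dict) -> list:
    """DAG view: unsatisfied components whose dependencies are all satisfied."""
    sorter = TopologicalSorter({name: t["needs"] for name, t in tiles.items()})
    order = list(sorter.static_order())   # raises CycleError if a loop crept in
    return [
        name for name in order
        if tiles[name]["state"] != "satisfied"
        and all(tiles[dep]["state"] == "satisfied" for dep in tiles[name]["needs"])
    ]


print(as_vector(tiles))
print(as_graph(tiles))
print(next_legal(tiles))
```

Because all three views are computed from the same records, they cannot disagree with each other, which is the property the single-tile approach is after.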
┌─────────────────────────────────────────────────────────────────────────────┐
│                   COSMOS / CCORE: INTENT, EVIDENCE, DRIFT                   │
│                                                                             │
│          Fractal project memory for AI-led architecture evolution           │
└─────────────────────────────────────────────────────────────────────────────┘
┌──────────────────────┐
│ 1. ARCHITECT STEERS  │
├──────────────────────┤
│ • north star         │
│ • constraints        │
│ • desired shape      │
│ • runbook direction  │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────────────┐
│ 2. AI EXPLORES & BUILDS      │
├──────────────────────────────┤
│ • proposes routes            │
│ • writes / changes code      │
│ • hits reality               │
│ • discovers unknowns         │
│ • compares alternatives      │
│ • converges on a next move   │
└──────────┬───────────────────┘
           │
           ▼
┌──────────────────────────────┐
│ 3. CNP INTENT TILE           │
├──────────────────────────────┤
│ A compact packet records:    │
│                              │
│ • what was decided           │
│ • why it mattered            │
│ • what was rejected          │
│ • what evidence existed      │
│ • what would later test it   │
│                              │
│ Not the whole transcript,    │
│ just the recoverable “why.”  │
└──────────┬───────────────────┘
           │
           ▼
┌──────────────────────────────┐
│ 4. μLEDGER TRAJECTORY SPINE  │
├──────────────────────────────┤
│ The intent tile is linked to │
│ a small append-only record:  │
│                              │
│ • previous ledger link       │
│ • source intent hash         │
│ • git / custody link         │
│ • runbook or design ref      │
│ • evidence receipt ref       │
│                              │
│ It records why the project   │
│ geometry moved.              │
└──────────┬───────────────────┘
           │
           ▼
┌──────────────────────────────┐
│ 5. CCORE GIT                 │
├──────────────────────────────┤
│ Captures project movement:   │
│                              │
│ • commits                    │
│ • active lanes / leases      │
│ • changed files              │
│ • custody                    │
│ • linked evidence            │
│                              │
│ This answers:                │
│ where and when did it move?  │
└──────────┬───────────────────┘
           │
           ▼
┌──────────────────────────────┐
│ 6. SENTRY TESTING SYSTEM     │
├──────────────────────────────┤
│ SENTRY tests the delta and   │
│ maintains a CNP test map.    │
│                              │
│ It can run:                  │
│ • normal tests               │
│ • replay tests               │
│ • pressure tests             │
│ • perturbation tests         │
│ • bounded unknown-unknown    │
│   exploration                │
│                              │
│ It answers:                  │
│ does the current project     │
│ still satisfy the intent?    │
└──────────┬───────────────────┘
           │
           ▼
┌──────────────────────────────┐
│ 7. CCORE REPOSTATE           │
├──────────────────────────────┤
│ AI-facing current-state view │
│ projected from the evidence. │
│                              │
│ It summarizes:               │
│ • satisfied                  │
│ • violated                   │
│ • unresolved                 │
│ • stale                      │
│ • positive drift             │
│ • never materialized         │
│ • next legal action          │
└──────────┬───────────────────┘
           │
           ▼
┌──────────────────────────────┐
│ 8. RE-CONVERGENCE            │
├──────────────────────────────┤
│ The AI and architect continue│
│ from the current truth:      │
│                              │
│ • continue if satisfied      │
│ • repair if violated         │
│ • investigate if unresolved  │
│ • update design if a better  │
│   route was found            │
└──────────┬───────────────────┘
           │
           ▼
┌──────────────────────────────┐
│ 9. LIBRARIAN                 │
├──────────────────────────────┤
│ Keeps only critical residue: │
│                              │
│ • key design decisions       │
│ • critical RCA / roadblocks  │
│ • runbook evidence           │
│ • non-reconstructable why    │
│                              │
│ Reaps routine noise.         │
└──────────┬───────────────────┘
           │
           ▼
┌──────────────────────────────┐
│ 10. FUTURE AI PICKUP         │
├──────────────────────────────┤
│ A future AI can ask:         │
│                              │
│ • what changed?              │
│ • why?                       │
│ • what evidence proves it?   │
│ • where did drift occur?     │
│ • what remains?              │
│ • what should happen next?   │
└──────────────────────────────┘
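Steps 3 and 4 of the diagram describe an intent tile linked into a small append-only record. A minimal sketch of such a chain, assuming SHA-256 over canonical JSON, is shown below; the field names (`prev`, `intent_hash`, `commit`, `entry_hash`) and the functions are hypothetical and are not the actual μLedger or CNP format. Each entry stores the hash of the previous one, so editing history after the fact becomes detectable.

```python
import hashlib
import json


def digest(obj: dict) -> str:
    """Stable SHA-256 of a JSON-serialisable dict."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def append(ledger: list, tile: dict, commit: str) -> dict:
    """Add one entry that links an intent tile to the previous entry and a commit."""
    body = {
        "prev": ledger[-1]["entry_hash"] if ledger else None,
        "intent_hash": digest(tile),
        "commit": commit,
    }
    entry = {**body, "entry_hash": digest(body)}
    ledger.append(entry)
    return entry


def verify(ledger: list) -> bool:
    """Check that every entry is intact and points at its predecessor."""
    prev = None
    for entry in ledger:
        body = {key: entry[key] for key in ("prev", "intent_hash", "commit")}
        if entry["prev"] != prev or entry["entry_hash"] != digest(body):
            return False
        prev = entry["entry_hash"]
    return True


ledger = []
append(ledger, {"decided": "use a queue", "why": "absorb bursts", "rejected": ["polling"]}, "commit-1")
append(ledger, {"decided": "add retries", "why": "flaky upstream", "rejected": []}, "commit-2")
print(verify(ledger))

ledger[0]["commit"] = "commit-9"   # tamper with history
print(verify(ledger))
```

A future reader can then answer "what changed and why" from the tiles, and "has this record been altered" from the chain.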