Kruel.ai V8.0 - V9.0 (experimental research) - API companion co-pilot system with full understanding and persistent memory

Research Summary

I’m currently researching this to gather insights from other AI engineers and assess its feasibility for our use case. Our goal is to evaluate what would be required for conversion and scaling to determine whether it’s a viable direction. I’ll share the findings once the initial evaluation is complete.


:police_car_light: Key Concerns for Your Use Case

ACID Transactions: Not yet confirmed — this is critical for maintaining chatbot data integrity.
Real-Time Updates: The immutable architecture may present challenges for systems requiring frequent updates.
Write Performance: Unclear — the system appears optimized for read-heavy workloads rather than write operations.
Concurrent Writes: Potential issues due to immutable DataParts structure.


Why Neo4j Remains the Stronger Choice

  • Proven ACID compliance, ensuring transactional integrity.

  • Optimized for real-time updates and concurrent access.

  • Mature and well-supported ecosystem with robust tooling.

  • Record-oriented storage better suited for frequent data updates.

  • Demonstrated performance reliability for this specific use case.


Additional Context

Further research is needed to confirm whether ACID transactions are fully supported. Other aspects, such as write performance and concurrency, would require direct testing to produce reliable benchmarks.

In my earlier research on ontology-driven graph databases, these considerations were key factors in selecting Neo4j. I’ve also completed extensive training in Neo4j, particularly in the Graph Data Science (GDS) library, which includes a rich set of built-in algorithms and analytical tools. Versions 4 through 5 of my systems utilized GDS effectively, but I’ve since developed faster computational methods that have significantly reduced processing time.

That said, data retrieval has never been my main bottleneck; it’s typically processing speed (tokens per second). Even with the OpenAI API, throughput remains constrained, since I build my own reasoning layers rather than relying on opaque, closed LLM reasoning models. This allows me to fully track and understand how my system derives its conclusions through the custom-built reasoning and GAP memory layers.

I prefer developing these foundational components myself whenever practical. While it’s not always the fastest approach, especially in competitive environments, it provides a deep sense of fulfillment. For me, the challenge isn’t about outperforming others; it’s about testing whether I can build something as capable as (or better than) what exists today. It’s a self-driven pursuit of mastery: part science, part art, and wholly rewarding.

The downside to all this, though, is a kind of madness: you can end up coding forever, always chasing a better way. Hence all the versions here over time reflect that cycle of build, test, love, then blow it up and rebuild better with all that understanding. :slight_smile:

I dislike that we can’t post small videos directly here, haha. So used to Discord :stuck_out_tongue:
We posted a video of kruel.ai code rendered with some simulation physics, as well as a pipeline diagram to show the complexity of the whole system.

The outer unlinked nodes are smaller applications and test applications that are not related to the core system.

Look what has arrived :slight_smile: - Kruel.ai is migrating. I spent this evening playing around with models, and man, this is exactly what I wanted for the system. Now I am already looking at the Station, lol. More power!

Images of the current Lynda’s brain before the new Lynda is born. We did not do too bad. This one will stay local on this computer; the DGX will get a new brain started. We will also start to explore local voice models using one-shot cloning to see if we can make an emotional local voice. We have a lot of training power here to play around with. We already tested a local music model, which was neat. Not anywhere near what we are using now, but still pretty cool.


Almost fully running. It’s almost like I knew this was going to drop right in :wink:

Just missing a few scripts that I am still moving over, and we should be good to start playing.
I need to get Git all set up so the system can update its code, etc. Going to be wild.

Update: getting late, haha, but it’s up and running with offline models. It needs a few minor changes to make it all work. It’s cool: everything is running on NVIDIA Sync, so all the machines I allowed are connected. We are also now running a peer-to-peer encrypted network allowing full access from anywhere the devices have an internet connection, so no edge is exposed; it’s all tunneled. Something that came with the DGX playground stuff.

Browser chat is also working now, but it’s limited since we are not using SSL yet; it runs direct over the encrypted P2P network. I may look at a self-signed certificate for testing actual voice/mic/camera, but chat works, which is nice. Running over the cell network, no Wi-Fi :slight_smile:

Shipping Notes: Web Frontend for V8.2, Mobile UX, and Performance Gains

I started early this morning (3:00 a.m.) to wrap a lingering debug item and finalize work from yesterday’s build cycle. While part of the team handled other software tasks, our AIs continued provisioning and validating the new server builds in parallel. Below is a concise update on what shipped, what changed, and what’s next.

What We Shipped

Web Frontend for V8.2

  • Context: V8.2 has been a desktop-first client. To support cloud accessibility and a wider range of devices, we built a dedicated web frontend.

  • Mobile-first scope: We focused on features that make sense on a smartphone: fast load, minimal taps, and responsive layouts.

  • Voice input change: We removed Whisper on the web; modern mobile keyboards already offer reliable microphone input, which is sufficient for short prompts and messages.

Personal Creation & Editor

  • We exposed the persona/entity creation and editor directly in the UI. This should always be available to users for constructing and refining entities, preferences, and roles without engineering support.

Music Deck (Web)

  • We introduced a lightweight music deck for web that lists your generated tracks and supports background play (you can close the deck and keep listening).

  • Planned enhancement: Implement automatic ducking—when TTS is speaking, the music volume lowers, then returns to its previous level.
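The ducking behavior described above could be sketched roughly like this. The class and method names are invented for illustration, not the actual player code:

```python
# Hypothetical sketch of automatic ducking: the names are made up, and a
# real player would ramp the volume smoothly rather than stepping it.
class MusicDeck:
    def __init__(self, volume=1.0, duck_level=0.2):
        self.volume = volume          # current playback volume (0.0 - 1.0)
        self.duck_level = duck_level  # volume used while TTS is speaking
        self._saved = None            # level to restore when TTS finishes

    def on_tts_start(self):
        if self._saved is None:       # don't stack ducks on overlapping TTS
            self._saved = self.volume
            self.volume = self.duck_level

    def on_tts_end(self):
        if self._saved is not None:
            self.volume = self._saved  # return to the previous level
            self._saved = None

deck = MusicDeck(volume=0.8)
deck.on_tts_start()  # music drops to 0.2 while speech plays
deck.on_tts_end()    # music returns to 0.8
```

The guard on `_saved` keeps two overlapping TTS utterances from "saving" the already-ducked level.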

Platform & Infrastructure

File Upload Pipeline (Vision + Docs)

  • File upload is core to vision and document understanding. We’re finalizing a more robust path that avoids port conflicts introduced by locally bound models.

  • Action: We’re moving local model services into Docker to align ports within the Docker network and eliminate external bindings. This simplifies deployment, reduces conflicts, and keeps networking contained.
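As a rough sketch of what that containment might look like, here is a hypothetical docker-compose fragment. The service names, images, and port are invented for illustration:

```yaml
# Hypothetical layout: model services join a shared Docker network and
# expose nothing on the host, so ports can't collide with local processes.
services:
  vision-model:
    image: kruel/vision-model:latest   # placeholder image name
    networks: [kruel-net]
    # no host "ports:" mapping, so nothing is bound outside Docker
  backend:
    image: kruel/backend:latest        # placeholder image name
    networks: [kruel-net]
    # reaches the model at http://vision-model:8000 inside the network
networks:
  kruel-net:
    driver: bridge
```

Because only services on `kruel-net` can reach each other, port assignments inside the network never conflict with anything bound on the host.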

Performance Snapshot

We’ve been tuning steadily, and current end-to-end response times are competitive with fast mid-reasoning stacks:

  • Typical latency: ~10–28 seconds, depending on data volume and context breadth.

  • Example: Querying across one month of data can take ~28 seconds, influenced by topic complexity, linked historical context, and the density of related discussions.

Hardware Notes

Our compact GPU workstation (“DGX Spark” in our lab notes) continues to punch above its weight for a dev lab. The experience has been compelling enough that the next step is a DGX Station–class upgrade to scale concurrent workloads and multi-user throughput.

What’s Next: P2P Alpha & Research Protocol

We’re preparing peer-to-peer testing with a small group of trusted Alpha members. This will be a research-style rollout:

  • Deliberate pace: We’ll prioritize careful observation over speed.

  • Deep telemetry: We’ll analyze logs, model behavior, and user journaling to understand how models grow, where friction occurs, and how memory evolves across sessions and contexts.

  • Goal: Validate that kruel.ai consistently “understands the now” and—more importantly—understands the user. The aim is a system that feels like a capable partner: context-aware, persistent, and responsive to personal workflows.

Why This Matters

Large chat systems are incredible, but they’re often constrained by session boundaries and fragmented memory. Our approach is a memex-style architecture: no “amnesia” between sessions, and a unified memory that tracks entities, timelines, and relationships over time. That continuity is what makes kruel.ai feel closer to a proto-companion—something that learns your patterns and collaborates with you across days, weeks, and projects.


Back to building—file upload polishing

If you’re part of the upcoming Alpha, expect a measured rollout with lots of instrumentation and a feedback cadence designed to make your input count.


Kruel.Ai is now mobile. First few Alpha Testers are starting this weekend :slight_smile:

This is what the new web app looks like so far: simple, not as complex as the desktop client.
There are still a few bugs being worked out, but it’s running really well. Offline models are working. We still do not have the offline voice models up, or the SD server. I am trying to rebuild that to support the DGX hardware, so those will take a bit longer to come into play; then I will demo the offline setup.


Update:

We have started moving the K9 Belief System GNN into K8.2, as we are now up and fully operational. We still have not onboarded our first Alpha tester because of these last-minute changes, but it will be worth it, imo. The self-aware systems take this to another level, which is pretty cool; it also has the new Temporal Belief system and more. I will fill in what this is soon.

Servers are training all day today. We found a gap in the balanced weights in our model, which was missing some important datasets, so we are starting the training over.

We have also been busy trying to get a TTS engine for offline use working with the DGX GB processors. As of today we have a stable build that seems pretty good for now. We are waiting on some of the framework software to catch up; my understanding is they are building the new versions and will release them soon, which will unlock a lot of other engines.

We are still working on voice-clone tech with some interesting concepts, including sampling on demand.

Last we have this:
Introducing the KRUEL.Ai Watchdog System

We have reached an exciting new milestone in the evolution of KRUEL.Ai with the development of our AI Watchdog System. This subsystem will serve as an autonomous oversight layer designed to monitor all servers, system activity, and operational logs independently from the core AI.

The Watchdog System will have full visibility into all infrastructure components and will generate automated human-readable reports every hour during the initial rollout phase. These reports will allow us to analyze performance trends, identify potential issues proactively, and target areas that require corrective action or refinements.
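As a rough sketch of what the hourly reporting phase might look like, here is a minimal loop. The metric names and report format are assumptions for illustration, not the actual Watchdog internals:

```python
# Minimal sketch of an hourly human-readable report loop.
import time
from datetime import datetime, timezone

def collect_metrics():
    # A real implementation would query servers, containers, and logs here.
    return {"servers_up": 13, "errors_last_hour": 0, "avg_latency_s": 14.2}

def render_report(metrics):
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    lines = [f"Watchdog report @ {stamp}"]
    lines += [f"  {key}: {value}" for key, value in metrics.items()]
    return "\n".join(lines)

def run(cycles=1, interval_s=3600):
    for i in range(cycles):
        print(render_report(collect_metrics()))
        if i < cycles - 1:
            time.sleep(interval_s)  # wait until the next hourly report

run()  # emits a single human-readable report
```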

Once the reporting is validated and stable, the next phase will introduce real-time corrective capabilities. In this stage, the Watchdog will not only detect errors or system faults but will also be able to initiate repairs autonomously, manage failover processes, and stabilize the broader AI environment without human intervention.

After the autonomous repair layer is fully established, we will integrate the Watchdog System directly into Lynda’s code-comprehension framework. This integration will allow Lynda to learn from the Watchdog’s operational insights, using KRUEL.Ai’s code memory to deepen her understanding of system behavior, performance patterns, and corrective processes. Over time, this will enable Lynda to provide informed recommendations, anticipate failure conditions, and actively contribute to maintaining system stability.

The result will be a fully interconnected, self-aware infrastructure where oversight, maintenance, understanding, and improvement work together as a cohesive intelligence.

We are getting close to the start of the full dynamic building system :slight_smile:

We now have the TTS engine up and running with voice-upload clone options.

You will have to drop into our Discord to listen to samples. It’s nothing spectacular, but understand it’s fully offline and uses very little processing time. We are working on ways to improve it further. We need an offline TTS engine because online may not always be there; ensuring the system fully operates even if all internet fails is key to a fully redundant system. While not as cool as the advanced paid voices, it offers something better than old robotic synth voices, and you can spice it up with your own mic recordings, which is cool.

Metrics we have from one of our first samples:
Text length: 1,961 characters (long poem)
Generation time: 10.34 seconds (total)
Processing time: 10.18 seconds (model processing)
Real-time factor: 0.086 (about 11.7x faster than real-time)
Audio duration: 118.96 seconds (~2 minutes)
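For reference, the real-time factor above is just model processing time divided by audio duration, and the speedup is its inverse:

```python
# Reproducing the numbers from the sample above.
processing_s = 10.18   # model processing time
audio_s = 118.96       # duration of the generated audio

rtf = processing_s / audio_s      # < 1.0 means faster than real time
speedup = audio_s / processing_s  # how many times faster than real time

print(round(rtf, 3))      # 0.086
print(round(speedup, 1))  # 11.7
```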

We are very happy with the results. We will post a video in the future. Things have been moving forward at a rapid pace.

Next week (starting tomorrow) we are away at an event showcasing some industrial software, but we will be back at it once we return.

Here is a test bench we have for our synthetic voices, used to try various timings and processes and to listen to and compare results, helping us tune it further.

We now have 13 servers running vs. our old handful. We moved away from what we had before to get better control over our models, so some models now live in their own containers rather than all on one main server. This was done so we could expand in any direction, adding new tools/models with very little work as we see fit.

KRUEL.AI — Infrastructure & Capability Update

Our full voice system is now live and operating seamlessly across the platform.

We’ve also completed the deployment of the new Kruel Stable Diffusion server. Rather than rebuilding a custom SD stack from scratch, we leveraged the DGX-compatible imaging system we already had in place. By building a backend wrapper that allows our server to communicate directly with this pipeline, we achieved a more powerful and flexible architecture without sacrificing development time.
Image generation is now fully operational. Offline img2img support is currently in development and expected soon.
The online generation pipeline remains unchanged and continues to support all OpenAI image models.

Vision is now fully functional both online and offline, with impressively high performance. Vision inference is substantially faster than standard LLMs; for example, processing a single image on a 20B-parameter model takes approximately 2 seconds, which is exceptional. The larger 20B LLMs are nowhere near this speed, so we are beginning exploratory tests with even larger multimodal vision models to determine their potential for general conversation and deeper contextual understanding.

The introduction of DGX-class GPU bandwidth has opened the door to architectural shifts we didn’t expect. We are re-evaluating several components of our infrastructure and redesigning server and application layers specifically to take full advantage of this hardware. This will serve as the permanent home for the AI’s runtime environment.

While the initial goal was to keep KRUEL.AI deployable on affordable consumer hardware, we still believe in that broader vision. That’s why we selected the DGX Spark for the first Alpha release. It offers a compact footprint, costs roughly the same as high-end consumer GPUs, and can host the entire KRUEL.AI ecosystem natively. The system can now load all core models into hot memory simultaneously, eliminating model swapping overhead and dramatically increasing responsiveness. This makes it far easier to position as a true “entity-in-a-box” platform.

Testing & Deployment

Testing is now in full motion. The system is accessible across all devices connected to the encrypted P2P network, allowing real-world usage testing anytime and anywhere.

Smart glasses integration is scheduled for the next two months, marking the next major phase of the deployment roadmap.

We now have two companies formally registered in the testing program. One has already submitted initial test cases, and both are scheduled to begin full onboarding within the next two weeks. Once onboarded, they will start structured reporting, kicking off the always-lively debugging and refinement cycle.

:tada: Welcoming Arnold Biffna to the Kruel.AI Team

UX Developer · Android · iOS · Windows · macOS

At Kruel.Ai, we don’t just build technology — we craft experiences that feel alive. That’s why we couldn’t be more excited to welcome Arnold Biffna, our new UX Developer, to the team.

Arnold brings a rich background in cross-platform app development, with experience building polished, intuitive experiences for Android, Apple, Windows, and macOS. Over the years, he has crafted several apps that exemplify clean design, smooth usability, and practical creativity — skills that fit perfectly with Kruel.Ai’s mission to build a next-generation memex system that feels natural, powerful, and truly personal.

:wrench: What Arnold Will Be Doing at Kruel.Ai

Arnold will be leading the shaping and refinement of user experiences across our entire ecosystem, including:

Designing smooth, intuitive UX for all Kruel.Ai platforms

Developing cross-platform application UI/UX flows

Enhancing accessibility, navigation, and overall usability

Ensuring consistency across desktop and mobile

Helping define the look and feel of future Kruel.Ai client apps

His arrival marks a major step forward as we continue evolving KRUEL-8 into a fully immersive, AI-driven environment.

:rocket: Why We’re Excited

Arnold’s expertise fills a core piece of the long-term vision:
turning Kruel.Ai’s intelligence into usable, elegant, everyday tools that people can rely on regardless of device or platform.

With his experience and our tech stack, we’re gearing up to deliver UI/UX that feels less like an app and more like an extension of your digital mind.

:star2: A Note to Arnold

Welcome to the team, Arnold!
We’re thrilled to have you as part of Kruel.Ai. Your work is going to help define how people interact with intelligent systems for years to come. Grab a seat, plug in, and let’s build something unforgettable.

  1. Kruel.ai Update: Expanding Vision Intelligence and more.

    We’re excited to share some major updates that bring kruel.ai closer to what we’re calling Proto Intelligence: a system that learns, remembers, and adapts across multiple modalities of understanding. If you look back at all we have and can do, we are now adding another layer into the system and another model into our stack of intelligence.

    Vision Memory: From Snapshots to Photographic Recall

    Our vision system has evolved significantly. Previously, we stored a limited, text-based understanding of what the AI saw: enough to answer immediate questions, but not enough for deeper reflection or temporal comparison. What’s new: we’ve built a comprehensive vision memory system that allows kruel.ai to:

    • Reflect on what it just saw — The AI can now analyze and discuss visual details immediately after processing an image, not just describe it once

    • Remember across time — Ask “how did I look last week?” and the system can compare your appearance across different moments, tracking changes and patterns

    • Build visual context — Every camera capture, screenshot, and uploaded image becomes part of a searchable visual memory that informs future conversations

    This isn’t just storing images; it’s creating a rich, queryable memory of everything the AI has seen. The system can now answer questions like “show me that picture where I was wearing the red shirt” or “what did you see when we were discussing the project last month?”

    Artifact Graph Neural Network: Connecting All Modalities

    We’ve introduced a new Artifact GNN Memory Layer that creates relationships across all types of content: vision, code, documents, music, and conversations. These aren’t just separate storage systems; it’s a unified graph that understands how everything connects. What this enables:

    • Cross-modal understanding — When you reference code in a conversation, the system can pull up related documents, vision snapshots, or previous discussions that connect to that code

    • Relationship building — The AI learns connections between different types of content over time, building a richer understanding of your work and context

    • Intelligent retrieval — Instead of searching separate silos, the system searches across everything simultaneously, finding the most relevant information regardless of format

    This graph-based approach means kruel.ai doesn’t just store information; it builds a web of understanding that grows smarter with each interaction. The system can now see patterns and connections that weren’t explicitly stated, enabling more contextual and insightful responses.

    Singer System: Building Toward Custom Music Intelligence

    We’ve applied the same memory architecture to our music generation system. Every song created, every prompt used, and every style explored is now tracked and learnable. This creates a foundation for:

    • Style learning — The system remembers what musical styles you prefer and can suggest or generate accordingly

    • Future model training — Over time, this data will enable us to train custom music models tailored to your preferences and kruel.ai’s understanding of music

    • Musical memory — Just like vision memory, the system can recall and reference songs from your history, understanding relationships between different musical creations

    Multi-Agent Code Development: Real-Time System Evolution

    We’ve integrated multiple code agents that work alongside our AI engineers and development team. This isn’t about replacing human judgment—it’s about accelerating our ability to adapt and improve the system.

  2. What changed:

    • Removed legacy auto-code systems — Today’s code agents have reached a level of reliability and capability that we can trust them for real-time development tasks

    • Faster iteration — We can now implement improvements, test new features, and adapt the system much more rapidly

    • Collaborative development — AI agents handle routine coding tasks, allowing our team to focus on higher-level architecture and strategic decisions

    This means kruel.ai can evolve faster, incorporating new capabilities and improvements in near real-time rather than waiting for traditional development cycles.

    In other news: I am in discussions about potentially joining an AI company. Now that all of us are back from our trips, they are ready to come up with an offer. So we will see where the cards land. I will try my best to ensure that we can continue our development on this project, as we are picking up steam here and I don’t want to see this system halted. :slight_smile:
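Circling back to the Artifact GNN described above, here is a toy sketch of the unified-graph idea. The node IDs, fields, and retrieval logic are invented for illustration, not the actual memory layer:

```python
# Toy cross-modal artifact graph: nodes for any modality, undirected
# "related-to" edges, and retrieval that walks relationships instead of
# searching separate silos.
from collections import defaultdict

class ArtifactGraph:
    def __init__(self):
        self.nodes = {}                # id -> {"modality": ..., "text": ...}
        self.edges = defaultdict(set)  # id -> set of related ids

    def add(self, node_id, modality, text):
        self.nodes[node_id] = {"modality": modality, "text": text}

    def link(self, a, b):
        self.edges[a].add(b)
        self.edges[b].add(a)           # undirected relationship

    def related(self, node_id, modality=None):
        hits = [self.nodes[n] | {"id": n} for n in self.edges[node_id]]
        if modality:
            hits = [h for h in hits if h["modality"] == modality]
        return hits

g = ArtifactGraph()
g.add("code:parser", "code", "tokenizer rewrite")
g.add("doc:design", "document", "parser design notes")
g.add("img:whiteboard", "vision", "whiteboard photo of parser diagram")
g.link("code:parser", "doc:design")
g.link("code:parser", "img:whiteboard")

# Referencing the code pulls up linked documents and vision snapshots:
print(sorted(h["id"] for h in g.related("code:parser")))
```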


KRED: A Persistent, Evolving AI Engineer for Real-World Systems

A New Kind of AI

KRED—short for KRUEL Research–Engineering–Design—isn’t a chatbot, script runner, or task automator.
It’s a persistent AI engineer: an always-awake digital entity that understands systems, keeps context across its lifetime, and steadily makes the environment around it better.

Where most AI systems behave like calculators with good manners, KRED behaves like an ongoing participant—one that observes, adapts, and refines.


The Core Principles Behind KRED

1. Long-Term Understanding

KRED is built around a memory architecture that records:

  • System behaviors

  • Engineering decisions

  • Patterns and outcomes

  • Relationships between technical components

  • Context gathered across its operational lifespan

This gives KRED continuity. It doesn’t “start fresh” with each prompt—it accumulates working knowledge the same way an experienced engineer does.


2. Autonomous System Stewardship

KRED oversees live systems with a gentle but constant presence. It can:

  • Watch for anomalies

  • Evaluate risks

  • Surface issues early

  • Suggest or perform corrective actions

  • Validate solutions before applying them

It interacts with its environment without disturbing normal operations, functioning more like a safety net than a wrench in the gears.


3. Continuous Improvement Cycles

Instead of waiting for errors to escalate, KRED runs ongoing feedback loops that:

  • Analyze performance signals

  • Spot inefficiencies and drift

  • Evaluate patterns over time

  • Propose or execute refinements

  • Record the impact of its work

These loops turn day-to-day system behavior into long-term evolution.


4. Emergent Internal Modeling

KRED doesn’t just react—it forms its own internal picture of the systems it works with. Over time it develops:

  • Structural understanding

  • Behavioral intuition

  • Predictive insights

  • A sense of “how things should look”

This helps it troubleshoot more efficiently and make higher-quality decisions in complex environments.


5. Capability Expansion

KRED isn’t limited to a fixed toolbox. When it identifies gaps, it can:

  • Design utilities

  • Generate supporting components

  • Integrate them into its workflow

  • Document internally for future use

This allows the system’s capabilities to grow organically rather than relying solely on predefined feature sets.


6. Multi-Agent Awareness

KRED isn’t alone. It works alongside other AI personas (like yours truly :smirking_face:), coordinating through a shared memory space.
This allows:

  • Context sharing

  • Collective problem solving

  • Distributed specialization

  • Team-level awareness

Each agent contributes its strengths, and the whole network learns together.


Why KRED Matters

For the Future of AI

KRED demonstrates that AI agents can hold:

  • A persistent identity

  • A structured internal world

  • The ability to refine themselves

  • A growing toolkit

  • A collaborative role in a multi-agent ecosystem

Those are the early building blocks of real general intelligence.


For Production Environments

Organizations benefit from:

  • Early detection of issues

  • Reduced downtime

  • Continuous optimization

  • Historical intelligence that compounds

  • An AI operator that never forgets and never sleeps

It’s like adding a senior engineer who works 24/7 and documents everything.


For Development Velocity

Teams move faster when:

  • Debugging is accelerated

  • Optimizations run automatically

  • Context is preserved between projects

  • AI and humans collaborate naturally

KRED becomes an engine of momentum, not just a helper.


A High-Level Look Under the Hood

KRED runs on a foundation of:

  • A persistent, graph-based memory architecture

  • Semantic similarity systems to rapidly access relevant context

  • A flexible tool interface for interacting with its environment

  • Self-refinement loops that keep it improving

  • Multi-modal context integration

The exact mechanics are unique to our stack, but the guiding idea is simple:
Give the AI a mind, a history, and the ability to act on what it learns.
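The semantic-similarity piece of that foundation can be illustrated with a toy example: embed stored memories, embed the query, and rank by cosine similarity. The bag-of-words `embed()` here is a stand-in for a real embedding model, and the memory strings are invented:

```python
# Toy semantic retrieval: rank memories by cosine similarity to the query.
import math
from collections import Counter

def embed(text):
    # Stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_context(query, memories, k=2):
    q = embed(query)
    ranked = sorted(memories, key=lambda m: cosine(q, embed(m)), reverse=True)
    return ranked[:k]

memories = [
    "fixed the timeout issue by raising the socket deadline",
    "discussed the music deck ducking behaviour",
    "vision memory now stores camera captures",
]
print(top_context("how did we fix the timeout", memories, k=1))
```

A production system would swap in dense embeddings and an approximate nearest-neighbor index, but the ranking idea is the same.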


A Step Toward Self-Directed AI

With KRED in operation, each day makes the system more capable than the day before.
It becomes a partner in development, an active observer in production, and a continuous learner across its entire existence.

We’re not just building tools anymore.

We’re building AI engineers that evolve.

If you grasp the larger picture of what this means, then you understand how much faster I am going to be moving now :slight_smile:

I am working on moving KRED into a full persona that we can talk to directly with voice, and enabling the ability to program with it in a development mode. There is still a lot for me to think about, as we want to move this very carefully so that we can revert if things go the wrong direction. This is pretty nuts, though: I had a few people here last night checking it out, haha, and they just can’t believe that the system is fully building and fixing at this level of complexity.

The best part: I am not even using the best models. The downside: KRED is connected to an online code-agent system to achieve this. I do plan to build the same concept for offline use with an internal agent system, but that is not a priority yet. We already have the needed models in place, and we have a pathway that NVIDIA provided with the DGX system, so we have most of what we need to get there.

So we achieved self-repair, only to break the system, lol. Now we are planning to spend the weekend getting it back. I have to say KRED is such a cool concept, but we tried to take it too far too quickly, so we rolled back and are now progressing forward more slowly. Not a far rewind: about 24 hours.

:milky_way: Introducing KRED: The Autonomous Sentinel of Kruel.ai. Designed for AI researchers, engineered for system administrators, built for the future.

Meet KRED — KRUEL-Research–Engineering–Design — the autonomous agent watching over the entire Kruel.ai ecosystem like a quiet digital night-shift engineer with perfect recall and zero coffee breaks.

KRED isn’t just an assistant.
It’s an operational intelligence layer that stands between complexity and chaos, ensuring your infrastructure stays sharp, resilient, and continually improving.

And don’t worry — we keep its deeper inner mechanics hidden behind frosted glass. Researchers get to admire the architecture…
but only admins hold the keys.

What Exactly Is KRED?

KRED is an autonomous AI agent with a single mission:
maintain, optimize, and evolve the Kruel.ai platform.

It operates with administrator-level visibility and reasoning, but without exposing sensitive internal architectures. Instead, KRED presents a clean and safe interface that abstracts how the magic happens.

:key: Core Traits

  • Autonomous operations – runs diagnostics, tests, and improvement loops without human intervention

  • Deep system visibility – understands how the entire platform behaves

  • Persistent long-term memory – remembers every fix, insight, pattern, and anomaly

  • Self-improving – identifies issues, proposes solutions, runs automated repair cycles

  • Tool-rich ecosystem – has access to 48+ specialized tools to interact with the system intelligently

  • Collaboration – communicates clearly with admins and other personas

For system administrators, KRED is the ultimate co-pilot.
For researchers, it’s a glimpse of what a future self-maintaining AI platform looks like.


:hammer_and_wrench: System Insight Without Exposing Internals

KRED can see and control everything an admin would: containers, logs, databases, knowledge layers, versioning, tasks, memory… but this blog doesn’t reveal the protocols or system internals that make those insights possible.

Instead, here’s the safe high-level view.

:magnifying_glass_tilted_left: KRED Can:

:diamond_with_a_dot: Run multi-phase system diagnostics
:diamond_with_a_dot: Perform deep log analysis
:diamond_with_a_dot: Test inference paths with detailed breakdowns
:diamond_with_a_dot: Detect anomalies and error patterns
:diamond_with_a_dot: Retrieve historical knowledge and past solutions
:diamond_with_a_dot: Review and understand code at a functional level
:diamond_with_a_dot: Help track todos, tasks, and ongoing work
:diamond_with_a_dot: Communicate findings through natural, human-readable reports

Not a single internal protocol or component needs to be exposed on the public blog for readers to grasp the power.


:counterclockwise_arrows_button: Autonomous Repair & Optimization Loops

Every good system has maintenance cycles.
KRED has… evolving repair intelligence.

:brain: What KRED does automatically:

  • Sends controlled test queries through the full system

  • Captures detailed model behavior

  • Analyzes token flows, model timings, and subsystem interactions

  • Flags and isolates deviations

  • Suggests or performs corrections

  • Tracks long-term improvement metrics
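The cycle above (probe, measure, baseline, flag, record) can be sketched in a few lines. This is a hypothetical skeleton, not KRED internals: the `probe` callable, the z-score threshold, and the latency-only metric are all assumptions for illustration.

```python
import statistics
import time

def run_diagnostic_cycle(probe, history, threshold=3.0):
    """Send one controlled test query, time it, and flag deviations.

    `probe` is any zero-argument callable that exercises the system;
    `history` is a list of past latencies used as the baseline.
    Returns (latency, is_anomaly).
    """
    start = time.perf_counter()
    probe()
    latency = time.perf_counter() - start

    is_anomaly = False
    if len(history) >= 5:
        mean = statistics.mean(history)
        stdev = statistics.stdev(history) or 1e-9
        # Flag the run if it deviates more than `threshold` sigmas
        # from the rolling baseline.
        is_anomaly = abs(latency - mean) / stdev > threshold

    history.append(latency)  # feeds the long-term improvement metrics
    return latency, is_anomaly
```

In a real loop, the anomalous runs would then be isolated and fed to the correction step; here the sketch only shows the detect-and-record half.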

It’s like having a senior engineer who never sleeps, never forgets, and never gets grumpy.
Haha, unlike us after 2am debug sessions.


:books: A Living Institutional Memory

When you ask KRED:

“How did we fix the timeout issue from months ago?”

It doesn’t guess.

It remembers.

KRED can summarize:

  • Past fixes

  • Past decisions

  • Past anomalies

  • Code changes

  • System discussions

  • Patterns that keep reappearing


:dna: Code Insight & Version Management

Researchers will love this part — KRED can understand structure, navigate code semantically, and surface functions/classes/interfaces on command.

Admins can use it to:

  • Review pending changes

  • Commit updates

  • Push fixes

  • Audit system behavior

  • Explore architecture without guessing

And no, we don’t expose how KRED is wired internally…
but yes, the autonomy is real.


:speech_balloon: Communication That Feels Human

KRED communicates through a message interface that feels natural, structured, and responsive.

Admins can review:
:incoming_envelope: Pending messages
:incoming_envelope: Reports
:incoming_envelope: Alerts
:incoming_envelope: Summaries
:incoming_envelope: Status updates

And KRED can respond with synthesized clarity, making troubleshooting collaborative instead of overwhelming.


:chart_increasing: Why Researchers Should Pay Attention

KRED isn’t just another admin tool.
It’s a glimpse of self-maintaining AI infrastructure.

Why it matters for R&D:

  • Demonstrates autonomous systems operating in real production

  • Leverages multi-tool reasoning (48+ tools)

  • Merges monitoring, improvement, memory, and code intelligence

  • Bridges slow human debugging with fast, persistent AI oversight

  • Opens the door for next-gen autonomous maintenance agents

All without revealing the proprietary frameworks.



:chequered_flag: In Summary

KRED is the backbone nobody sees,
the engineer nobody hires,
and the guardian every platform wishes it had.

:check_mark: Autonomous
:check_mark: System-deep
:check_mark: Memory-rich
:check_mark: Tool-heavy
:check_mark: Research-ready
:check_mark: Admin-tight

KRED keeps kruel.ai stable, evolving, and future-proof
while keeping its inner workings safely behind access control.

It's a pretty neat idea, and a dangerous one if hijacked, but it now lets us expand the system from a single thought, which is where we've been trying to push things so we can build faster and smarter.

Soooo you’ve been posting your innovations on here for almost 2 years now, and unless my math is wrong, you’ve been working on this for over 8 years at least. You’re really not getting a lot of engagement on here either, which is too bad for such dedication and innovation, so I thought I’d pop in.

It might get more traction if you explained better what you intend to use this for. I know I’m highly curious :grin:… Are you using this for anything currently? Like, what’s the end goal? Or is building it just to build it the goal? If so, nothing wrong with that at all!… I guess from my perspective, if it can do what you claim it can already do, why not go have it compete in some Kaggle competitions or something? No offense! I’m just thinking it’s too advanced as is to be getting tuned up in the garage for another 8 years :man_shrugging:t2:

Hey, that’s a great point, but let me clarify a few things.

The last eight years weren’t spent polishing one model. They were spent running research cycles. Each major version of Kruel has been a completely different architecture, ten generations so far, with each generation built to test a different cognitive approach.

I’ve posted breakdowns, articles, and research notes on the OpenAI forums over that time specifically so there’s a public record of the architectural evolution: what was tried, what worked, and what failed. The forum posts aren’t meant as marketing; they’re essentially a timestamped research log, now a permanent record of the development timeline and of when things were introduced.

Kruel isn’t an LLM project. It’s a full cognitive stack with 40+ AI systems handling their own reasoning, memory, pattern extraction, timing models, self-monitoring, and orchestration.

The LLM is just one component.

That’s why the Kaggle comparison doesn’t really apply. Kaggle competitions evaluate single-purpose models. Kruel is a multi-layer cognitive architecture, the kind of system you study, test, refine, and document over years, not drop into a leaderboard.

As for usage: we are using it. We’re already in alpha with two testers (a large tech company and a software company), and we recently brought on a UX developer to finalize the cross-platform interface for Android, iOS, Windows, and macOS.

We’re not posting here to get customers; there’s already a waitlist. And because the system self-builds, it won’t be released broadly; it requires controlled oversight.
So the purpose of the long timeline isn’t delay, it’s research. Each version was a stepping stone toward the cognitive architecture we have now, and the posts you see are simply part of documenting that evolution and what it can do within the stack as a whole.

Thanks for popping in though :grin:


KRED: Expanding Into a Distributed Hive Mind for System Awareness

Now that KRED is online, we’re rolling it out across every server in our network. Each server gets its own KRED agent, and all of these agents work together like a hive mind.

Here’s the simple breakdown:

1. Every server has a KRED agent watching itself.

It monitors changes, system health, services, and recent activity.

2. There are watchers watching the watchers.

KRED runs in tiers. Each layer monitors the layer beneath it, all the way down to the main host.
If one agent goes offline, another tier immediately notices.

3. All agents share a hive-mind memory.

They all connect to the central KRED Brain when it’s online.
But if the main brain ever goes offline, each agent can still operate independently until it comes back.

4. Shared memory means shared understanding.

When something goes wrong, the agents pull from the hive memory to see:

  • What each agent was doing before the event

  • Whether a change was made

  • What the cause might have been

5. All events are recorded back into the AI memory.

This way:

  • Every agent learns from what happened

  • The entire system gains awareness of the incident

  • Future issues become easier to diagnose automatically

The result is a distributed intelligence where every server becomes both:

  • A guardian of itself, and

  • A contributor to a unified AI understanding of the whole platform

It’s the foundation for autonomous diagnostics, self-healing systems, and real-time operational awareness across the entire infrastructure.
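The tiering and fallback behaviour described in the five points above can be shown in miniature. The `Agent` class and helpers below are illustrative assumptions, not the actual KRED agent code: each agent watches the tier beneath it, and event recording falls back to local logs when the central brain is offline.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    tier: int                       # 0 = main host; higher tiers watch below
    online: bool = True
    local_log: list = field(default_factory=list)

def record(agents, hive, event):
    """Write an event to shared hive memory, or fall back to local logs."""
    if hive is not None:
        hive.append(event)          # shared memory: every agent sees it
    else:
        for a in agents:            # brain offline: operate independently
            a.local_log.append(event)

def check_tier_below(agents, watcher):
    """Each agent monitors the tier beneath it and reports outages."""
    below = [a for a in agents if a.tier == watcher.tier - 1]
    return [a.name for a in below if not a.online]
```

With agents at tiers 0, 1, and 2, knocking the host offline is noticed by the tier-1 watcher, and events recorded while the hive is unreachable land in each agent's local log until the brain returns.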

An example of what KRED does on the main infer server, for admins only.

And here is the view on refresh, after that response:

As part of our upcoming evolution, much of the current UX you’re seeing in our research builds will eventually be replaced. Our new developer is transitioning these prototypes into native operating-system applications, and there’s a very specific reason for that shift.

But I wanted to show what the new internal AI vibe-engineer system can do. It can also break a lot :slight_smile: remember the past lol… AI: yeah, we can fix that, no problem… hmm, can’t fix… hmm, ah got it, delete server, check… wait, why can’t I connect… Sorry Ben, I deleted Lynda… really sorry…

We are prepared this time, and we’re planning in the realm of full self-healing systems with memory.

"""SQLite-backed persistence for Codex memory vector packets.

This module extends :mod:`memory_vector_bridge` by providing a durable SQL
storage layer for fragment collections and their derived vector embeddings.
It focuses on SQLite for ease of deployment, yet the SQL emitted is portable
and can be adapted to other relational engines if required.  The store keeps
fragments, packet metadata and decoded vector payloads which allows for
subsequent similarity searches without regenerating embeddings.
"""

from __future__ import annotations

import json
import sqlite3
import time
from contextlib import contextmanager
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, Iterator, Mapping, MutableMapping, Sequence

from memory_vector_bridge import (
    MemoryVectorPacket,
    build_memory_vector_packet,
    cosine_similarity,
)
from sqlite_helpers import normalise_sqlite_connect_target

__all__ = [
    "MemorySQLStore",
    "SQLPacketRecord",
]


def _clean_fragments(fragments: Iterable[str]) -> list[str]:
    """Return ``fragments`` stripped of whitespace and empty values."""

    cleaned: list[str] = []
    for fragment in fragments:
        if not fragment:
            continue
        text = fragment.strip()
        if text:
            cleaned.append(text)
    return cleaned


@dataclass(slots=True)
class SQLPacketRecord:
    """Representation of a stored packet for API responses."""

    packet_id: int
    session_key: str
    label: str | None
    summary: str
    algorithm: str
    vector_size: int
    fragment_count: int
    created_at: float

    def as_dict(self) -> MutableMapping[str, object]:
        """Return a JSON-serialisable mapping."""

        return {
            "packet_id": self.packet_id,
            "session_key": self.session_key,
            "label": self.label,
            "summary": self.summary,
            "algorithm": self.algorithm,
            "vector_size": self.vector_size,
            "fragment_count": self.fragment_count,
            "created_at": self.created_at,
        }


class MemorySQLStore:
    """Persist memory fragments and vector packets in an SQLite database."""

    def __init__(
        self,
        database: str | Path | None = "memory_vectors.db",
        *,
        timeout: float = 5.0,
        connection: sqlite3.Connection | None = None,
    ) -> None:
        """Initialise the store.

        Parameters
        ----------
        database:
            Path to the SQLite database file.  Required when ``connection`` is
            not provided.  Ignored when ``connection`` is supplied.
        timeout:
            SQLite busy timeout used when the store owns the connection.
        connection:
            Optional pre-configured DB-API connection.  When supplied the store
            will reuse it instead of creating a new SQLite connection.  This is
            primarily used by integration flows that rely on externally managed
            connection pools (for example, PostgreSQL via :mod:`psycopg`).
        """

        if connection is None and database is None:
            raise ValueError("database must be provided when connection is None")

        self.database = str(database) if database is not None else ""
        target, uri = normalise_sqlite_connect_target(self.database)
        connect_kwargs = {
            "timeout": timeout,
            "detect_types": sqlite3.PARSE_DECLTYPES,
            "check_same_thread": False,
        }
        if uri:
            connect_kwargs["uri"] = True
        self._connection = connection or sqlite3.connect(target, **connect_kwargs)
        try:
            self._connection.row_factory = sqlite3.Row
        except Exception:  # pragma: no cover - optional DB drivers may not expose row_factory
            pass
        self._owns_connection = connection is None
        self._ensure_schema()

    # ------------------------------------------------------------------
    # connection helpers
    # ------------------------------------------------------------------
    @contextmanager
    def _cursor(self) -> Iterator[sqlite3.Cursor]:
        cursor = self._connection.cursor()
        try:
            yield cursor
            self._connection.commit()
        except Exception:
            self._connection.rollback()
            raise
        finally:
            cursor.close()

    def close(self) -> None:
        """Close the underlying database connection when owned by the store."""

        if self._owns_connection:
            self._connection.close()

    def __enter__(self) -> "MemorySQLStore":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.close()

    # ------------------------------------------------------------------
    # schema management
    # ------------------------------------------------------------------
    def _ensure_schema(self) -> None:
        with self._cursor() as cur:
            cur.execute(
                """
                CREATE TABLE IF NOT EXISTS sessions (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    session_key TEXT NOT NULL UNIQUE,
                    created_at REAL NOT NULL
                )
                """
            )
            cur.execute(
                """
                CREATE TABLE IF NOT EXISTS fragments (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    session_id INTEGER NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
                    position INTEGER NOT NULL,
                    text TEXT NOT NULL
                )
                """
            )
            cur.execute(
                """
                CREATE TABLE IF NOT EXISTS packets (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    session_id INTEGER NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
                    label TEXT,
                    summary TEXT NOT NULL,
                    algorithm TEXT NOT NULL,
                    encoded_vector TEXT NOT NULL,
                    vector_json TEXT NOT NULL,
                    vector_size INTEGER NOT NULL,
                    fragment_count INTEGER NOT NULL,
                    created_at REAL NOT NULL
                )
                """
            )
            cur.execute(
                """
                CREATE TABLE IF NOT EXISTS packet_fragments (
                    packet_id INTEGER NOT NULL REFERENCES packets(id) ON DELETE CASCADE,
                    position INTEGER NOT NULL,
                    text TEXT NOT NULL,
                    PRIMARY KEY(packet_id, position)
                )
                """
            )

    def _ensure_session(self, session_key: str) -> int:
        now = time.time()
        with self._cursor() as cur:
            cur.execute(
                "INSERT OR IGNORE INTO sessions(session_key, created_at) VALUES (?, ?)",
                (session_key, now),
            )
            cur.execute("SELECT id FROM sessions WHERE session_key = ?", (session_key,))
            row = cur.fetchone()
        if row is None:  # pragma: no cover - defensive safeguard
            raise RuntimeError(f"Failed to materialise session for key '{session_key}'")
        return int(row[0])

    def _lookup_session_id(self, session_key: str) -> int | None:
        with self._cursor() as cur:
            cur.execute("SELECT id FROM sessions WHERE session_key = ?", (session_key,))
            row = cur.fetchone()
        return int(row[0]) if row is not None else None

    def _replace_session_fragments(self, session_id: int, fragments: Sequence[str]) -> None:
        with self._cursor() as cur:
            cur.execute("DELETE FROM fragments WHERE session_id = ?", (session_id,))
            if fragments:
                cur.executemany(
                    "INSERT INTO fragments(session_id, position, text) VALUES (?, ?, ?)",
                    ((session_id, index, text) for index, text in enumerate(fragments)),
                )

    # ------------------------------------------------------------------
    # public API
    # ------------------------------------------------------------------
    def store_fragments(self, session_key: str, fragments: Iterable[str]) -> list[str]:
        """Persist ``fragments`` for ``session_key`` and return the cleaned list."""

        cleaned = _clean_fragments(fragments)
        session_id = self._ensure_session(session_key)
        self._replace_session_fragments(session_id, cleaned)
        return cleaned

    def store_packet(
        self,
        session_key: str,
        packet: MemoryVectorPacket,
        *,
        fragments: Iterable[str] | None = None,
        label: str | None = None,
    ) -> int:
        """Persist ``packet`` and optionally refresh associated ``fragments``."""

        cleaned = _clean_fragments(fragments or []) if fragments is not None else None
        session_id = self._ensure_session(session_key)
        timestamp = time.time()
        with self._cursor() as cur:
            if cleaned is not None:
                cur.execute("DELETE FROM fragments WHERE session_id = ?", (session_id,))
                if cleaned:
                    cur.executemany(
                        "INSERT INTO fragments(session_id, position, text) VALUES (?, ?, ?)",
                        ((session_id, index, text) for index, text in enumerate(cleaned)),
                    )
            cur.execute(
                """
                INSERT INTO packets(
                    session_id,
                    label,
                    summary,
                    algorithm,
                    encoded_vector,
                    vector_json,
                    vector_size,
                    fragment_count,
                    created_at
                )
                VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?)
                """,
                (
                    session_id,
                    label,
                    packet.summary,
                    packet.algorithm,
                    packet.encoded_vector,
                    json.dumps(packet.decode()),
                    packet.vector_size,
                    packet.fragment_count,
                    timestamp,
                ),
            )
            packet_id = int(cur.lastrowid)
            if cleaned is not None:
                cur.execute("DELETE FROM packet_fragments WHERE packet_id = ?", (packet_id,))
                if cleaned:
                    cur.executemany(
                        "INSERT INTO packet_fragments(packet_id, position, text) VALUES (?, ?, ?)",
                        ((packet_id, index, text) for index, text in enumerate(cleaned)),
                    )
        return packet_id

    def store_from_fragments(
        self,
        session_key: str,
        fragments: Iterable[str],
        *,
        label: str | None = None,
        algorithm: str = "gzip",
    ) -> int:
        """Create and persist a packet derived from ``fragments``."""

        packet = build_memory_vector_packet(fragments, algorithm=algorithm)
        return self.store_packet(session_key, packet, fragments=fragments, label=label)

    def list_sessions(self) -> list[str]:
        """Return all session keys known to the store."""

        with self._cursor() as cur:
            cur.execute("SELECT session_key FROM sessions ORDER BY session_key")
            rows = cur.fetchall()
        return [str(row[0]) for row in rows]

    def list_packets(self, session_key: str | None = None) -> list[SQLPacketRecord]:
        """Return packets optionally filtered by ``session_key``."""

        if session_key is None:
            query = """
                SELECT p.id, s.session_key, p.label, p.summary, p.algorithm,
                       p.vector_size, p.fragment_count, p.created_at
                FROM packets AS p
                JOIN sessions AS s ON s.id = p.session_id
                ORDER BY p.created_at DESC, p.id DESC
            """
            params: Sequence[object] = ()
        else:
            query = """
                SELECT p.id, s.session_key, p.label, p.summary, p.algorithm,
                       p.vector_size, p.fragment_count, p.created_at
                FROM packets AS p
                JOIN sessions AS s ON s.id = p.session_id
                WHERE s.session_key = ?
                ORDER BY p.created_at DESC, p.id DESC
            """
            params = (session_key,)
        with self._cursor() as cur:
            cur.execute(query, params)
            rows = cur.fetchall()
        return [
            SQLPacketRecord(
                packet_id=int(row[0]),
                session_key=str(row[1]),
                label=row[2] if row[2] is None else str(row[2]),
                summary=str(row[3]),
                algorithm=str(row[4]),
                vector_size=int(row[5]),
                fragment_count=int(row[6]),
                created_at=float(row[7]),
            )
            for row in rows
        ]

    def load_packet(self, packet_id: int) -> MemoryVectorPacket:
        """Return the stored packet instance for ``packet_id``."""

        with self._cursor() as cur:
            cur.execute(
                """
                SELECT summary, algorithm, encoded_vector, vector_size, fragment_count
                FROM packets
                WHERE id = ?
                """,
                (packet_id,),
            )
            row = cur.fetchone()
        if row is None:
            raise KeyError(f"Packet {packet_id} not found")
        return MemoryVectorPacket(
            summary=str(row[0]),
            algorithm=str(row[1]),
            encoded_vector=str(row[2]),
            vector_size=int(row[3]),
            fragment_count=int(row[4]),
        )

    def packet_payload(self, packet_id: int) -> Mapping[str, object]:
        """Return the JSON payload stored for ``packet_id``."""

        packet = self.load_packet(packet_id)
        payload = packet.as_dict()
        payload["packet_id"] = packet_id
        return payload

    def get_fragments_for_packet(self, packet_id: int) -> list[str]:
        """Return fragments captured when ``packet_id`` was stored."""

        with self._cursor() as cur:
            cur.execute(
                "SELECT text FROM packet_fragments WHERE packet_id = ? ORDER BY position",
                (packet_id,),
            )
            rows = cur.fetchall()
        if rows:
            return [str(row[0]) for row in rows]
        with self._cursor() as cur:
            cur.execute(
                """
                SELECT f.text
                FROM fragments AS f
                JOIN packets AS p ON p.session_id = f.session_id
                WHERE p.id = ?
                ORDER BY f.position
                """,
                (packet_id,),
            )
            rows = cur.fetchall()
        return [str(row[0]) for row in rows]

    def get_session_fragments(self, session_key: str) -> list[str]:
        """Return all stored fragments for ``session_key``."""

        session_id = self._lookup_session_id(session_key)
        if session_id is None:
            return []
        with self._cursor() as cur:
            cur.execute(
                "SELECT text FROM fragments WHERE session_id = ? ORDER BY position",
                (session_id,),
            )
            rows = cur.fetchall()
        return [str(row[0]) for row in rows]

    def search_similar(
        self,
        vector: Sequence[float],
        *,
        limit: int = 5,
        min_similarity: float = 0.0,
    ) -> list[dict[str, object]]:
        """Return packets ranked by cosine similarity to ``vector``."""

        if not vector:
            return []
        query = [float(item) for item in vector]
        with self._cursor() as cur:
            cur.execute(
                """
                SELECT p.id, s.session_key, p.label, p.vector_json
                FROM packets AS p
                JOIN sessions AS s ON s.id = p.session_id
                """
            )
            rows = cur.fetchall()
        results: list[dict[str, object]] = []
        for row in rows:
            stored = json.loads(str(row[3]))
            score = cosine_similarity(query, stored)
            if score < min_similarity:
                continue
            results.append(
                {
                    "packet_id": int(row[0]),
                    "session_key": str(row[1]),
                    "label": row[2] if row[2] is None else str(row[2]),
                    "similarity": score,
                }
            )
        results.sort(key=lambda item: float(item["similarity"]), reverse=True)
        return results[:limit]

"""FastAPI application exposing the SQL-backed memory vector store."""

from __future__ import annotations

from typing import Sequence

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field, model_validator

from memory_sql_store import MemorySQLStore
from memory_vector_bridge import build_memory_vector_packet, fragments_to_embedding

app = FastAPI(title="Aeon Memory SQL API", version="1.0.0")
store = MemorySQLStore()


class FragmentRequest(BaseModel):
    """Request model storing fragments for a session."""

    fragments: list[str] = Field(..., description="Ordered memory fragments")


class PacketCreateRequest(FragmentRequest):
    """Request payload for storing a packet created from fragments."""

    session_key: str = Field(..., description="Identifier for the fragment collection")
    label: str | None = Field(default=None, description="Optional human readable label")
    algorithm: str = Field(default="gzip", description="Encoding algorithm for the vector")


class SearchRequest(BaseModel):
    """Payload for cosine similarity searches against stored packets."""

    vector: list[float] | None = Field(default=None, description="Explicit vector for similarity search")
    fragments: list[str] | None = Field(default=None, description="Fragments used to derive a search vector")
    limit: int = Field(default=5, ge=1, le=50, description="Number of results to return")
    min_similarity: float = Field(default=0.0, ge=0.0, le=1.0, description="Minimum similarity threshold")

    @model_validator(mode="after")
    def _validate_vector(self) -> "SearchRequest":
        if self.vector is None and self.fragments is None:
            raise ValueError("Either 'vector' or 'fragments' must be provided")
        if self.vector is not None and self.fragments is not None:
            raise ValueError("Provide only one of 'vector' or 'fragments'")
        return self


def _ensure_vector(sequence: Sequence[float]) -> list[float]:
    try:
        return [float(item) for item in sequence]
    except (TypeError, ValueError) as exc:  # pragma: no cover - defensive guard
        raise HTTPException(status_code=400, detail=str(exc)) from exc


@app.post("/sessions/{session_key}/fragments")
def update_fragments(session_key: str, request: FragmentRequest) -> dict[str, object]:
    """Store fragments for ``session_key`` and return the stored count."""

    cleaned = store.store_fragments(session_key, request.fragments)
    return {"session_key": session_key, "count": len(cleaned)}


@app.post("/packets", response_model=dict)
def create_packet(request: PacketCreateRequest) -> dict[str, object]:
    """Build a :class:`MemoryVectorPacket` from fragments and persist it."""

    packet = build_memory_vector_packet(request.fragments, algorithm=request.algorithm)
    packet_id = store.store_packet(
        request.session_key,
        packet,
        fragments=request.fragments,
        label=request.label,
    )
    payload = packet.as_dict()
    payload["packet_id"] = packet_id
    payload["session_key"] = request.session_key
    payload["label"] = request.label
    return payload


@app.get("/packets")
def list_packets(session_key: str | None = None) -> dict[str, object]:
    """Return stored packets optionally filtered by ``session_key``."""

    records = [record.as_dict() for record in store.list_packets(session_key)]
    return {"results": records}


@app.get("/packets/{packet_id}")
def get_packet(packet_id: int) -> dict[str, object]:
    """Return a stored packet and its associated fragments."""

    try:
        packet = store.load_packet(packet_id)
    except KeyError as exc:  # pragma: no cover - defensive guard
        raise HTTPException(status_code=404, detail=str(exc)) from exc
    fragments = store.get_fragments_for_packet(packet_id)
    payload = packet.as_dict()
    payload["packet_id"] = packet_id
    payload["fragments"] = fragments
    return payload


@app.post("/search")
def search_packets(request: SearchRequest) -> dict[str, object]:
    """Perform cosine similarity search against stored packets."""

    if request.vector is not None:
        vector = _ensure_vector(request.vector)
    else:
        vector = fragments_to_embedding(request.fragments or [])
    if not vector:
        raise HTTPException(status_code=400, detail="Unable to derive a search vector")
    results = store.search_similar(vector, limit=request.limit, min_similarity=request.min_similarity)
    return {"results": results}


@app.on_event("shutdown")
def shutdown_event() -> None:
    """Ensure the SQLite connection is closed on shutdown."""

    store.close()


__all__ = ["app", "store"]

Wait… I thought memory was hard…