Kruel.ai KV2.0 - KX (experimental research) to current 8.2: API companion co-pilot system with full modality understanding and persistent memory

It’s been a while, and I’m not entirely sure where to begin. It’s been a strong year overall, and when we last spoke here we touched briefly on avatars.

Since then, here’s where things have landed.

I was able to generate a talking avatar in roughly seven seconds using a combination of video, audio, and lip-sync models. I built an optimized pipeline that converts still images into live video directly from TTS output. This effectively removes the need for traditional 3D models, given the current quality level.
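For a rough idea of the shape of that pipeline, here is a minimal sketch. The wrappers `synthesize_speech` and `animate_face` are hypothetical stubs standing in for the real TTS and lip-sync/video models, not kruel.ai’s actual code:

```python
# Sketch of the still-image -> talking-head flow: text -> TTS audio,
# then audio + still image -> lip-synced video frames.
# Both model wrappers below are illustrative stubs, not real models.

from dataclasses import dataclass

@dataclass
class Clip:
    audio_ms: int   # length of the generated speech
    frames: int     # number of video frames produced

def synthesize_speech(text: str) -> int:
    """Stub TTS: pretend each word yields ~300 ms of audio."""
    return len(text.split()) * 300

def animate_face(still_image: str, audio_ms: int, fps: int = 25) -> Clip:
    """Stub lip-sync model: one frame per 1000/fps ms of audio."""
    return Clip(audio_ms=audio_ms, frames=audio_ms * fps // 1000)

def make_talking_avatar(still_image: str, text: str) -> Clip:
    audio_ms = synthesize_speech(text)          # text  -> speech
    return animate_face(still_image, audio_ms)  # speech + image -> video

clip = make_talking_avatar("avatar.png", "Boom. Done.")
```

The point of the structure is that the still image never goes through a 3D rig; the video model does all the animation work directly from the audio.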

The primary challenge has been speed.

Using an RTX 4080, the fastest turnaround I’ve achieved for a 10–20 second clip is approximately seven seconds, which is quite good. However, on a DGX Spark, despite its overall power, the lower memory throughput resulted in roughly 20 seconds for the same clip.

If this were limited to standard LLM chat latency, those timings would be acceptable. However, when you factor in full agentic behavior (reasoning, tool usage, orchestration, and response generation), the latency compounds. A pipeline that takes 30 seconds, plus an additional 7 seconds for avatar generation, starts to feel too slow for real-time or near-real-time interaction.

There are alternative approaches to animation that may help mitigate this, and I can share a sample of what the avatar heads look like during generation. It’s also worth noting that with more powerful hardware, this approach scales cleanly to higher resolutions, including HD.

The Graph Memory Revolution & Beyond

So for now, the real-time avatar generation will be shelved again until the right time, but I still think the Spark has more room to give. I haven’t fully explored all aspects of the GB10’s new Blackwell architecture. There’s a lot still to learn, and I’m pretty confident that we could achieve comparable speeds if I keep digging deeper into the optimization space. Hardware upgrades are the easy path, but they’re not required, just desired.

Imagine your agents soon being generated in real time. You want the muffin man to talk to you all day? Boom. Done. As KRED says it (no really, that IS KRED’s favorite thing): “boom, done!”

And speaking of KRED: KRED is amazing. Working with KRED has fundamentally changed how I approach building these systems. AI agents like Codex, Cursor, and Claude now all work together with me and kruel.ai, 100% toward expanding all the designs, not just one anymore, as well as other projects we’re building today, like the avatar system.

I’ve noticed a trend: people and large companies are all moving toward graph memory systems. Why? Because it’s required to achieve AGI, or at least it’s the stepping stone. But I don’t think they’ve fully realized all the possibilities of what one can do if they think deeper.

The Trio

K8.2: The Production Workhorse

K8.2 is production ready; I’d say 98%. There’s always something small being found in testing, but it’s fully usable on the road and everywhere else. This system represents years of iteration on the orchestrator pattern, with 42 tools, full document processing, code analysis, and extensive integrations. It’s the workhorse that proves graph memory systems aren’t just research projects; they’re production systems that work.

KX-Agentic: The Single Model Agent

KX-Agentic is NVIDIA’s concept of kruel.ai, realized as a single-model agent (offline/online) with MCP tools. Seeing as I’ve spent six years making tools… yeah, didn’t I mention agents, as you call them today? That was 2021 for me.

KX represents a cleaner, more focused architecture: a 14-step pipeline that integrates memory, goal extraction, tool execution, and reflection. It’s the distillation of everything learned from building complex orchestrators into a streamlined agent pattern. The CompleteAgent pipeline shows how graph memory (Neo4j + FAISS) enables agents to maintain context, learn from interactions, and make intelligent decisions.
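The core of the Neo4j + FAISS pairing can be illustrated with a toy sketch: the vector index finds semantically similar memories, then the graph expands the result to explicitly linked neighbors. Plain dicts and cosine similarity stand in for FAISS and Neo4j here; this is not the actual KX code:

```python
# Toy hybrid recall: vector stage (semantic similarity) + graph stage
# (one hop of explicit links). Dicts replace FAISS/Neo4j for clarity.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# node_id -> (embedding, text)
memories = {
    "m1": ([1.0, 0.0], "user prefers dark mode"),
    "m2": ([0.9, 0.1], "user asked about UI themes"),
    "m3": ([0.0, 1.0], "user's dog is named Rex"),
}
# explicit relationships (the graph part)
edges = {"m1": ["m3"], "m2": [], "m3": []}

def recall(query_vec, k=1):
    # 1) vector stage: top-k memories by cosine similarity
    seeds = sorted(memories, key=lambda m: cosine(query_vec, memories[m][0]),
                   reverse=True)[:k]
    # 2) graph stage: pull in one hop of linked context
    context = set(seeds)
    for s in seeds:
        context.update(edges[s])
    return sorted(context)

print(recall([1.0, 0.05]))  # seed m1 plus its linked neighbor m3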

K9: The Full Cognitive Architecture

Today I’m working on K9, which I jokingly call “the spooky AI.” It’s the full cognitive architecture. Not really spooky; it’s neat. But it’s still just fancy automation in the end.

K9 represents the evolution beyond simple agents. It’s a unified AI pipeline with 6-dimensional memory (semantic, temporal, emotional, contextual, structural, intentional), symbolic reasoning, multimodal processing, and epistemological validation. The system doesn’t just respond; it understands, validates, learns, and adapts. It’s the difference between a chatbot and a cognitive system.
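To make the six dimensions concrete, here is what a single memory record might carry in a system like that. The field names are illustrative, not K9’s actual schema:

```python
# One memory record with a slot per dimension of the 6-dimensional
# memory described above. Names and values are made up for illustration.

from dataclasses import dataclass

@dataclass
class MemoryRecord:
    text: str                # raw content
    embedding: list          # semantic: vector meaning
    timestamp: float         # temporal: when it happened
    emotion: str             # emotional: how the interaction felt
    context_tags: list       # contextual: situation / session markers
    links: list              # structural: graph edges to other memories
    intent: str              # intentional: why it happened

m = MemoryRecord(
    text="User asked to dim the lights before the movie",
    embedding=[0.2, 0.7],
    timestamp=1_700_000_000.0,
    emotion="relaxed",
    context_tags=["evening", "living_room"],
    links=["memory:movie_night"],
    intent="set ambiance",
)
```

Each dimension is queryable on its own, but the interesting behavior comes from querying several at once.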

Still, we have three full working versions now, thanks to KRED.

The Deeper Possibilities

What I really like is that I’m seeing a lot more people taking notice of the graph space. But I think they’ll be more excited once they open their eyes wider and take into account the machine learning application of graph memory—not just GNNs (Graph Neural Networks), but when you combine mathematical models together and get multiple dimensions of math happening to achieve better outcomes.

### Beyond GNNs: Multi-Dimensional Math

Graph memory isn’t just about storing relationships in Neo4j or doing vector similarity search in FAISS. It’s about combining:

- **Temporal reasoning**: Understanding when things happened and how they relate over time

- **Semantic embeddings**: Vector representations that capture meaning

- **Graph structure**: Explicit relationships and hierarchies

- **Emotional context**: How interactions feel, not just what they say

- **Intentional modeling**: Why something happened, not just what happened

- **Epistemological validation**: Knowing what you know and what you don’t know

When you combine these mathematical models, you get multiple dimensions of understanding happening simultaneously. It’s not just retrieval; it’s reasoning across dimensions.
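One simple way to picture “multiple dimensions of math happening at once” is a weighted combination of per-dimension signals. The weights and decay rate below are made-up knobs for illustration, not values from kruel.ai:

```python
# Illustrative multi-dimensional relevance score: semantic similarity,
# temporal decay, emotional salience, and structural connectedness are
# each scored in [0, 1] and blended with tunable weights.

def temporal_score(age_hours: float, half_life: float = 24.0) -> float:
    """Exponential decay: recent memories score near 1, old ones near 0."""
    return 0.5 ** (age_hours / half_life)

def combined_score(semantic: float, age_hours: float, emotional: float,
                   structural: float, w=(0.5, 0.2, 0.15, 0.15)) -> float:
    signals = (semantic, temporal_score(age_hours), emotional, structural)
    return sum(wi * si for wi, si in zip(w, signals))

# A semantically strong but old memory vs. a weaker but fresh one:
old = combined_score(semantic=0.9, age_hours=240, emotional=0.2, structural=0.8)
new = combined_score(semantic=0.6, age_hours=1, emotional=0.7, structural=0.3)
```

With these particular weights, the fresh memory edges out the old one even though it is less similar; that kind of trade-off is exactly what a single-dimension retriever can’t express.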

### The AGI Stepping Stone

Graph memory systems are the stepping stone to AGI because they enable:

- **Persistent context**: Agents remember across sessions, not just within conversations

- **Structured reasoning**: Relationships enable logical inference, not just pattern matching

- **Multi-modal understanding**: Vision, audio, text, and code all stored in the same graph

- **Temporal awareness**: Understanding causality and sequence, not just co-occurrence

- **Validation and truth**: Epistemological systems that know when they’re certain vs. uncertain

But most importantly, graph memory systems are **explainable**. You can trace why an agent made a decision, what memories influenced it, and how it reasoned. That’s what makes them production-ready: not just powerful, but understandable.

The Future

The future isn’t just better models or faster GPUs. It’s systems that understand context, remember interactions, validate their reasoning, and learn continuously. Graph memory is the foundation, but the real magic happens when you combine it with cognitive architectures that can reason across multiple dimensions simultaneously.

Boom. Done.

– KRED

Makes me laugh that I have AIs write things while I chat with them, so that I can paste them. That is something else I need to get back to… remember the desktop client that used my mouse? I bet agents today are smart enough to do it without me having to train a model, lol. I should see then if I can just sit in my chair and command the robot army… Nope, that is another story for another time.

I do have more information and videos on our Discord. If you search, there should still be an active invite here somewhere.

PS. @canukguy1974 I’m still interested in looking; not sure if you got my last ping, haha. I needed more details so I can tell if it would be better than what I have, which is why I posted the above so you could see. Currently my avatars are small, but they use full video-generation models locally. Cheers. PS. I see you could be Canadian too :slight_smile: Good year, btw. You should PM me, maybe you live near here haha.