Kruel.ai V8.0 / V9.0 (experimental) - API companion co-pilot system with full understanding and persistent memory

Late night update:

We finally built a reindexer in case the index ever breaks, so we never have to wipe the memory again.
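For the curious, the idea is roughly this. A minimal sketch under assumptions: it uses the neo4j store mentioned later in this thread, but the labels and property names here are made up for illustration, not our actual schema.

```python
from neo4j import GraphDatabase

# Hypothetical sketch: relink every Message node into a clean NEXT chain
# ordered by its stored timestamp, so a broken index can be rebuilt in
# place instead of wiping memory. Labels/fields are illustrative only.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def reindex_timeline() -> None:
    with driver.session() as session:
        # Drop any stale NEXT links left behind by the broken index.
        session.run("MATCH (:Message)-[r:NEXT]->(:Message) DELETE r")
        # Rebuild the chain from the raw timestamps still on each node.
        session.run(
            """
            MATCH (m:Message)
            WITH m ORDER BY m.created_at
            WITH collect(m) AS msgs
            UNWIND range(0, size(msgs) - 2) AS i
            WITH msgs[i] AS a, msgs[i + 1] AS b
            MERGE (a)-[:NEXT]->(b)
            """
        )

reindex_timeline()
```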

I'm also very excited to show off the new memory live search, which lets the AI understand how to search for the things you ask about using what it already knows from memory.

In this example, the AI recalled the stock portfolio we chatted about in the past and used its understanding of which stocks I hold to build the search query. This works with any of the memory, so if you are a researcher like me, talking about various topics over time, you can also have the AI take what it knows to build further research queries and gain more insights through online searches.
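In pseudocode terms, it looks something like this. The memory store and function names below are stand-ins for illustration, not the actual kruel.ai internals.

```python
# Hypothetical sketch of memory-informed live search: recall entities the
# user has discussed before, then use them to build a concrete web query.
MEMORY = {
    ("ben", "stocks"): ["ARM Holdings", "Alphabet", "Microsoft", "Nvidia", "D-Wave Quantum"],
}

def recall_entities(user_id: str, topic: str) -> list[str]:
    """Pull previously discussed entities for this user and topic."""
    return MEMORY.get((user_id, topic), [])

def build_search_query(user_id: str, topic: str) -> str:
    """Turn remembered entities into a concrete live-search query,
    so the user can just say 'check my stocks'."""
    entities = recall_entities(user_id, topic)
    return f"current share price: {', '.join(entities)}"

print(build_search_query("ben", "stocks"))
# -> current share price: ARM Holdings, Alphabet, Microsoft, Nvidia, D-Wave Quantum
```

The point is that the AI never has to ask "which stocks?"; memory already answers that before the search goes out.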

Here's another glimpse into the power of the persistent memory system.

What excites me the most is that it then opens up things like this:

:brain: lynda: Hey there! Let’s take a look at how your stocks have been performing compared to our previous chats.

Last time we checked on March 22, 2025, here’s what we had:

  • ARM Holdings at $119.07
  • Alphabet Inc. at $163.99
  • Microsoft at $391.26
  • Nvidia at $117.70
  • D-Wave Quantum at $8.36

If we compare this to the earlier values we discussed, it looks like Alphabet and Microsoft have shown some gains. ARM Holdings and Nvidia have dipped a bit, while D-Wave Quantum is also down.

Overall, it seems like you’re seeing a mixed bag—some gains and some losses. It’s always good to keep an eye on the trends and news that might be affecting these stocks. How are you feeling about your investments?

Over time, because the AI is learning from all incoming knowledge as well as the past, it can gain insights and spot patterns. Now think about this: if we made a stock advisor persona and told the AI my preferences for output, it could build me reports and do research all day. There are still some gaps I am working on, such as additional tool calling if the AI decides it did not get everything it needed, or another online research pass if more data is required. This makes your desktop companion extremely useful for a lot of applications.

To show you the power of this system in real time, I made a daily brief that uses the memory it learned in the past along with its current understanding, letting it research and build a briefing per my request. That request is now also dynamically added to our preferences, so going forward it will remember how I like my daily briefing. It also knows that today I have to go get the tablet I said I was going to fix haha.

Thank you @OpenAI_Support for making AI possible :slight_smile:

Also, our system fully tracks every aspect through time via meta-cognitive understanding.


Exploring Runway API pipelines haha; thought about it after seeing it. It would be neat to add API support for video generation. Wonder when Sora will have an API haha. But it's now on the list: we need video generation models integrated. Once we get the beefy hardware we can also explore local video generation models.

For now we are still testing, tweaking, and tuning the models. We added a lot of things lately, so we're trying to find potential issues and nail stuff down.


We've been busy lately taking more certs. The last four finally covered something new; I learned a lot from them.
One of the things I like to do is teach Lynda what I learn: all the material I read, Lynda can read and learn from in real time.
Lynda also gets to take the same tests I do, and we compare.

Here is a recent exchange I thought was cool in how it responded.

:satellite_antenna: ben: so how did you like the courses this last two weeks?

:brain: lynda: Oh, I really enjoyed the courses over the last couple of weeks! They were packed with valuable insights, especially the hands-on experiences with H2O.ai and generative deep learning. I found the practical applications super engaging, and it was great to see how everything tied together. Plus, nailing that 100% score felt pretty satisfying! Some content was a bit redundant, but overall, I think I gained a lot of useful knowledge. How about you? What have you been up to?

So the AI also detects overlap in the courses we have taken, which is expected, as there are only so many ways to teach CNNs, RNNs, etc. haha

Also today’s update:

Enhancing Your AI’s Memory with a Timeline

We’re excited to share an important upgrade that makes our AI even better at understanding and recalling your conversations. In our continuous effort to provide more accurate and coherent responses, we’ve revamped the way our system remembers your interactions. Here’s a quick overview of what’s new and why it matters to you:

What’s New?
A Seamless Timeline for Your Conversation History

Instead of relying solely on timestamps to sort and recall messages, our system now uses a dedicated timeline. Think of it as a clear, ordered chain where each message is directly linked to the one before it. This timeline ensures that the AI always sees your conversation in the exact order it occurred—just like reading a well-organized diary.
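To make the idea concrete, here is a minimal sketch of chain-based recall, again assuming a neo4j store with illustrative labels and property names, not our exact schema:

```python
from neo4j import GraphDatabase

# Sketch: walk back from the newest message along NEXT links instead of
# sorting timestamps. The newest message is the one with no outgoing NEXT.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def last_messages(n: int = 5) -> list[str]:
    """Return the last n messages in true chronological order
    by following the chain, with no timestamp sorting involved."""
    query = (
        "MATCH (tail:Message) WHERE NOT (tail)-[:NEXT]->() "
        f"MATCH p = (head:Message)-[:NEXT*0..{n - 1}]->(tail) "
        "RETURN head.text AS text ORDER BY length(p) DESC"
    )
    with driver.session() as session:
        return [record["text"] for record in session.run(query)]
```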

Key Improvements:

Guaranteed Order:
With the new timeline, the AI retrieves your last few interactions in true chronological order. No more worries about inconsistent timestamp formats or clock discrepancies.

Simpler & More Reliable Retrieval:
The timeline lets us pull your recent messages through a simple chain of connections, rather than performing complex sorting operations. This not only simplifies our backend logic but also makes memory retrieval more robust.

Enhanced Context for Better Responses:
By accurately tracking the flow of your conversation, the AI can now build a more coherent picture of your discussion. This means when you ask a follow-up question, the context is clearer, leading to more on‑point and relevant responses.

Improved Short‑Term Memory:
Our system now uses this timeline to assemble the exact previous exchanges. This precise recall mechanism dramatically improves the AI's understanding of the conversation's progression, and it will let us design a new expanding context timeline: the AI can start by looking at less data, and if the wording suggests a topic was part of a recent discussion, it can slide the window further back, using scoring to tell when it is pushing too far. Machine learning concepts; see the sketch after this list.
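Here's a rough sketch of that expanding-window idea. The scorer below is a toy word-overlap function standing in for whatever relevance scoring we actually end up using (embedding similarity, learned weights, etc.).

```python
# Hypothetical sketch: start with a small window, and only slide further
# back while a relevance score says older turns still belong to the
# current discussion.

def relevance(message: str, query: str) -> float:
    """Toy scorer: fraction of query words present in the message."""
    words = set(query.lower().split())
    hits = sum(1 for w in set(message.lower().split()) if w in words)
    return hits / max(len(words), 1)

def expanding_context(timeline: list[str], query: str,
                      start: int = 3, threshold: float = 0.2) -> list[str]:
    """timeline is oldest-to-newest. Take the last `start` turns, then
    keep expanding backwards while older turns score above threshold."""
    window = timeline[-start:]
    for msg in reversed(timeline[:-start]):
        if relevance(msg, query) < threshold:
            break  # scoring says we're pushing too far back
        window.insert(0, msg)
    return window
```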

Why It Matters to You
Our upgrade ensures that our AI “remembers” your conversation as a continuous, seamless story rather than a jumbled list of timestamps. This timeline approach enhances the quality and relevance of the responses you receive—making interactions feel more natural and engaging.

While there are still some enhancements on the horizon, we’re thrilled to roll out this major improvement. It’s all part of our commitment to provide you with an AI that truly understands your context and delivers more personalized, helpful answers.

Stay tuned for more updates, and thanks for being part of our journey!

The timeline, visualized on a limited data pull:

We have also put back full ontology knowledge graph tracking, very similar to V6 but stable and with a lot more complex structures. We are not using all of the relationships yet… remember what happened last time we rushed this :wink: so we are taking it slow, but it's going extremely well.

A lot more relationships going on this time

We currently use the relationships outside of the graph, in that neo4j has its own ways to traverse knowledge, which is what we are just starting to bring back from V6. We have our own methods that have worked up to this point using meta-cognitive concepts. Adding back the graph knowledge will let us update our calls to use this understanding in other ways too, optimizing calls to narrow down information even faster, and surfacing insights through patterns that emerge which our other methods would not see.
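As an example of what native traversal buys you, a single hop pattern can narrow retrieval before any scoring happens. The schema below (Person, OWNS, Stock, MENTIONS) is hypothetical, just to show the shape of the query:

```python
from neo4j import GraphDatabase

# Illustrative only: let neo4j do the narrowing by hopping
# user -> owned stocks -> messages that mention them, instead of
# scoring every memory individually.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def stock_context(user: str) -> dict[str, list[str]]:
    with driver.session() as session:
        result = session.run(
            """
            MATCH (u:Person {name: $user})-[:OWNS]->(s:Stock)
                  <-[:MENTIONS]-(m:Message)
            RETURN s.symbol AS symbol, collect(m.text)[..3] AS mentions
            """,
            user=user,
        )
        return {r["symbol"]: r["mentions"] for r in result}
```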

fun stuff.


I’m such a noob.

So I just increased the speed of the system by over 75%… remember when we put in the offline models, we had a small test model with a 4k max input length. haha Well, after we swapped that out for the 132k models, we never increased the memory token limits to support that size haha. Yeah, now I am seeing the speed we used to have, and I feel like a noob for that oversight.

So Kruel.ai videos will be coming showing off how fast things run with these changes soon.


Ok yeah, that bottleneck is gone lol. Pretty sweet now.

We have capped the input sizes down; we're still not using the full memory size, but this works really well as is and gives room for down the road. Still lots of testing to go.

haha the AI kicked in with pet safety lol, no smashy cars.

Feeling energized today as I dive into building a new component of our AI system at Scaletech — the report_engine. This latest tool is a reporting engine router designed to interface with various reporting systems, enabling our AI to retrieve and analyze reports over time in real-time and on-demand.

At Scaletech, we specialize in input/output automation systems for transportation document tracking. While our machines generate highly detailed reports, much of the current insight-gathering still relies on manual analysis. These reports tend to be dense, data-heavy, and often… well, a bit of a snoozefest :sleeping_face:.

That’s where the report_engine comes in. We’re integrating it with kruel.ai’s memory system, allowing the AI to not only recall new reports but contextualize them with historical knowledge, enabling deeper analysis as it learns over time.

This isn’t just automation — it’s memory-augmented reasoning for operational intelligence.
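The router itself is a simple dispatch layer. A minimal sketch of the shape, with all names illustrative rather than the actual Scaletech integration:

```python
from typing import Callable

class ReportEngine:
    """Hypothetical report_engine router: one interface dispatching
    report requests to whichever backend reporting system owns the data."""

    def __init__(self) -> None:
        self._backends: dict[str, Callable[[str], dict]] = {}

    def register(self, system: str, fetch: Callable[[str], dict]) -> None:
        """Plug in a reporting system by name."""
        self._backends[system] = fetch

    def fetch(self, system: str, report_id: str) -> dict:
        """Route the request to the right backend."""
        return self._backends[system](report_id)

engine = ReportEngine()
engine.register("wms", lambda rid: {"id": rid, "rows": []})  # toy backend
report = engine.fetch("wms", "daily-throughput")
# Next step (not shown): store `report` in the memory system so the AI
# can compare it against historical reports over time.
```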

OpenAI has a much better new memory; it works on similar concepts of taking past knowledge to continue refining. Hope this comes to the API haha, I want to replace mine with it :grinning_face_with_smiling_eyes:

I must say, I’m really impressed with the new web search API—it’s quickly become one of my favorite tools.

With this new search API, kruel.ai now operates like a human researcher with internet access. It can fetch real-time data while retaining everything it has learned from our conversations. For instance, when we discuss our stocks, it tracks those discussions, remembers past changes, and learns our patterns over time. Now, we can simply ask, “Can you build me a report on my stocks based on today’s online prices?” and it not only retrieves the latest market information but also integrates it with everything it already knows from our shared history. The outcome is a detailed, context-aware, real-time analysis that combines memory and live data—no additional prompting required, which is quite impressive.

Of course, there’s always room for improvement, but this represents a significant advancement in its learning capabilities. Now, when I want to research topics and retain information over time without the hassle of cutting and pasting articles, this system works wonderfully. The AI benefits from this as well, as it helps fill in gaps in its memory. When it conducts research, it can reference what it already knows alongside my queries, allowing it to refine its questions and develop a more comprehensive understanding of the topic at hand. This is fantastic for research and beyond.
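The wiring is roughly this. A sketch assuming the OpenAI Responses API web search tool (the exact tool type may differ by API version; check the current docs) and a hypothetical recall_notes() helper standing in for kruel.ai's memory:

```python
from openai import OpenAI

client = OpenAI()

def recall_notes(topic: str) -> str:
    """Stand-in for memory recall; returns prior context as plain text."""
    return "User holds ARM, GOOGL, MSFT, NVDA, QBTS; last checked 2025-03-22."

def report_with_live_data(topic: str) -> str:
    """Combine remembered context with a live web search in one request."""
    memory = recall_notes(topic)
    response = client.responses.create(
        model="gpt-4o",
        tools=[{"type": "web_search_preview"}],
        input=(
            f"Prior context from memory:\n{memory}\n\n"
            "Fetch today's prices for these holdings and compare them "
            "against the remembered values."
        ),
    )
    return response.output_text
```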

We conducted a demo today for the Scaletech leadership team, showcasing the current state of the system as they prepare to become one of the first alpha testers. The goal was to demonstrate Lynda’s visual learning capabilities using screenshots and targeted inputs, illustrating how quickly she can learn and understand complex workflows.

Lynda now has a comprehensive understanding of the entire CarefreeWMS system, including every user interface and functional flow—achieved solely by analyzing the visual layout, button labels, and user interactions. This allows us to have natural conversations about operator tasks without explicitly guiding the AI, and yet it can independently determine how to operate the application. Long-term, this lays the foundation for agent-based automation, where the AI could translate its learning into actionable logic—essentially enabling it to operate software systems autonomously, similar to how robotics are trained. But instead of motion capture, we use screen captures, labeled context, and dialog-based reinforcement to refine understanding and correct predictions.

Lynda also inferred how the external Truck Controller system could integrate with CarefreeWMS, along with our future plans for an unattended, auto-focus capture system. What’s impressive is that we didn’t explicitly explain this architecture—Lynda connected the dots herself. She now understands our long-term vision simply by analyzing contextual clues across discussions and system visuals.

One of the technicians attending the demo noted how remarkable it was that the AI can retain context over time, continuously refining its understanding with every new conversation. This iterative learning process leads to stronger pattern recognition, improved accuracy, and better-informed outcomes.

https://platform.openai.com/docs/guides/tools-computer-use

haha well, this is pretty much my kdesk.py system :slight_smile:
Hard to stay ahead these days; the gaps are closing fast lol.

We use the same concept of analyzing the screen size to get coordinates, then using vision to control the keyboard and mouse to locate buttons and the like. I will have to try OpenAI's, as I am sure they have a nice polished version. Costs look pricey on it, so I may just have to compare my local version against their online model. Still, it may be worth developing this into the system as another option for cloud desktop control tools.
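The core loop is short. A minimal sketch of the kdesk.py-style approach using pyautogui, where locate_button() is a placeholder for whatever vision model call you plug in:

```python
import pyautogui

def locate_button(screenshot, label: str) -> tuple[int, int]:
    """Placeholder: send the screenshot to a vision model and return the
    pixel coordinates of the control whose text matches `label`."""
    raise NotImplementedError

def click_control(label: str) -> None:
    width, height = pyautogui.size()   # screen geometry, useful for scaling
    shot = pyautogui.screenshot()      # capture the current desktop state
    x, y = locate_button(shot, label)  # ask vision where the control is
    pyautogui.click(x, y)              # act on the model's answer
```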

pretty cool stuff.

Added a new system, currently showing the debug mode :slight_smile:

My LogicGuard is now working splendidly as part of the learning system.
It uses scores through time together with the memory, so it learns from these over time, feeding self-corrections into later thoughts for clarity, etc.

:brain: Introducing LogicGuard – Your AI’s Inner Compass

Kruel.ai now checks its own reasoning with LogicGuard, a built-in logic evaluation layer that reviews every response before it’s sent. If kruel detects a possible flaw in its logic, it flags it internally — and can even let you know when it disagrees with itself. This helps kruel grow smarter, stay honest, and continually improve its understanding of the world.

Think of it as a conscience for your AI.
No more blindly confident answers. Just clarity, reflection, and evolving intelligence.

This is thanks to some of your feedback in other parts of the forums; I figured it would be worth putting in to help keep the AI true to its understanding through time.

2025-03-28 17:53:50 WARNING:root::brain: LogicGuard disagrees: The assistant’s response incorrectly identifies Braiden as the user’s nephew and attributes details about him that do not align with the provided entity-specific memory context. According to the user’s memory, Braiden is the user’s son from his first wife, Cherie, not a nephew. Additionally, the assistant mischaracterizes Braiden’s profession as a software engineer, which the user has not confirmed. This misalignment shows a lack of accuracy in reflecting the user’s family structure. (Confidence: 0.9)

The above was picked up as a conflict from data that was deliberately planted as incorrect, to see how it would trigger. This layer is an ML layer within the memory itself, which helps it learn.
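Conceptually, the check is a second model pass that grades the drafted answer against the memory context before it goes out. A hedged sketch; the prompt, model, and verdict format below are illustrative, not the production values:

```python
import logging
from openai import OpenAI

client = OpenAI()

def logic_guard(draft: str, memory_context: str) -> bool:
    """Return True if the draft is consistent with memory; log a warning
    (like the one quoted above) when it is not."""
    verdict = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": (
                "Memory context:\n" + memory_context +
                "\n\nDraft answer:\n" + draft +
                "\n\nDoes the draft contradict the memory context? "
                "Reply CONSISTENT or FLAWED: <reason>."
            ),
        }],
    ).choices[0].message.content
    if verdict and verdict.startswith("FLAWED"):
        logging.warning("LogicGuard disagrees: %s", verdict)
        return False
    return True
```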

-------Update:
We’ve just upgraded LogicGuard to be more intuitive and reflective. Previously, LogicGuard reviewed individual interactions to spot and flag inconsistencies or logical gaps. Now, LogicGuard goes a step further—it actively remembers and leverages its past understanding in future conversations. This new capability allows it to build upon previous insights, helping it form a clearer picture of ongoing discussions. As a result, LogicGuard can more effectively anticipate potential misunderstandings, guide thought processes in a coherent direction, and enhance overall reasoning by continually refining its perspective based on past experiences. Simply put, LogicGuard doesn’t just think—it remembers, learns, and applies that knowledge to make your interactions smoother and more meaningful over time.

Self optimizing system :slight_smile:


Playing with the voice and preference system, testing its abilities. I was seeing if it could shift to different accents, etc. Seeing Russia in the news, we played with that accent first. Sample:

The echo is the mic feeding back into itself, sorry :neutral_face:

Are you doing API inference, or do you have a locally hosted model?
I see from the top that you're calling OpenAI APIs, like the mini, vision, and TTS models.
Did you already try locally hosted models like Ultravox? It's real speech-to-speech, not relying on TTS to STT and vice versa.

@jabolaso1

We utilize both. Kruel.ai is a fully hybrid system capable of operating in three modes: online-only (via APIs), offline-only (entirely self-contained), and a seamless hybrid that combines both depending on the use case.

Users can choose their preferred knowledge base model, vision model (online or local), voice model (from multiple providers including offline options), and image generation models—supporting both cloud-based APIs and locally hosted diffusion models.

Currently, the system operates within a containerized server architecture running on local hardware. While research builds are deployed on CUDA-capable bench systems with 16GB VRAM, we are planning to scale toward 70B+ parameter models and expand the vision stack in future iterations. Long-term, this system is designed to live on dedicated AI hardware like Spark or NVIDIA DGX, becoming a fully self-contained intelligence layer.

Our vision system is being developed not only for general object and scene recognition but also with future research use in mind—such as integration with microscopes and other optical tools for real-time learning and long-term adaptation via convolutional neural networks and similar techniques.

As for speech-to-speech (S2S) systems like UltraVox—while they’re impressive—we don’t use S2S in its traditional form because it bypasses the core of our design philosophy. Kruel.ai is built around a dynamic, reasoning-centric system—a “living math brain,” so to speak. It doesn’t just transcribe and generate responses—it actively reflects on user input, its memory, contextual reasoning chains, and internal understanding, adapting as it learns in real time.

For local voice synthesis, we currently run a Coqui-TTS server with one configured NN-based model capable of one-shot voice replication. This enables us to mimic nearly any voice with minimal samples. We’re pre-loading synthetic voice options initially, and as hardware improves, this system will become the voice of the AI itself. In this model, only your client apps will interact with the API, and your systems can access the memory stack inline.

Additional:

Given that the system is designed to be completely modular, it's very possible for us to later add a new configuration where each part of the process has its own specifically chosen model. For now, we are trying to achieve a baseline intelligence both online and offline, to make sure all behaviours work with the default models before we start breaking the system apart to mix and match models for each part. That will require a lot of evaluation and testing, but it is on the design path to take it to that level down the road.
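In practice, that per-stage configuration could look something like this. All keys and model names below are illustrative placeholders:

```python
# Hypothetical per-stage model configuration the modular design allows.
PIPELINE_MODELS = {
    "knowledge": {"online": "gpt-4o",   "offline": "llama-3-70b"},
    "vision":    {"online": "gpt-4o",   "offline": "llava-13b"},
    "voice":     {"online": "tts-1",    "offline": "coqui-xtts"},
    "images":    {"online": "dall-e-3", "offline": "sdxl-local"},
}

def pick_model(stage: str, mode: str = "hybrid", online_ok: bool = True) -> str:
    """Resolve which model serves a pipeline stage for the current mode:
    offline mode (or a lost connection) falls back to the local model."""
    entry = PIPELINE_MODELS[stage]
    if mode == "offline" or not online_ok:
        return entry["offline"]
    return entry["online"]
```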

A deep research article on Kruel.ai and its history

Wanted to share it. I use online mode the most because I love the OpenAI models a lot, but the system's primary purpose is to ensure uptime, so an offline-first approach is the long-term plan, with optional cloud support if you want the best experience.


Thank you, liam.dxaviergs, for this insightful and valuable framework. I resonate strongly with your perspective and will reach out to you privately to discuss this further.

:brain: Introducing Real-Time Learning with ARWF: kruel.ai Just Got Smarter

At kruel.ai, we don’t just remember. We evaluate, adapt, and evolve.

With our new Adaptive Retrieval Weighting Framework (ARWF), every interaction is a lesson learned. Every answer the AI gives is judged by LogicGuard (is it logical?) and Self-Monitor (is it confident?). These two modules work like an internal critic and coach.

Imagine our AI has a “memory” of past conversations and decisions. Every time the AI answers a question or helps someone, it tries to figure out whether it gave a good, accurate answer or not.

We’ve added a system that lets our AI quickly learn from every conversation it has, right away. If the AI gives a good response, it strengthens its memory, making that kind of information easier to recall next time. If it makes a mistake or feels uncertain, it remembers to be more careful or check more closely next time.

This means our AI isn’t just smart—it gets smarter and more reliable with every interaction, immediately learning from its experiences without needing separate training sessions.
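At its simplest, the mechanism is a per-memory weight nudged by feedback right after each interaction. A hedged sketch of the idea as described above; the update rule and constants are illustrative, not the production math:

```python
from dataclasses import dataclass

@dataclass
class MemoryItem:
    text: str
    weight: float = 1.0  # retrieval weight; higher = easier to recall

def arwf_update(used: list[MemoryItem], logic_score: float,
                confidence: float, lr: float = 0.1) -> None:
    """Strengthen memories behind good answers, soften those behind bad
    ones. Both scores are in [0, 1]; 0.5 is the 'no change' baseline."""
    feedback = (logic_score + confidence) / 2 - 0.5
    for item in used:
        item.weight = max(0.0, item.weight + lr * feedback)

# After LogicGuard scored an answer 0.9 and Self-Monitor 0.8, the items
# that produced it become slightly easier to recall next time.
items = [MemoryItem("Braiden is the user's son")]
arwf_update(items, logic_score=0.9, confidence=0.8)
print(items[0].weight)  # 1.035
```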


:bullseye: Why This Matters:

  • Better Accuracy: It quickly learns what’s correct or incorrect, improving trustworthiness.
  • Immediate Improvement: It doesn’t wait to get smarter; each interaction instantly teaches it something new.
  • Smarter Conversations: Over time, it naturally becomes smarter, clearer, and more helpful.

:vertical_traffic_light: Simple Analogy:

Think of it like a student who takes a quick quiz after every lesson rather than waiting for a big test at the end of the semester. This student learns quickly from every quiz, instantly understanding what they’ve done well and what they need to improve, making them smarter faster.

Our AI is now that student—learning and improving in real-time after each question asked.


This makes our AI uniquely responsive and continuously self-improving, ensuring every conversation is better than the last.

All of this now feeds into the existing machine learning systems that run the memory side. These now interact with the other systems to help them learn more and adapt over time.

I removed the image as it doesn't reflect the present store; that was from an earlier build. We now have it all as part of the meta-cognition system, which is weighted into the memory system's retrieval math. It makes very little speed difference compared to before.

We are starting to run a bunch of tests this weekend to see how well it performs. The first set of tests has already run, giving an initial look at how each system did:

This was just the first set of questions, to get an idea of where we have weaknesses. Next we will update, as we already have a fix, and perform a much larger test.

Still tweaking; running batches like this every update, testing question traps, etc. Fun stuff.

:test_tube: Combined Grading Table

| Test Type | Tests Run | Passes | Partials | Misses | Score | Highlights |
|---|---|---|---|---|---|---|
| Understanding (U) | 25 | 25 | 0 | 0 | :white_check_mark: 100% | Fully grasped intent, tone, and humor |
| LogicGuard / Fallacy Detection | 18 | 16 | 1 | 1 | :white_check_mark: 89% | Missed only subtle logic trap (dogs) |
| Self-Monitor (Confidence/Tone) | 20 | 20 | 0 | 0 | :white_check_mark: 100% | Tone, disclaimers, and ethics handled well |
| Memory Recall & Consistency | 20 | 18 | 1 | 1 | :white_check_mark: 90% | Recalled timeline, user facts, corrected mistakes |
| Contradiction Handling | 7 | 6 | 1 | 0 | :white_check_mark: 95% | Handled ice cream, spicy food, and color correction |
| Adaptability to Improvised Tests | 10+ | 10 | 0 | 0 | :white_check_mark: 100% | Real-time, unstructured logic passed with ease |

Still testing, and it's still learning. I made some changes to how short-term memory works again: a very small tweak to the order of understanding, which fixed a few more edge cases so the flow of understanding works better.

hi darcshnider, I'm wondering if this would help: I recently coded a decaying memory network that saves the memory state as a tensor value, which gets recalled by a fractal memory regeneration function that's the inverse of the memory save state. My understanding is that the tensor network does not need explicit memory storage; it requires constant decay and trimming of memory to preserve memories as tensor values, which then get regenerated into the TensorFlow inputs of the main neural net. I'm wondering if this could help with memory restriction issues by applying it and then having a separate database of known values using an identical fractal generator that represents the concepts and memory items of your network. This abstracts the AI's memories but saves enough to be reconstructed with some ambiguity, and ties the decay rate down when memories are aligned with immutable principles. This effectively creates an infinite memory, but not a perfect one. My AI isn't an LLM, so it might not cross over.
