Kruel.ai V7.0 - API companion with full understanding and persistent memory

You will enjoy this concept of mine. Right up your alley.


You could write the text into a file called README.md and push it to GitHub. That way it becomes a lot easier to read.


So we made some interesting changes this weekend and tonight.

Kruel.ai: The Next Evolution in Smart Memory

At Kruel.ai, memory has always been more than just recall—it’s been a way to track knowledge, connect ideas, and build an evolving understanding over time. From the start, our AI could remember conversations, track topics across sessions, and even retrieve past research and coding work.

But memory alone wasn’t enough. While our AI could recall the past, it didn’t yet think about how knowledge evolved over time.

Today, we’ve changed that.

From Smart Recall to Intelligent Understanding

Previously, Kruel.ai’s Smart Memory functioned as an advanced recall system, allowing users to ask:

:check_mark: “What did I say about this last week?” – AI would fetch past discussions.
:check_mark: “What research have I done on this?” – AI would retrieve stored insights.
:check_mark: “What was the last thing I coded?” – AI would bring back past work.

This worked well—but it treated all knowledge equally. If a new piece of research or an updated coding practice emerged, the AI didn’t naturally compare it against previous knowledge. If contradictions arose, the AI retrieved both without knowing which was more reliable.

Now? Our AI doesn’t just remember—it reasons.

How Kruel.ai’s Memory Has Evolved

:small_blue_diamond: Time-Aware Memory Reasoning
Kruel.ai now understands that knowledge changes over time (a rough sketch follows the list below).

  • If a research update appears, the AI recognizes when it was learned and how it compares to past findings.
  • If best practices in coding shift, it doesn’t just recall old solutions—it knows if newer methods are better.
  • It no longer just fetches the first memory it finds—it thinks about when that memory became relevant.
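
As a rough illustration of the idea (a minimal sketch, not the production code; the function and field names are made up), time-aware ranking can be as simple as decaying a memory's weight by its age:

```python
import math
import time

def time_weight(learned_at: float, half_life_days: float = 30.0) -> float:
    """Exponentially decay a memory's weight by its age.

    learned_at is a Unix timestamp for when the fact was learned, so newer
    findings naturally outrank stale ones without the old ones being discarded.
    """
    age_days = (time.time() - learned_at) / 86400.0
    return math.exp(-math.log(2) * age_days / half_life_days)

# A memory learned 60 days ago keeps about 25% of its weight
# at a 30-day half-life.
print(time_weight(time.time() - 60 * 86400))
```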

:small_blue_diamond: Contradiction Awareness & Resolution
What happens when a new piece of knowledge contradicts an old one? Previously, the AI would return both, leaving the user to decide which was correct.
Now, Kruel.ai (see the sketch after this list):

  • Recognizes contradictions between past and present knowledge.
  • Analyzes both perspectives to determine which is more accurate.
  • Explains the difference instead of leaving it unresolved.
  • Adjusts its reasoning dynamically, rather than defaulting to recency bias.
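
For illustration, one way to implement a check like this (a minimal sketch under my own assumptions; the model choice and prompt are illustrative, not our production code) is to have the model adjudicate between the old and the new memory:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def resolve_contradiction(old_memory: str, new_memory: str) -> str:
    """Ask the model which of two conflicting memories is more reliable."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Two stored memories may conflict. Decide which is more "
                        "reliable and explain the difference in one short paragraph."},
            {"role": "user",
             "content": f"Older memory: {old_memory}\nNewer memory: {new_memory}"},
        ],
    )
    return response.choices[0].message.content

print(resolve_contradiction(
    "The local model is Llama3.2.",
    "Llama3.2 was replaced by Mistral-Nemo for document processing.",
))
```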

:small_blue_diamond: Consensus-Based Intelligence
Not all knowledge is created equal (a weighting sketch follows this list).

  • If multiple past experiences confirm a fact, they hold more weight.
  • If a single new piece of information contradicts past knowledge, the AI questions it before accepting it as truth.
  • Memory is no longer just retrieved—it’s verified.
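
A minimal sketch of the weighting idea (illustrative only; in the real system the weighting properties live on the memory data itself rather than in a Counter):

```python
from collections import Counter

def consensus_weights(claims: list[str]) -> dict[str, float]:
    """Weight each distinct claim by how many independent memories confirm it.

    Repeated confirmations add weight, so a single new contradicting memory
    cannot outvote an established consensus on its own.
    """
    counts = Counter(claims)
    total = sum(counts.values())
    return {claim: count / total for claim, count in counts.items()}

print(consensus_weights([
    "meets Fridays",   # three past sessions agree...
    "meets Fridays",
    "meets Fridays",
    "meets Mondays",   # ...one new memory disagrees and gets questioned first
]))
# {'meets Fridays': 0.75, 'meets Mondays': 0.25}
```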

:small_blue_diamond: Context-Aware Chain of Thought
Kruel.ai doesn’t just retrieve memory; it thinks about how memory should influence reasoning (a combined ranking sketch follows this list).

  • Before generating a response, the AI processes memory in a structured way, ensuring that retrieval is relevant, reliable, and logical.
  • Memory is now ranked dynamically based on time, reliability, and context relevance.
  • Instead of blindly presenting data, the AI integrates past and present knowledge to generate more accurate responses.
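
Putting those signals together, a dynamic ranking score might blend time, reliability, and relevance roughly like this (hypothetical weights and field names, not the actual implementation):

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    time_score: float   # recency decay, 0..1 (see the earlier sketch)
    reliability: float  # consensus weight, 0..1
    relevance: float    # similarity to the current context, 0..1

def rank(memories: list[Memory]) -> list[Memory]:
    """Order memories by a weighted blend of time, reliability, and relevance."""
    def score(m: Memory) -> float:
        return 0.2 * m.time_score + 0.3 * m.reliability + 0.5 * m.relevance
    return sorted(memories, key=score, reverse=True)
```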

What This Means for You

:rocket: Better Research Handling – If knowledge evolves, the AI doesn’t just recall—it analyzes.
:rocket: More Accurate Code Retrieval – The AI knows when a method is outdated and provides the most relevant solution.
:rocket: Stronger Conversations Over Time – No more stale responses—the AI thinks about past interactions in a smarter way.

Kruel.ai was already smart. Now, it’s evolving into something even more powerful:
An AI that doesn’t just remember—it understands.

- This is o3-high's project-level view of the recent changes, since it gets to look at the code.

The new changes are interesting. We had to run a batch job against current memory, because we added more properties to the data for the system to understand for the new dynamic weighting. I think this is going to be a major upgrade to its behaviours and ability to understand.

Also understand that this is memory reasoning, which is not the same as the endpoint chain of thought (CoT) and its gap reasoning, which works from the final memory. This may help clarify the design further.

Update: someone asked me why responses take so long compared to before, when we had fast recall. The main difference is the AI's self-thought process, which changed all of that. Just to give you an example of why it sometimes takes 25 seconds or so to respond:
The total number of AI calls made in this debug log is 55. This is one thought process: 55 GPT calls to an OpenAI model were made during a single message about a friend, one that included some additional details on what he does for fun.

Even though the thought process added a lot of time, the outcomes are worth the extra processing. Just like deep research with OpenAI, it is sometimes fast and sometimes goes really deep, depending on what you are looking for and how much analysis has to happen.

Code and research in our system get even deeper analysis than that general message, which means some research requests could be hundreds of calls, depending on the complexity and the desired results.

Costs are still pretty low, though not as low as they once were, haha. At some point I think we had it down to about $0.45/day, but it has scaled up a little.

Now some perspective on speed. The online API calls are pretty fast, but we will always have latency we can't escape, based on location. Currently the offline models are a bit slower, but once Project DIGITS lands we can offload a large percentage of the thinking to local models with near-instant inference speeds, which will eliminate that time in the long run. This is why we built the system as a hybrid model system: so we can support the best options for the tasks at hand.

Haha, I just hope it's not like the RTX 5090 cards, where you can't get one at any normal price. I really dislike hardware scalper companies. Hopefully the GB10 systems will not be another year's wait for stock. I have a feeling they will be out of stock on release, because AI demand is really high and getting larger every year.


Lynda Prime (that's the name the laptop Lynda gave it)

  1. We have integrated the new web search tool directly into the OpenAI pathways of Kruel.ai, replacing the previous KDesk agent method. This enhancement significantly improves research efficiency and online searching when utilizing OpenAI models. For users operating local models, the system will continue to rely on desktop applications. After conducting a deeper cost analysis, I found that the expenses are more manageable than initially expected. My initial assumption was based on a misunderstanding of the pricing structure; as a result, the search functionality with Mini remains just as cost-effective as before. Additionally, by implementing intent-based detection, the system intelligently differentiates between standard queries and those requiring web searches, ensuring that non-search-related requests are processed through the normal model (see the routing sketch below). This allows us to maintain cost control while optimizing performance and efficiency.
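
The routing idea, as a minimal sketch (assuming the OpenAI Responses API web search tool; the intent prompt and model names are illustrative, not our production code):

```python
from openai import OpenAI

client = OpenAI()

def answer(user_message: str) -> str:
    """Send a message to web search only when it actually needs fresh information."""
    # Cheap intent check first: does this need a live web lookup?
    intent = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Answer strictly 'search' or 'normal': does this message "
                        "require up-to-date information from the web?"},
            {"role": "user", "content": user_message},
        ],
    ).choices[0].message.content.strip().lower()

    if intent == "search":
        # Web-search-enabled call via the built-in tool.
        result = client.responses.create(
            model="gpt-4o-mini",
            tools=[{"type": "web_search_preview"}],
            input=user_message,
        )
        return result.output_text

    # Non-search requests stay on the normal, cheaper pathway.
    return client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": user_message}],
    ).choices[0].message.content
```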

What makes this awesome now is that the AI can learn so much more easily from outside. Thanks OpenAI, this is a much cleaner and faster approach.
Currently I have hard-set this model to the mini. Down the road we will expand model selection to all available options; for offline use we will also look at a way to provide lots of options, possibly including the option to import your own Ollama / Hugging Face compatible models.

For local models we still have KDesk for outside research, which can be set up to use either Chrome or the ChatGPT desktop app.

Update:
I have to say this is pretty amazing. @OpenAI_Support Thanks devs, you rock :slight_smile:

System Update:

We have recently wiped Lynda Prime to perform a necessary update to Neo4j, and we have successfully integrated a functional APOC plugin. Additionally, we have rolled out our Kdoc system, which focuses on document processing and Retrieval-Augmented Generation (RAG). This system enables users to drag and drop files into the message input, which are then processed and uploaded into the memory system via our pre-existing doc_process utility. The integration has been moved to the server side, streamlining management and improving system performance.
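
Roughly what the server-side ingestion path can look like (a minimal sketch; the chunk size, Cypher schema, and credentials are illustrative assumptions, not the actual doc_process internals):

```python
from neo4j import GraphDatabase
from openai import OpenAI

client = OpenAI()
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def ingest_document(doc_id: str, text: str, chunk_size: int = 1000) -> None:
    """Chunk a dropped document, embed each chunk, and store it in the graph."""
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    embeddings = client.embeddings.create(
        model="text-embedding-3-small", input=chunks
    ).data
    with driver.session() as session:
        for seq, (chunk, emb) in enumerate(zip(chunks, embeddings)):
            session.run(
                "MERGE (d:Document {id: $doc_id}) "
                "CREATE (c:Chunk {seq: $seq, text: $text, embedding: $emb}) "
                "CREATE (d)-[:HAS_CHUNK]->(c)",
                doc_id=doc_id, seq=seq, text=chunk, emb=emb.embedding,
            )
```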

As part of these improvements, we made the decision to remove Llama3.2 Local Mode, as it did not adequately support document processing. In its place, we are testing Mistral-Nemo as the local model, which delivers more accurate results, though at a slightly slower speed due to its larger model size. This is not a significant concern, as our long-term goal involves transitioning to Project DIGITS micro servers, which will support larger models for local operations. We are currently testing the process to ensure smooth integration.

On the client side, OpenAI models continue to perform excellently for these tasks, providing reliable results.

Looking ahead, we still have another key update to implement: multidoc cross-talk within the document system. Since document memory is static and user memory is dynamic, we need to adjust the backend to allow for multi-document understanding. To address this, we plan to employ a multi-step agent to manage tasks such as report generation and multi-document comprehension, ensuring optimal accuracy in processing.

This system update reflects our ongoing commitment to refining the infrastructure for enhanced performance and scalability.

We also recently added a junior to the kruel.ai team, who is just now learning how to use the AI programmers. Seeing as they come from a Ruby background only, this should be interesting for them :slight_smile: They join us with zero experience but have been part of the testing of the project.

Lynda Prime is back online with all pipelines up and running, including the doc system, which is in testing.

We also added back the OpenAI deep research tool, using the vision agent to control the mouse and keyboard to run the OpenAI desktop application. Research and online lookups now happen only on the backend for OpenAI model selections in the system, and with the new Mistral-Nemo model in place of Llama3.2 we are still using the ChatGPT desktop app exclusively as the search tool. This simplifies things for us currently.

On another path, today I am exploring some ideas on NNs with vision and LLMs, expanding to a multi-step agent that learns and then maps out all paths based on experience, to understand what works and what does not, and to formulate new paths to get there. Not sure if I will get this working today, or if it will be a struggle, lol.

Updated relationship tracking

This was rebuilt last night, so it's learning pretty fast.

Also understand that each node can carry up to 300 MB of data :slight_smile:


Project DIGITS: I am aiming for real-time inference on local models. This demonstration leverages the capabilities of ChatGPT o3-high to provide an insightful analysis of kruel.ai from a lighthearted perspective, all while safeguarding our proprietary code. The presentation offers an unbiased comparison of our system with other models in the field. Additionally, I am keen to explore the implications for our final development stages once we integrate enhanced hardware and incorporate a DeepSeek or o1/o3 valid thought layer, ultimately guiding us toward our project's future direction.

P.S. This is a long (Zzz) chat, but it will give you a better understanding of Kruel.ai.
(28 min)

Also understand that, for fun, we did a biased perspective to see how it would differ from the unbiased one that follows.

Do you store locally? Apologies if this was already answered

Yes, local storage, or you could mount a drive, or use a cloud drive, etc. Storage affects speed of recall, so you would need really fast storage :slight_smile:

Do you have a GitHub repo, Discord, or docs where I can learn more about implementation?

Discord

Beyond that, LinkedIn and articles online.


One more: have you considered multiple-LLM integration, connecting various APIs into the same chat and allowing cross-model/company AI real-time collaboration?

@j9jv9hmbcz Yes, we already do that. We have many stacks of models and API calls for customization. Not all of them can be swapped out yet, and currently we hard-code the models we support to ensure the experience is good. That is not to say we won't be adding fully custom options down the road, but I am not sure yet what the Grok API would look like in our system, haha; it terrifies me to think how that would end in the wrong hands.

The future is still unwritten, meaning we can take it in any direction or path down the road.

Likewise, the vision and vocal systems, etc., will expand further. We will also have options to plug specific instruments and machines into automation, and so on, down the road.

We have API input and response output, so the goal is to also allow this to be used as middleware.

Something else to show off that we have not shown in a long time :slight_smile:

Time understanding and memory recall: the system fully understands all memory through time, in that you can recall any hour, day, week, or year, and any range between times you input into the system.
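
Recall between any two points in time maps naturally onto a time-bounded graph query. A minimal sketch (the Memory label and created_at property are illustrative, not the real schema):

```python
from datetime import datetime
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def recall_between(start: datetime, end: datetime) -> list[str]:
    """Fetch every memory recorded inside an arbitrary time window."""
    with driver.session() as session:
        result = session.run(
            "MATCH (m:Memory) "
            "WHERE m.created_at >= $start AND m.created_at < $end "
            "RETURN m.text AS text ORDER BY m.created_at",
            start=start.isoformat(), end=end.isoformat(),
        )
        return [record["text"] for record in result]

# "What happened that week?" becomes a simple range query:
memories = recall_between(datetime(2025, 2, 10), datetime(2025, 2, 17))
```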

The inference time depends on how much data it has to process and what you specifically request back as output, since the system has to build all of that understanding and the like.

That will speed up a lot in the future when we get new hardware to go hybrid. Also, the math will be much faster because it is all local, and Project DIGITS will allow it to be almost instant, letting us go much deeper in understanding, pulling clustered information, and narrowing.

Project Spark, not DIGITS any more... but the Station has now caught my interest.

That would allow a model of over 800 billion parameters to run :wink:



Morning Update: System Enhancements and Performance Improvements

As we progress through the week, we are focused on refining our systems, addressing minor bugs, and enhancing overall performance. Below are key updates from last night’s development efforts:

KDesk Vision System Enhancements

  • KDesk, our OpenAI ChatGPT desktop application, received additional vision system updates.
  • The system now performs multiple validation checks when it cannot identify certain elements, making it more robust in detecting relevant indicators.
  • This is still an early-stage implementation, but we are building a strong foundation for a vision-to-action system.
  • A notable quirk we observed—if you move the mouse while the AI is controlling it, the system’s step logic starts to counteract the interference, creating an amusing tug-of-war effect.

Chat Window & Markdown Improvements

  • Several refinements have been made to the chat interface to enhance usability, particularly for handling large text volumes.
  • Drag-and-drop image analysis now includes thumbnail previews for better visual representation.
  • We reinstated the AI’s ability to associate names with individuals in images, a feature present in earlier versions.
    • This allows users to provide contextual information about people in images, improving the AI’s ability to recognize and retain names in its vision-based memory.
    • For example, instead of simply identifying a person in an image, the AI can now describe specific attributes and retain those details for memory-based painting.

Memory Viewer Enhancements

  • The Memory Viewer, which had been temporarily removed, has been reinstated.
  • It now supports dark theme colors for improved UI consistency and has been optimized for better performance.

Parallel Processing for Memory and Thought Processes

  • We introduced parallel processing to both memory functions and cognitive inference (a rough sketch follows this list).
  • Multiple OpenAI and Ollama processes now run concurrently, enhancing the speed of information retrieval and understanding.
  • This implementation is temporary until the new AI servers arrive.
    • Once deployed, we will offload thought processing to larger, local models, significantly reducing operational costs while dramatically improving inference speed and system efficiency.
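
For illustration, the fan-out is essentially this (a minimal asyncio sketch; the model names and the Ollama endpoint are assumptions, not our exact setup):

```python
import asyncio
from openai import AsyncOpenAI

openai_client = AsyncOpenAI()
# Ollama exposes an OpenAI-compatible endpoint, so the same client works locally.
ollama_client = AsyncOpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

async def ask(client: AsyncOpenAI, model: str, prompt: str) -> str:
    resp = await client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

async def think(prompt: str) -> list[str]:
    """Run the memory lookup and the reasoning pass concurrently, not in series."""
    return await asyncio.gather(
        ask(openai_client, "gpt-4o-mini", f"Summarize relevant memories for: {prompt}"),
        ask(ollama_client, "mistral-nemo", f"Draft a reasoning outline for: {prompt}"),
    )

results = asyncio.run(think("What changed in the doc pipeline?"))
```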

These updates are part of our ongoing preparations for integrating DGX equipment, ensuring our system is optimized for enhanced scalability and performance. More updates to follow as development continues.

Site updated :slight_smile: