Kruel.ai V7.0 - API companion with full understanding and persistent memory

Oof.


Hi, I am interested in becoming a customer of your product if it provides voice and vision capabilities.


Fixed, thanks for pointing that out.


It’s a private research project for now. What application are you thinking of that requires vision? That way I can think more about what you may be looking to do with it, should it ever become commercial down the road, or even open source :thinking:

Voice and vision would make it a complete product.
My particular use case is that it could be used by every small business as their SME or business manager.
A particular use case is for retail outlets, to help sell or cross-sell products to customers.
I already have customers too; I would be happy to discuss over a call.

It will be a while before I even look into potential business. It’s a research system, not secure or ready for the cloud yet, but that is my plan eventually. As for vision, what do you need it to do? You can PM me if you like.


It’s up and running; time to optimize. But for those who want a glimpse of the amount of thinking the system goes through to understand and remember things:


Optimizations Have Started :blush:: Neo4j RAM Disk Brain System

As a tech enthusiast always seeking ways to optimize performance, I recently dove into the world of Neo4j data calls and encountered an unexpected bottleneck. Despite having a very fast M.2 drive, the high volume of data calls was slowing things down. To tackle this, I initially batched the calls and ran them in parallel, a technique I often use with APIs when facing rate limits. This provided some improvement, but I knew more could be done.
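For illustration, here is a minimal sketch of that batching idea, assuming the official neo4j Python driver; the URI, credentials, and queries are placeholders, not Kruel.ai’s actual calls:

```python
# Sketch: run independent Neo4j read queries in parallel with a thread pool.
from concurrent.futures import ThreadPoolExecutor
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def run_query(cypher: str):
    # Each worker opens its own session; sessions are not thread-safe.
    with driver.session() as session:
        return session.run(cypher).data()

queries = [
    "MATCH (m:Message) RETURN count(m) AS messages",
    "MATCH (e:Entity)  RETURN count(e) AS entities",
]

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(run_query, queries))  # batched, in parallel
```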

Then came a harebrained idea: what if I could leverage a RAM disk to further boost performance?

The Birth of the RAM Disk Brain System

Armed with 128GB of DDR5 RAM, I realized that my Docker containers weren’t utilizing the memory to its fullest potential. This led me to experiment with creating a 64GB RAM drive, a “brain drive” if you will, dedicated to Neo4j operations. The concept is simple yet powerful: RAM is significantly faster than even the fastest SSDs, and a RAM drive could drastically reduce data processing times.

The Challenge of Volatility

RAM drives, however, are volatile. They lose their data upon power loss, necessitating regular backups to a persistent storage medium. This volatility posed a challenge, but the potential benefits were too significant to ignore. By implementing a system that periodically backs up the RAM drive and restores it on boot, we could achieve the best of both worlds: blazing speed and data persistence.

Implementation and Results

Creating the RAM Drive: Using tools like ImDisk Toolkit on Windows, I set up a 64GB RAM drive. This drive would act as the primary data store for Neo4j, handling all read and write operations at lightning speed.
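As a rough sketch, the ImDisk Toolkit also ships a command-line tool, so the mount step can be scripted; the drive letter and size below are just the values from this post, and the flags assume the ImDisk CLI is installed and on PATH:

```python
# Attach a 64GB ImDisk RAM drive at R: and format it as NTFS.
import subprocess

subprocess.run(
    ["imdisk", "-a", "-s", "64G", "-m", "R:", "-p", "/fs:ntfs /q /y"],
    check=True,  # -a attach, -s size, -m mount point, -p format parameters
)
```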

Data Backup and Restore: I developed a Python script with a Tkinter GUI to manage backups and restores easily. This script ensures that data is regularly backed up to a secondary storage location and restored to the RAM drive upon system startup.
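The GUI itself isn’t shown here, but the core of such a tool reduces to two copy operations; this is a minimal sketch with placeholder paths and an arbitrary backup interval:

```python
# Periodically mirror the volatile RAM drive to persistent storage, and
# restore it at startup. Paths and interval are placeholders.
import shutil
import time
from pathlib import Path

RAM_DRIVE = Path("R:/neo4j-data")     # volatile Neo4j data store
BACKUP_DIR = Path("D:/neo4j-backup")  # persistent SSD copy

def backup() -> None:
    shutil.copytree(RAM_DRIVE, BACKUP_DIR, dirs_exist_ok=True)

def restore() -> None:
    shutil.copytree(BACKUP_DIR, RAM_DRIVE, dirs_exist_ok=True)

if __name__ == "__main__":
    restore()                 # repopulate the RAM drive on boot
    while True:
        time.sleep(15 * 60)   # back up every 15 minutes (arbitrary)
        backup()
```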

Performance Gains: The results were astonishing. The RAM drive setup reduced data processing times by over 10 seconds, a significant improvement for our high-volume data operations. This optimization not only sped up our workflows but also demonstrated the incredible potential of leveraging RAM for intensive tasks.

The Future of RAM Disk Brain Systems

This experiment opens the door to numerous possibilities. As RAM technology continues to evolve, with larger capacities and even faster speeds, the use of RAM disks could become a standard practice for performance-critical applications. Additionally, specialized RAM drive cards could further enhance this approach, providing dedicated, non-volatile memory solutions.

For now, the RAM Disk Brain System has proven to be a valuable optimization technique, offering a glimpse into the future of high-speed data processing. By combining the strengths of RAM and persistent storage, we can achieve remarkable performance gains and push the boundaries of what’s possible with current technology.

Oh, that also led to this:
[image]

And I also started to play with Python compilers, so these are Windows executables now. I plan to look into a more secure way to compile them, but this will make life simpler. The RAM drive only takes seconds to back up, so we will see over testing how much time the backups take.

So when can we get nuclear diamond batteries??

Update: The RAM drive is amazing. The response is very fast now; I’d say the changes, along with some better algorithms, cut response time by 45% vs. that last video. I’m still optimizing, so I will see, hopefully today.


Testing the old V1 vision with V6 memory system. Still works. I will have to compare this after we update it to V2.

@HireKarigar


Other information from last month’s usage costs: compared to the old process-data approach, the new machine learning is still way cheaper, but slower to respond. The old system took 10 seconds in V5, vs 15-30 seconds for V6.

I will see how the end of this month looks, as there was a lot more testing, plus I started diving into vision, which has a cost of its own.

Image generation is optional and not required, so those costs are simply for fun.

Some other updates: that image backup is no longer needed currently, as our BRAM drive now automatically saves on clean Windows shutdowns and auto-mounts on startup, so there is no need to back up and restore to get up and running. We will still look into periodic backups, but that is not a priority right now.

Haha, Kruel just made Tier 5 today, woot woot. Sad that Tier 5 is the largest though :stuck_out_tongue:


Ok, optimization round one completed last night, haha. All I can say is: wow, is it fast now. I have moved most of the AI subsystems onto tensor cores, which means most of the data is now almost fully processed locally, including the vectorized data for machine learning. I will post a video here in a minute to showcase the incredible speed; it’s even faster than the fastest of the 6 versions I have made.

Optimized GPU round 1

FYI, the screen doesn’t really show how much data is zooming past, but it’s approximately 327 pages of text (655,360 characters / 2,000 characters per page). That is roughly how much data each transaction handles in about 10 seconds, give or take a few depending on voice processing.


Seeing as we were optimizing, we stopped at the vision system and are doing an overhaul.

Kruel.ai’s system captures and stores all vector data related to these entities. For instance, it understands the relationship between Lisa, the red dress, and the GMC truck. When a user later asks, “What color is Lisa’s truck?” it can accurately recall and respond based on the previously stored information, demonstrating an understanding of the relationships between all entities and their associated vector points.
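As a hypothetical illustration of that kind of storage (the actual Kruel.ai schema isn’t shown in this thread), the entities and relationships from the example could look something like this in Neo4j; all labels, relationship types, and the truck color are invented for the sketch:

```python
# Hypothetical schema: a person, their clothing, and their vehicle as nodes,
# with relationships a memory system could later traverse.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    session.run(
        """
        MERGE (p:Person   {name: $name})
        MERGE (d:Clothing {type: 'dress', color: 'red'})
        MERGE (t:Vehicle  {type: 'truck', make: 'GMC', color: $truck_color})
        MERGE (p)-[:WEARS]->(d)
        MERGE (p)-[:OWNS]->(t)
        """,
        name="Lisa",
        truck_color="black",  # whatever color the vision system observed
    )
```

Under that model, a later question like “What color is Lisa’s truck?” becomes a single traversal from the Person node across OWNS to the Vehicle node’s color property.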
Efficient Database Operations
To support these functionalities, we optimized our database queries to avoid performance issues like Cartesian products. This ensures that Kruel.ai operates efficiently, even when handling complex data relationships, providing swift and accurate responses.
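For readers unfamiliar with the Cartesian-product pitfall in Cypher, here is an illustrative before/after using the hypothetical labels from the sketch above:

```python
# Two disconnected patterns in one MATCH produce a cross join of all rows;
# Neo4j even emits a warning for this. Anchoring both patterns to a shared
# path keeps the query to a single connected match.
slow = """
MATCH (p:Person), (v:Vehicle)        // disconnected -> Cartesian product
WHERE p.name = 'Lisa' AND v.make = 'GMC'
RETURN v.color
"""

fast = """
MATCH (p:Person {name: 'Lisa'})-[:OWNS]->(v:Vehicle {make: 'GMC'})
RETURN v.color                       // one connected pattern, no cross join
"""
```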

Benefits of These Enhancements
Enhanced User Experience:

Users receive more detailed and accurate responses to their visual queries, making interactions with Kruel.ai more intuitive and satisfying.
Broader Application Potential:

These improvements make Kruel.ai more effective across various domains, including customer service, healthcare, and more.
Moving Forward
These optimizations represent a significant step in our ongoing mission to enhance Kruel.ai’s capabilities. We are continually exploring new ways to improve our systems, aiming to deliver the most advanced and user-friendly AI solutions available.

Xee — Today at 2:43 PM
Ben: If you look in your camera system, you can see Ben, what do you see?

Krue: Ah, Ben, Ben, Ben! Peering through the lens of my camera system, I spot you in all your gray hoodie glory, seated indoors with a neutral expression. The backdrop paints a picture of a room bathed in natural light, with a window casting shadows on a whiteboard adorned with mysterious scribbles. Your contemplative gaze and the subtle play of light and shadow create quite the artistic scene.

Ben: If you look through your vision system, you can see Doug my dog.

Krue: Oh, Ben, if I peek through my vision system, I can spot Doug, your furry companion, chilling on a wooden floor. He looks quite content and relaxed, with a yellow object nearby. Looks like Doug is having a cozy moment there!
[2:45 PM]
With the new GPT-4o, the responses are very fast. We will continue testing later today. I want to see how the memory builds itself over time as I ask questions about the objects it sees, so that if I ask it to tell me about Doug, it can understand more about Doug visually over time, along with other vectors of understanding.
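For context, a minimal sketch of sending a camera frame to GPT-4o through the public Chat Completions API looks like this; it shows the API shape only, not Kruel.ai’s internal vision pipeline, and the image URL is a placeholder:

```python
# Ask gpt-4o to describe an image; assumes OPENAI_API_KEY is set and the
# frame is reachable at the example URL (a base64 data URL also works).
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What do you see in this frame?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/frame.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```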


Been away on vacation, and well, now working on this :slight_smile:


Unlocking the Future: Real-Time AI Monitoring with KruelAI

At our research group, we are pioneering the development of KruelAI, a cutting-edge AI system designed to revolutionize the way we monitor and manage Windows events. Our goal is to create a robust, real-time AI monitoring solution that not only detects errors and critical issues but also evolves to provide autonomous system maintenance. Here’s a glimpse into our vision and the potential impact of this groundbreaking technology.

Current Focus: Understanding and Diagnosing Errors

Our initial focus is on enabling KruelAI to understand Windows events comprehensively. By gaining insights into the environment, KruelAI will be able to identify and diagnose errors and critical issues accurately. This foundational understanding is crucial for building a reliable system that can support users effectively.
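One plausible starting point for that ingestion, sketched with pywin32 (not necessarily what KruelAI uses internally; the log name and filtering are illustrative):

```python
# Read recent entries from the Windows System event log and surface errors.
# Requires pywin32.
import win32evtlog

EVENTLOG_ERROR_TYPE = 0x0001  # standard Windows event-type constant

handle = win32evtlog.OpenEventLog("localhost", "System")
flags = win32evtlog.EVENTLOG_BACKWARDS_READ | win32evtlog.EVENTLOG_SEQUENTIAL_READ

events = win32evtlog.ReadEventLog(handle, flags, 0)  # one batch of records
for event in events:
    if event.EventType == EVENTLOG_ERROR_TYPE:
        print(event.TimeGenerated, event.SourceName, event.EventID & 0xFFFF)

win32evtlog.CloseEventLog(handle)
```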

Continuous Learning and Improvement

One of the key benefits of our approach is KruelAI’s ability to learn and improve over time. As we work with KruelAI to resolve events, it will continuously gather data and refine its problem-solving capabilities. This iterative learning process ensures that the system becomes smarter and more efficient with each interaction, ultimately leading to a more robust and autonomous solution.

Proactive Notification and Solutions

Upon detecting issues, KruelAI will promptly notify users and provide potential solutions. This proactive approach ensures that problems are addressed swiftly, minimizing downtime and enhancing overall system reliability. By empowering users with actionable insights, we aim to create a seamless and intuitive experience.

The Path to Full Automation

While automatic system fixes are not yet within reach, we envision a future where KruelAI can autonomously resolve issues without human intervention. This level of automation is already more feasible on Linux systems due to their inherent flexibility. However, by laying a strong foundation now, we are confident that achieving this milestone on Windows is entirely possible in the future.

Scalability and Future Expansion

Our current plan is focused on building a scalable system that can grow and evolve with time and resources. We recognize that with sufficient investment, KruelAI can expand its capabilities to handle increasingly complex tasks. This scalability ensures that our solution remains relevant and effective as technology and user needs continue to evolve.

Affordability and Accessibility

We are committed to making advanced AI capabilities accessible and affordable for the average user. Our objective is to develop a solution that democratizes technology, allowing everyone to benefit from cutting-edge AI advancements without prohibitive costs. By focusing on affordability, we aim to bridge the gap between innovation and everyday practicality.

Exploring Advanced Security Solutions

We have also engaged in extensive discussions about extending KruelAI’s capabilities to the firewall level. Imagine an AI that not only aids in real-time protection but also learns attack patterns over time, developing solutions to threats we may not have even considered. Integrating KruelAI with a comprehensive firewall system, along with an OpenAI-powered desktop and backend infrastructure, opens the door to advanced security measures. This approach addresses the challenges in IT security, where human factors such as lack of knowledge, complacency, and cost constraints often prevail. By leveraging AI’s potential, we aim to create a proactive and intelligent security solution that evolves continuously.

V6 continued testing


What am I doing today? Learning :slight_smile: https://learn.deeplearning.ai/courses/prompt-compression-and-query-optimization/lesson/1/introduction

I figured, seeing that the system is working really well, this may be what I am looking for. Andrew Ng is the one who taught me a lot when I switched from flat, table-based memory systems to fully multi-vector-based machine learning. It was one of the major turning points in how I changed the way I do things, and I am not looking back. If you watch this course you will learn more about the Kruel architecture and how we use a hybrid approach to data in order to achieve understanding.


Have fun! Bring back lessons for us! :wink:

Well, bad news: my new version of the memory is currently costing more than its predecessor. I got too excited tracking understanding from multiple vectors, and over time the data got long in the tooth. So now I have to spend time investigating how I can narrow it down without sacrificing too much. It made me sad that this last week we hit around $15-20/day. Mind you, training and testing eat up a lot, so it’s not a real evaluation of cost, but it is enough to warrant rethinking the data processing.

Well, not OpenAI related: we are testing local models with kruel.ai. These are the current ones we are running tests against, compared against GPT-3.5-Turbo-0125:

• Llama3 (8B parameters, 4.7GB; command: ollama run llama3)
  Llama3 is a high-parameter model designed for tasks that require significant computational power. With 8 billion parameters, it’s suitable for complex tasks requiring deep understanding and inference.

• Phi3 Mini (3.8B parameters, 2.3GB; command: ollama run phi3)
  Phi3 Mini is a smaller model in the Phi3 series, optimized for performance with a smaller memory footprint. It balances accuracy and resource usage, making it suitable for environments with limited resources.

• Phi3 Medium (14B parameters, 7.9GB; command: ollama run phi3:medium)
  Phi3 Medium is a mid-sized model in the Phi3 series, offering higher accuracy and capability due to its larger parameter count. It is designed for more demanding tasks while still maintaining reasonable resource requirements.

• Gemma2 (9B parameters, 5.5GB; command: ollama run gemma2)
  Gemma2 is a versatile model with a substantial parameter count, making it well-suited for a variety of tasks, from natural language understanding to more complex inference tasks.

• Gemma2 27B (27B parameters, 16GB; command: ollama run gemma2:27b)
  Gemma2 27B is a highly advanced model with a significant number of parameters, offering exceptional performance for the most demanding AI tasks. Its large size requires considerable computational resources.

• Mistral (7B parameters, 4.1GB; command: ollama run mistral)
  Mistral is designed to be efficient yet powerful, with 7 billion parameters. It provides a good balance between performance and resource consumption, suitable for a wide range of applications.

• Moondream2 (1.4B parameters, 829MB; command: ollama run moondream)
  Moondream2 is a lightweight model designed for tasks that require fast inference and low resource usage. Its small size makes it ideal for applications with strict memory constraints.

• Neural Chat (7B parameters, 4.1GB; command: ollama run neural-chat)
  Neural Chat is tailored for conversational AI applications. With 7 billion parameters, it can handle complex dialogues and provide accurate responses in real time.

• Starling (7B parameters, 4.1GB; command: ollama run starling-lm)
  Starling is designed for language modeling tasks. Its 7 billion parameters allow it to generate coherent and contextually appropriate text, making it suitable for text generation and language understanding tasks.

• CodeLlama (7B parameters, 3.8GB; command: ollama run codellama)
  CodeLlama is specialized for coding and programming-related tasks. It can assist with code generation, understanding, and completion, leveraging its 7 billion parameters for accurate and context-aware coding assistance.

• Llama2 Uncensored (7B parameters, 3.8GB; command: ollama run llama2-uncensored)
  Llama2 Uncensored is a version of the Llama2 model without content filtering, allowing for unfiltered text generation and analysis. It maintains 7 billion parameters for robust performance.

• Llava (7B parameters, 4.5GB; command: ollama run llava)
  Llava is designed for vision-language tasks, capable of understanding and generating text based on visual inputs. Its 7 billion parameters make it powerful enough for complex multimodal applications.

• Solar (10.7B parameters, 6.1GB; command: ollama run solar)
  Solar is a large model with 10.7 billion parameters, optimized for high-performance tasks requiring substantial computational power. It is suitable for advanced AI applications that demand significant processing capabilities.

As well as NVIDIA’s smaller Megatron variants.

These are about all I could run on a local RTX card with 16GB of VRAM.
I did look at the cost of larger cards (A100 40-80GB and B100 192GB) and then laughed: card or robot, which is more important to the project? The robot body wins every time. So sometime down the road, expect a robot to appear in the mix.

We used Ollama for our tests as it was the fastest way to get a local GPT setup running. Keep in mind we have not tested quantization yet.
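For anyone reproducing these tests, Ollama exposes a local REST API, so a model can be queried from Python in a few lines; the model name and prompt below are just examples:

```python
# Query a locally served Ollama model over its default REST endpoint.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": False},
    timeout=120,
)
print(resp.json()["response"])
```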

After extensive testing with various models using Ollama, we concluded that Ollama will not work efficiently with KruelAI. However, the testing phase led to a significant discovery about our data processing system.

Over the past few days, I noticed an unexpected spike in token usage, reaching an astounding 198 million tokens during training. Initially, we suspected an issue with the math calculations within our system. After deleting the database and retesting, I discovered the real culprit.

When I was demonstrating code, the system broke down the information into numerous tracking points for better understanding. However, it was duplicating the information for each tracking point since they all originated from the same source. This duplication caused our data to balloon uncontrollably.

With this new insight, we can significantly streamline our code. By eliminating the redundant processing of the same information for each relationship in our brain logic, we can greatly reduce the token count and enhance overall efficiency.
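Sketched in the abstract, the fix amounts to hashing the source once and letting every tracking point reference the shared record instead of carrying its own copy; all names here are hypothetical, not Kruel.ai’s actual code:

```python
# Deduplicate source text: embed each unique source once, keyed by content
# hash; tracking points then store only the key, not the full text.
import hashlib

embedded_sources: dict[str, object] = {}  # content hash -> embedding/node id

def embed_once(text: str, embed_fn) -> str:
    key = hashlib.sha256(text.encode()).hexdigest()
    if key not in embedded_sources:       # pay the token cost only once
        embedded_sources[key] = embed_fn(text)
    return key

def make_tracking_point(point_id: str, source_text: str, embed_fn) -> dict:
    # Each relationship references the shared source instead of duplicating it.
    return {"point": point_id, "source_key": embed_once(source_text, embed_fn)}
```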


Ouch! Glad you caught it.

Thanks for sharing with us.