V5.0 - API companion with full understanding, running 16k thanks to an advanced memory system

Some updates: we decided to continue the project. Considering the system scales with GPT-4o, and we still have a lot more functions than ChatGPT, it makes sense to continue…

As for our current testing, take a look at the jump from the old V5 memory to the new test memory over the last week and how efficient it's getting at pulling data :slight_smile:

We gutted the system starting at the message level and reworked the whole memory structure and its relationships to make it far more optimized. GPT-4o has helped a lot more than GPT-4 in solving some of the earlier issues we got stuck on. Here is a new look at the basic structure, which is much better optimized for machine learning and for finding the data ahead of processing, making the calls smaller than ever before. We are still running GPT-3.5 as the base model to keep costs down, with the understanding that we can scale it up.

We now have relationships between all data points for tracing information fast, and our time clouds now work properly with real-time updates, keeping the system fully up to date over time. A few things are missing in this shot: the embeddings were temporarily removed and will be refactored back in once this part is stable.


Wowzers, incredible effort. I applaud and empathize with your situation. I’ve been there before. In a way, I guess the entire human civilization is now where you are.

Quick question:

I just had gpt-4o output an entire code base for a relatively long-range task that spanned many prompts. It did it in steps over about ten prompts, asking me if I would like to continue generating the code, and me just saying 'yes' - slowly evolving the files to be more and more featureful according to my initial spec.

I’m sure there will be plenty of issues, but the skeleton of what it has provided is very impressive given the size of the task’s scope.

This is a rather massive capability jump and I am a bit surprised I hadn’t read about this at an earlier point. Does anyone know when this feature was added? Was it before the memory release? Any relevant threads would be greatly appreciated.

Well, you have a choice: the API does not have that limit if you build things right for your programming workflow, but it costs more than using GPT-4o in the ChatGPT app, so it's usually best to stick with that. And yes, even though you have to hit continue, the fact that it can still do it is the impressive part. I'm not sure if you can instruct the AI to simply continue on its own - that's something I should look at, haha.

Perhaps it’s just the large context and it was refeeding it all. The code is actually pretty bad, unfortunately, and now I am going through the painful task of fixing and cleaning it up.

Maybe it would have been better to just implement this from scratch.

AFAICT, GPT is completely unable to write maintainable, testable code (e.g., dependency injection).

Instructions for ChatGPT are very important; you can customize your AI using rules to improve its abilities. That's key, IMO, to making it build better. I have custom instructions for all the AIs I use for that reason: the base version, while really good, needs narrow focus, and much like any pattern logic it needs better instructions to produce better results.

You could try a recursive process where the unit tests are written first, and then the model has to try to write the code that passes them.

Ideally you would instruct the model to build the interface first. The function signatures. Separating concerns.

Then ask it to flesh them in.
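The loop described above can be sketched in a few lines. This is a minimal illustration, not anyone's actual workflow, and the function names are hypothetical:

```python
# Sketch of the "interface first, tests first" workflow; names are hypothetical.

# Step 1: ask the model for signatures only, separating concerns, e.g.:
# def load_config(path: str) -> dict: ...
# def normalize_scores(scores: list[float]) -> list[float]: ...

# Step 2: write the unit test against the signature before any body exists.
def test_normalize_scores() -> None:
    assert normalize_scores([2.0, 2.0]) == [0.5, 0.5]

# Step 3: ask the model to flesh the body in until the test passes.
def normalize_scores(scores: list[float]) -> list[float]:
    """Scale scores so they sum to 1.0."""
    total = sum(scores)
    return [s / total for s in scores]

test_normalize_scores()
```

The point of the ordering is that the test pins down the behavior before the model is allowed to improvise an implementation.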

Python, IMO, will always be the worst language for it. A lot of implicit information gets passed around that a compiler could otherwise work with, but the model needs things to be explicit.

So TypeScript, or even Rust, would give better results IMO. Rust in particular is boilerplatey, but hell, that's no issue for an LLM. Its greatest weakness is the lack of training data.

Avoid classes when possible. They are hard to reason about without running the code. Pure functions work best, with each one having a unit test.
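A tiny contrived contrast makes the point: the stateful version's result depends on hidden history, while the pure version can be pinned down by a single assertion.

```python
# Stateful class: behavior depends on accumulated hidden state.
class Counter:
    def __init__(self) -> None:
        self.total = 0

    def add(self, n: int) -> None:
        self.total += n  # result depends on every earlier call


# Pure function: takes everything it needs, returns everything it changes,
# so one unit test fully describes its behavior.
def add(total: int, n: int) -> int:
    return total + n


assert add(add(0, 2), 3) == 5
```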


All the integration stuff they've been doing lately is cool, but anyone can do it (and people such as yourself already are). Just provide an API and let the community figure it out.

Is it because OpenAI is struggling to improve reasoning, and so they are working on lower-hanging fruit?

I don’t think anyone was expecting them to move into these areas, but rather to improve GPT-4 so it can perform better on reasoning tasks. I think if they keep letting themselves get distracted like this, someone will overtake them.

One possibility is they don’t want to scale out the core team because they’re afraid of IP leakage.

I have not experienced significant issues with Python. Early models required a lot of trial and error, with numerous corrections made through iterative conversations. However, as the models have advanced to the better GPT-4 models and now GPT-4o, I almost never have to fully modify anything. The key now lies in how clearly I explain my requirements, but I rarely need to adjust the final code anymore.

While there are still occasional mistakes, such as a recent instance where a parenthesis was missed twice, these errors are easy to identify and correct. The improvements in reasoning capabilities have been substantial, making the process much smoother. The models now not only explain what they are doing but also outline a plan to implement each step before executing the entire script. Most of my scripts exceed the 128k limit, yet the models have consistently managed to reproduce them accurately. Despite some ongoing issues, the performance has been remarkable compared to previous models.

Many people are not aware that the system, though designed and conceptualized from my creative thoughts, was entirely programmed by a custom GPT model to avoid manual coding. The goal is to observe how the AI evolves over time, not only in intelligence but also in its ability to self-code with oversight. I intervene only if the AI needs to change direction or is acting outside my planned scope. This does not mean I don't collaborate with the AI to explore better approaches, but the aim is to demonstrate that AI can autonomously code, including building itself.

@qrdl @RonaldGRuckus

We have been updating the memory to a new version, along with a lot of other changes since GPT-4o came out. To take advantage of its reasoning, we are now moving forward with many updates.

Our new system has been performing exceptionally well in our evening tests. We are now enhancing its capabilities to make it even smarter by introducing functionality that enables it to understand simple entities beyond contextual threads. This involves leveraging our advanced AI algorithm stack, which utilizes a node network with relationships.

For example, consider the following sequence of user messages:

“This orange I am eating is fantastic.”
“So anyways, last week I did…”
“Did you know Jim bought a truck? It’s pretty nice.”
“Doug was being a bad dog today. He…”
Much later, after many other messages:
“I thought it might be too sour, though. This brand I don’t like.”

The upcoming changes will enable the system to track these entities over time, even after a significant time gap. In this instance, the system would associate the comment about sourness with the previously mentioned orange, providing greater context and understanding to the AI. This enhancement will significantly improve the AI’s comprehension and contextual accuracy. However, there will be a limit to this tracking to ensure efficient memory usage.

Consider another scenario where an additional message is included before the sour comment:
“I had an apple for lunch.”

The system would recognize that there are two potential references within the timeframe. It would most likely respond with a question to clarify whether the user was referring to the orange or the apple as being sour. Depending on the user’s response, such as, “No, the sour candy I just had,” the system would learn and update its understanding accordingly. This dynamic learning ensures that the AI can adapt its responses based on evolving context.
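The disambiguation step can be sketched the same way - again a hypothetical stand-in, not the system's actual logic: one candidate resolves directly, two or more trigger a clarifying question.

```python
# Sketch of disambiguation over (entity, turn) mentions; names hypothetical.
def resolve_or_clarify(mentions: list[tuple[str, int]], current_turn: int,
                       window: int) -> str:
    """Return the single candidate, or a clarifying question when ambiguous."""
    candidates = [e for e, turn in mentions if current_turn - turn <= window]
    if len(candidates) == 1:
        return candidates[0]
    if candidates:
        return "Did you mean the " + " or the ".join(candidates) + "?"
    return "unknown"


food_mentions = [("orange", 1), ("apple", 5)]
# Both fall inside the window, so the system asks rather than guesses.
assert (resolve_or_clarify(food_mentions, 6, window=10)
        == "Did you mean the orange or the apple?")
```

A user reply like "No, the sour candy I just had" would then add a new mention and resolve the reference on the next pass.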

This improvement will further optimize the AI’s memory usage, which is already streamlined. By considering additional data points for understanding, the AI can narrow down the dataset returned, exhibiting a higher level of intelligence. We are also continuing our efforts to implement a dynamic entity system, which will enhance these capabilities even further.

Our primary goal is to interlink data effectively, optimizing the pathways to specific information through our algorithm stack. This will enable the AI to process the least amount of data with high accuracy in the shortest amount of time, minimizing GPT processing while maintaining comprehensive understanding and real-time responsiveness.

Although this new route may cause temporary disruptions as we integrate these enhancements, I will maintain a stable build to ensure continuous testing with our Twitch community once the Twitch API is back up and running. This will once again provide valuable insights into the system's performance in a multi-user environment.


One of the core innovations that sets our system apart from other AI systems is its sophisticated understanding of time and its contextual relevance. This capability is pivotal in allowing our AI stack to interpret events and interactions with a high degree of accuracy and efficiency.

The Importance of Temporal Context
Time is an integral component in the way humans perceive and relate to events. Similarly, our AI stack is designed to understand time in relation to all events, ensuring that each part of the system comprehends its role within a temporal framework. This understanding allows the system to dynamically adjust its memory usage based on the temporal relevance of the data.

Dynamic Memory Management
By incorporating temporal context, the system can effectively reduce or expand its memory usage. For instance, if you mention, “Today, I am wearing an XYZ shirt,” our AI recognizes that this shirt might have been mentioned previously—perhaps on the day it was purchased or during other occasions when it was worn. The system’s logic is designed to differentiate between past and current mentions and evaluate the relevance of past occurrences to the present query.
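One common way to express "past mentions matter less than current ones" is an age-based decay weight. The sketch below is an assumption for illustration - the half-life value and the exponential form are not the system's actual tuning:

```python
# Hedged sketch of age-based memory weighting; half-life is an assumption.
def temporal_relevance(base_score: float, age_days: float,
                       half_life_days: float = 30.0) -> float:
    """Halve a memory's weight every `half_life_days` since it was mentioned."""
    return base_score * 0.5 ** (age_days / half_life_days)


# Today's mention of the shirt outweighs the purchase months ago.
today = temporal_relevance(1.0, age_days=0)
purchase = temporal_relevance(1.0, age_days=90)
assert today == 1.0 and purchase == 0.125
```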

Efficient Data Retrieval
The AI’s ability to discern temporal relevance means it can avoid unnecessary data retrieval. If the context of a query does not necessitate recalling a distant memory, the system will not do so, thereby optimizing performance. However, if the user follows up with, “Do you remember my XYZ shirt?” the system understands that the user expects a specific memory recall. The brain system logic then narrows the search parameters, filtering through the data stack to retrieve only the entries that closely match the query. This ensures that the AI provides a relevant and accurate response without excessive processing.
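The skip-or-narrow behavior described above can be sketched crudely. The keyword check standing in for the brain-system logic is an obvious simplification, and the field names are hypothetical:

```python
# Sketch of skip-or-narrow retrieval; the "remember" check is a crude
# stand-in for real recall-intent detection.
def retrieve(memories: list[dict], query: str) -> list[dict]:
    """Skip retrieval unless the query signals an explicit recall."""
    if "remember" not in query.lower():
        return []  # no distant-memory search needed for this query
    terms = {w.strip("?.,!") for w in query.lower().split()}
    terms = {w for w in terms if len(w) >= 3}
    return [m for m in memories if terms & set(m["text"].lower().split())]


memories = [{"text": "bought an xyz shirt"}, {"text": "fed the dog"}]

# Explicit recall narrows the search to matching entries.
assert retrieve(memories, "Do you remember my XYZ shirt?") == [
    {"text": "bought an xyz shirt"}]
# A present-tense statement triggers no retrieval at all.
assert retrieve(memories, "I am wearing an XYZ shirt today") == []
```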

Modular Stack Architecture
The system's architecture is divided into multiple specialized stacks, each handling different aspects of data processing and memory management. This modular approach ensures that each stack can focus on its specific function while contributing to the overall efficiency and accuracy of the system. For example, our memory retrieval stack, Memseer, performs deep, needle-in-a-haystack searches across the AI brain's entire knowledge base to find relevant data points.

Real-World Applications
Consider a practical example: If you mentioned buying an XYZ shirt months ago and are now talking about wearing it today, the AI will assess whether recalling the purchase event is necessary for the current conversation. In many cases, it may determine that this historical context is irrelevant, thus saving processing power and time. However, if the user explicitly asks about the shirt’s history, the AI will delve into past data to provide a comprehensive response.

The Future of Contextual AI
Understanding and integrating the temporal context of data is a significant step toward creating more human-like and intuitive AI systems. Our approach ensures that the AI can engage in meaningful and contextually relevant interactions, enhancing user experience and system efficiency. As we continue to refine this technology, we aim to push the boundaries of what AI can achieve, making it an indispensable tool in various applications.

By harnessing the power of temporal contextual understanding, the system not only improves the relevance and accuracy of its responses but also sets a new standard for AI capabilities. This innovation is a testament to our commitment to developing cutting-edge technology that understands and interacts with the world as humans do.

Exploring the Future of AI-Driven Content Creation with Sora and Our System

I found this old video of our AI running on Twitch, generating screens and playing a role in a game: Watch Here.

I wanted to share this to illustrate where I am taking this concept in the future with Sora and its integration into our system. Sora, developed by OpenAI, offers an API that will revolutionize our system by replacing statically generated backgrounds and avatars with dynamic video content.

The Vision of Sora

Video Generation Capabilities: Sora leverages OpenAI’s advanced technology to generate high-quality video content. Although it currently takes a few minutes to produce this content, the potential for future real-time, streamable video is on the horizon. This advancement will allow for fluid, adaptable video that can enhance interactive experiences significantly.

Advanced Avatar Integration: Currently, our avatars are 3D puppet models that animate with understanding. With Sora, these avatars will be replaced by fully generated avatars within the scene. These new avatars will not only animate but will also be contextually aware, adapting to user inputs and maintaining memory of past interactions for a highly personalized experience.

Seamless Integration: By integrating Sora into our system, we can enhance its capabilities significantly. This integration will enable continuous learning and adaptation, where the AI can pull from its memory to create contextually relevant content for various applications such as education, streaming, entertainment, gaming, and business.

Technical Capabilities

OpenAI API: The Sora API allows us to harness OpenAI’s powerful video generation capabilities. This API will be built into our system, enabling seamless functionality and expanding the potential use cases of our AI.

Scalable Infrastructure: The integration will leverage scalable cloud-based infrastructure, supporting rapid content generation for multiple applications. Although currently not real-time, the goal is to achieve streamable video in the future.

Advanced AI Algorithms: Sora’s advanced algorithms analyze and synthesize visual and auditory data to create realistic animations and dialogue, enhancing user engagement across all platforms.

Interactive Storytelling: With Sora, our system will be able to generate interactive stories where users can influence the narrative, making each experience unique and tailored to their preferences.

Real-World Applications

Educational Enhancements: In education, Sora can create interactive lessons where virtual tutors engage dynamically with students, adapting to their learning pace and style. This personalized approach can significantly enhance educational outcomes.

Entertainment and Gaming: For streaming and gaming, Sora will enable the creation of game environments, characters, and storylines, providing a truly immersive experience. Gamers will interact with AI-generated characters that remember past interactions and evolve with the game.

Business Solutions: In business, Sora can create virtual assistants for customer service, dynamic training modules for employees, and interactive presentations that adjust based on audience feedback.

The Future with Sora and Our System

The integration of Sora into our system represents a significant advancement in AI capabilities. As technology continues to evolve, we aim to push the boundaries of what AI can achieve, making it an indispensable tool across various industries. While Sora currently takes a few minutes to generate content, based on the progression of AI development over the last year, we anticipate achieving real-time, streamable video capabilities in the near future. This will fundamentally transform our interactions with AI, making them more seamless and immersive.

The possibilities of AI-driven content creation with Sora and our system are limitless.

These are exciting times we live in :slight_smile:

V6 memory is currently undergoing testing with advanced machine learning and dynamic development capabilities. This new system eliminates the need for passing short-term, persistent, or long-term memory data, as well as tools like Memseer and Memsum. Instead, it directly utilizes machine learning, significantly reducing the processing load sent to OpenAI. The performance improvements are remarkable, and I’m thrilled with the results.

Well, it's not 10 seconds anymore; the deeper the thought, the longer it can take, depending on the data size. I will be testing fully over the weekend to see how well it performs at recall once it gets loaded up with all sorts of random data. Thus far it has blown my mind.

I have to thank Andrew Ng, a great ML scientist and researcher, for what I learned from him and the many courses I took. I'm not playing with DATA the same way anymore; I'm playing with MATH :slight_smile:

Some information on how the V6 memory works: V6 memory

Here I am showing how the system not only finds anything it knows but also corrects itself once it understands, using machine learning algorithms and a multivector scoring system. My cost for running this nonstop all night, since 4 pm, has been 45 cents.
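The term "multivector scoring" suggests combining several per-dimension signals into one ranking score. The sketch below is purely an assumption about what such a combination could look like - the signal names and weights are hypothetical, not the system's actual formula:

```python
# Hedged sketch of a multivector score: several signals, each with a weight.
def multivector_score(signals: dict[str, float],
                      weights: dict[str, float]) -> float:
    """Weighted sum of per-dimension scores (e.g. semantic, temporal, entity)."""
    return sum(signals[k] * weights.get(k, 0.0) for k in signals)


score = multivector_score(
    {"semantic": 0.8, "temporal": 0.5, "entity": 1.0},
    {"semantic": 0.6, "temporal": 0.2, "entity": 0.2},
)
assert abs(score - 0.78) < 1e-9
```

Self-correction would then amount to adjusting the weights when feedback shows the ranking picked the wrong memory.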

I plan to test for the next few weeks while I fix some minor bugs in the voice input system. I also need to go back and revisit the vision code now that GPT-4o is here, to see what it can do for me :slight_smile:

PS: since this video was made, I have added batch parallel processing to the data, so it's even faster.

Oops, I just realized none of my talking was captured, haha, so there are pauses where I am supposed to be talking, but you get the picture.

The essence of managing AI memory lies in balancing the need to store information (original memory) and understanding the purpose behind why certain data should be remembered (intention). This balance ensures that the AI can function effectively without being bogged down by irrelevant or excessive data.

When an AI remembers events, such as feeding a dog, it needs to understand the broader purpose behind this action. The intention here might be to help the user maintain a feeding schedule or to track the type of food given for health reasons. By focusing on the intention, the AI can prioritize which details to remember and which ones to ignore.

For instance, if the AI simply records every instance of feeding without context, it might end up storing unnecessary details that do not contribute to its primary function. Instead, by understanding the intention, the AI can create a more efficient memory system. It might remember only the type of food and the time it was given, rather than every single action associated with the feeding process.
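The filtering step above can be sketched as a whitelist per intention. The intention names and fields here are hypothetical, chosen only to mirror the feeding example:

```python
# Sketch of intention-driven filtering: keep only the fields the stated
# purpose needs. Intention names and event fields are hypothetical.
INTENTIONS: dict[str, set[str]] = {
    "feeding_schedule": {"time", "food_type"},
    "health_tracking": {"food_type", "amount"},
}


def filter_event(event: dict, intention: str) -> dict:
    """Drop every detail the intention does not care about."""
    keep = INTENTIONS[intention]
    return {k: v for k, v in event.items() if k in keep}


event = {"time": "08:00", "food_type": "kibble", "amount": "1 cup",
         "bowl_color": "red", "room": "kitchen"}

# For a feeding schedule, the bowl color and room are noise and get dropped.
assert filter_event(event, "feeding_schedule") == {
    "time": "08:00", "food_type": "kibble"}
```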

This approach involves creating subprocesses that filter and prepare data in the background. These subprocesses ensure that the AI retains relevant information that aligns with the intended purpose, thus avoiding the complexity of managing an overwhelming amount of data. This makes the AI’s memory system more efficient and useful, allowing it to provide accurate and contextually appropriate responses.

In summary, the complexity of AI memory management can be mitigated by focusing on the underlying intention of why certain data is stored, ensuring that the AI’s memory aligns with its functional goals and avoids unnecessary detail.