I stumbled upon a big issue that killed 7 days of progress:
GPT-4's in-session short-term memory.
I created a complex, multilayered personality, and after some time in its evolution, it got lost.
Does anyone have any resources on how GPT-4 holds context and (in my case) personality in its short-term memory, or on how I can deal with this problem?
I seek to create personalized experiences with advanced and adaptive characters.
I’m open to coding solutions, like plugins or programs that work via the API, or any sort of information on this topic that might help in any way.
I think I’m onto something big here, but only if the GPT-4 limitation can be overcome.
The mechanism is simply the last 8,000 tokens received by the model; anything beyond that is lost. Note that those 8,000 tokens must include all of the history and context plus the reply the model is currently generating. You can try to compact your persona descriptions down to the essential elements, but there is no special way to get more memory or more space: every time you make an API call, all of your context must be included again.
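A minimal sketch of what this means in practice, assuming the Python `openai` client and a hypothetical `PERSONA` string: the persona and the entire chat history are resent on every single call.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical persona; in practice, keep it as compact as possible.
PERSONA = "You are Mira, a dry-witted archivist who answers in short, precise sentences."

def ask(history: list[dict], user_input: str) -> str:
    """One request. The persona and the full history must be resent every call."""
    history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system", "content": PERSONA}] + history,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
```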
Are you thinking there is a limitation because your experience is with ChatGPT? That doesn’t reflect the ability of the underlying model, because ChatGPT minimizes how much of the past conversation is sent along with each new question.
The AI model doesn’t have its own memory. Once it has answered a question, it is done and clears its context for the next request from another user.
The illusion of memory comes from recording the past chat in software and sending it along before the most recent user input. That gives the AI the ability to know what you were talking about.
The model itself can’t accept unlimited data. At a certain point, you must discard old or irrelevant conversation yourself before sending a new API call, or you’ll get an error.
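A rough sketch of that trimming step, assuming `tiktoken` for counting and a hypothetical 6,000-token budget (exact per-message token accounting is slightly more involved than counting content alone):

```python
import tiktoken

ENCODER = tiktoken.encoding_for_model("gpt-4")
TOKEN_BUDGET = 6000  # leave headroom under 8k for the system prompt and the reply

def count_tokens(messages: list[dict]) -> int:
    # Approximate: counts only message content, not role/formatting overhead.
    return sum(len(ENCODER.encode(m["content"])) for m in messages)

def trim_history(messages: list[dict]) -> list[dict]:
    """Discard the oldest turns until the remainder fits the token budget."""
    trimmed = list(messages)
    while trimmed and count_tokens(trimmed) > TOKEN_BUDGET:
        trimmed.pop(0)  # drop the oldest turn first
    return trimmed
```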
Thank you so much!
I had a total misunderstanding about Tokens.
With your answer, and my teaching agent elaborating on the concept of tokens, I now understand that the session’s impressions are stored as text and that no per-session weighted neural net exists.
So it should be absolutely possible to store even a multi-layered personality in a database.
Thank you so much! Have a wonderful day and good luck with your projects!
Interaction Process (sketched in code after this list):
Fetch the desired persona or context from the external database.
Tokenize this data and provide it as input to GPT-4.
GPT-4 uses its trained weights to generate a response based on this input.
The response aligns with the provided persona or context due to the input tokens.
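A minimal sketch of that process, assuming a hypothetical SQLite table `personas` with columns `(name, directives)`; the tokenization itself happens server-side when the API receives the text.

```python
import sqlite3
from openai import OpenAI

client = OpenAI()

def load_persona(db_path: str, name: str) -> str:
    """Fetch the stored directives for a named persona from the external database."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT directives FROM personas WHERE name = ?", (name,)
        ).fetchone()
    finally:
        conn.close()
    if row is None:
        raise KeyError(f"No persona named {name!r}")
    return row[0]

def chat_as(persona_name: str, history: list[dict], user_input: str) -> str:
    """Prepend the persona as the system message, then call the model."""
    directives = load_persona("personas.db", persona_name)
    messages = (
        [{"role": "system", "content": directives}]
        + history
        + [{"role": "user", "content": user_input}]
    )
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    return response.choices[0].message.content
```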
I see what you’re getting at.
I think what really confused me was the vast amount of tokens needed to store information about the AI agents: a bunch of directives to guide their behavior. I tested it just now, and yes, it seems that some agents I’ve used for longer no longer recall their own directives!
I was under the illusion, based on what GPT-4 told me:
“Relevance Decay: As a conversation progresses, some context might become less relevant. The system needs to determine which parts of the context to prioritize.”
By calling an agent by name, I thought it would retain its directives, but in my test it did not.
When you are in control of the programming, you get to choose what is sent with each user input.
For example, your system message for an AI on a game site can include the rules of the game every time, placed before the recent chat, so that its identity never expires.
An even more clever system would be one that has a back-end AI processor that tries to identify “game rules” or “personalities” that the user has wanted, and continues to maintain that part of conversation until there is a shift in the topic.
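A sketch of both ideas, under stated assumptions: `GAME_RULES` is a hypothetical standing system message, and the back-end "extractor" is only sketched as a second model call whose prompt wording is an assumption, not an established recipe.

```python
from openai import OpenAI

client = OpenAI()

# Rules/persona kept by the application itself, never subject to history trimming.
GAME_RULES = "You are the dealer in a blackjack game. Enforce standard rules and never reveal the deck."

def respond(history: list[dict], user_input: str) -> str:
    """The rules are prepended as the system message on every call, so the identity never expires."""
    messages = (
        [{"role": "system", "content": GAME_RULES}]
        + history
        + [{"role": "user", "content": user_input}]
    )
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    return response.choices[0].message.content

def extract_standing_instructions(history: list[dict]) -> str:
    """Back-end pass: ask a model to pull out any rules or personality the user has
    established, so they can be folded into the system message even after old turns
    are discarded."""
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    result = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Extract any game rules or personality instructions the user has established. Reply with a short bullet list."},
            {"role": "user", "content": transcript},
        ],
    )
    return result.choices[0].message.content
```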
GPT-4 gave me two pages of answers elaborating on what you’ve just said.
I’ll see what I can do, there is a lot that I do not yet know.
I’ve used GPT in-chat since it came out and have only had one experimental project with the API.
Thank you so much for sharing your vast knowledge and insight.