I am fascinated by the memory option and would like to experiment with it through the API. Is there any way I can turn memory on and off?
The API currently does not offer a memory function; it is only being rolled out across the ChatGPT interface for now.
But if you are interested in building a custom solution for retaining memory, it might be worth checking out some of the other forum posts. Some members, such as @darcschnider, have been doing advanced work on memory retention. See the post here:
You pretty much have to build your own memory design. They do have some sort of memory test going on in ChatGPT for some users, but I don't believe that is coming to the API. If you want some insights on where to start, I can help.
Does anyone have more information about this?
Seems like OpenAI has added some significant intelligence not exposed to the API.
I do think ChatGPT is more intelligent and the API lags behind significantly. It could be performance considerations, or they want users to use ChatGPT more, or a way to keep their core capability one generation ahead (if API === ChatGPT, other GPTs can learn from it).
I would be surprised if that were the case, tbh. I think the FTC might have an issue with it as well, since it would be a deceptive business practice (bait and switch), unless it's stated somewhere in the ToS.
But memory is definitely something that isn't in the API.
ChatGPT uses a small, persistent memory design that you can control. It captures key information based on your interactions or specific projects you mention. Additionally, it includes pre-configured catches for people and preferences. Similar to Kruel AI, it operates through an API-driven memory system, utilizing a memory store and logic to build AI responses. This ensures the system consistently works with accurate data and leverages stacked logic to maintain accuracy.
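To make that concrete, here is a minimal sketch of what such an API-driven memory store might look like (all names here, like MemoryStore and remember, are hypothetical, not anything Kruel AI or OpenAI exposes): key facts are captured as they come up and rendered back into the system prompt on every call.

// Hypothetical memory store: your code decides what is kept and what is injected.
interface MemoryEntry {
  key: string;        // e.g. "user.name" or "project.preferences"
  value: string;      // the fact to retain
  updatedAt: number;  // useful later when pruning stale entries
}

class MemoryStore {
  private entries = new Map<string, MemoryEntry>();

  remember(key: string, value: string): void {
    this.entries.set(key, { key, value, updatedAt: Date.now() });
  }

  forget(key: string): void {
    this.entries.delete(key);
  }

  // Render every retained fact as a block you can prepend to the system prompt.
  asSystemPrompt(): string {
    const facts = [...this.entries.values()]
      .map((e) => `- ${e.key}: ${e.value}`)
      .join("\n");
    return facts ? `Known facts about the user:\n${facts}` : "";
  }
}

The store itself can be an in-memory map, a database, or a vector index; the important part is that your own logic, not the model, decides what gets kept and fed back in.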
I wonder what was so horrible it had to be hidden up there.
Haha yeah I am surprised too. I mentioned two frameworks that are looking into different ways of doing memory, and then it was marked as ads. I edited but it doesn’t seem to help.
Any updates on this topic? I’ve been getting exactly the same responses when I pass slightly similar prompts. If this is not because of the use of memory, what could it be?
Has anyone got a good example of implementing this? Getting the model to submit items worth remembering to a tool seems easy enough, but how about retrieving them?
Are you embedding them into vectors that the model can query if it thinks they're relevant, or just dumping them into an initial system prompt?
Given there's no way for the model to know what has been remembered, I guess it all has to go into the prompt, but then how do you prune what is still worth remembering to stop it from getting too big? Ask the model to do that each time?
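For illustration, the naive version I have in mind looks something like this (names like loadMemories are made up, and I'm not sure how well it scales):

// Dump every remembered item into the system prompt on each turn.
const memories: string[] = loadMemories(); // however you persist them
const systemPrompt =
  "You are a helpful assistant.\n" +
  "Things you remember about the user:\n" +
  memories.map((m) => `- ${m}`).join("\n");

// Periodically ask the model itself to prune the list so it doesn't grow unbounded.
const pruneInstruction =
  "Here is the current memory list. Return only the entries still worth keeping, " +
  "merging duplicates and dropping anything stale:\n" +
  memories.map((m) => `- ${m}`).join("\n");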
Welcome to the community!
One thing you can do nowadays is use CoT to "discover" memories. If you can get the model to behave like this in a mode A:
“<reasoning>Hmm let's see. To deal with this, I would first need to know about <x>. <INQ>”
stop sequence: <INQ>
You embed that statement, use it to retrieve from your memory store, paste the results into the context somewhere, then continue in mode B:
“I know that... | I don't know” (continue reasoning) “</reasoning><response>{response}</response>”
…
Remember that you have full control over all of the model’s context if you use a completion model. Anything you add, you can also take away again.
On the next query, you simply don't include the previously retrieved messages again.
So in your code you’d have:
// assuming a simple Message shape like { role: "system" | "user" | "assistant"; content: string }
const conversation: Message[] = [];         // the chat history so far
const pre_instructions: Message[] = [];     // your system prompt / setup messages
const post_instructions_A: Message[] = [];  // "To deal with this, I would first need to know about..."
const post_instructions_B: Message[] = [];  // "I know that... | I don't know"
const retrieved: Message[] = [];            // retrieved facts rendered as messages: empty in mode A, filled in mode B

const in_mode_A = retrieved.length === 0;   // which mode this request is in

const augmented_conversation: Message[] = [
  ...pre_instructions,
  ...retrieved,
  ...conversation,
  ...(in_mode_A ? post_instructions_A : post_instructions_B),
];

const stop_sequences = ["<INQ>"];
Then you'd just send augmented_conversation against the endpoint.
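A minimal sketch of that call with the Node SDK (assuming your Message type already matches the { role, content } shape the chat completions endpoint expects, and substituting whatever model you actually use):

import OpenAI from "openai";

const openai = new OpenAI(); // picks up OPENAI_API_KEY from the environment

const completion = await openai.chat.completions.create({
  model: "gpt-4o",                  // placeholder; use whichever model you prefer
  messages: augmented_conversation, // pre + retrieved + conversation + post, as above
  stop: stop_sequences,             // cuts generation at "<INQ>" in mode A
});

const assistantText = completion.choices[0].message.content ?? "";

In mode A you embed whatever came back before the stop sequence and use it to retrieve memories; in mode B you send the same structure again with retrieved filled in.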
Then you could have an optional clean_conversation function that replaces the last assistant message with just its <response> content to compact information further.
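That clean_conversation step could be as simple as this sketch (assuming the assistant's reply wraps its final answer in <response> tags as instructed above):

// Replace the last assistant message with just the content of its <response> tag.
function clean_conversation(conversation: Message[]): Message[] {
  const last = conversation[conversation.length - 1];
  if (!last || last.role !== "assistant") return conversation;
  const match = /<response>([\s\S]*?)<\/response>/.exec(last.content);
  if (!match) return conversation;
  return [...conversation.slice(0, -1), { ...last, content: match[1].trim() }];
}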
Of course, you can also use JSON if that’s more convenient for you.