Persistent, Structured Memory Strategies

Hi all, if this is the wrong place for this topic, I appreciate you pointing me in the right direction, thanks!

My end goal is to have ChatGPT store, modify, analyze, and update tabular data in a way that allows for real-time retrieval in future sessions. To start, my use case is a dosage log for some medicine I take daily. The log records the date/time, amount (ml), source, and notes. ChatGPT faithfully created this log in the format I requested, saying that it was stored in some hidden, persistent structured data warehouse, and as I’ve recited my dosages each day, it has maintained this table.

But after two months of data, the familiar feeling of little errors (like when you’ve lost old context) crept in. So I looked a little deeper, and it seems like ChatGPT doesn’t have any kind of hidden structured data store at all, at least not one that the user can do anything with. All of my log entries have been stored in prose, in the memory feature, along with instructions on the table structure, etc. It must recompile this data each time, into whatever temporary space it has. This works great for smaller tables, but I can easily see now that it will not work for any sufficiently large data set, such as a years-long medicine log.

Hence my need for structured, tabular memory space which ChatGPT can use for whatever purpose requires this. So of course I asked it what I should do, and the most salient idea for me was to create an API of my own that implements CRUD operations on a SQLite DB (and any other supporting methods), and then configure the API Endpoints as a Custom Tool (or whatever) in ChatGPT. This way, it can call the API to add a log entry, call again to retrieve the current, real-time data, etc. Voila! Persistent, structured, tabular memory available between sessions.
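
To make the idea concrete, here is a minimal sketch of the kind of API I have in mind, assuming FastAPI and a local SQLite file; all the names here (dose_log.db, /entries, the columns) are placeholders, not a finished design:

```python
# Minimal CRUD sketch: FastAPI in front of SQLite.
# Run with: uvicorn app:app
import sqlite3
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
DB = "dose_log.db"  # hypothetical file name

class Entry(BaseModel):
    taken_at: str       # ISO 8601 date/time
    amount_ml: float
    source: str
    notes: str = ""

def conn():
    c = sqlite3.connect(DB)
    c.row_factory = sqlite3.Row
    return c

with conn() as c:
    c.execute("""CREATE TABLE IF NOT EXISTS entries (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        taken_at TEXT, amount_ml REAL, source TEXT, notes TEXT)""")

@app.post("/entries")
def add_entry(e: Entry):
    # The GPT calls this to append a log row.
    with conn() as c:
        cur = c.execute(
            "INSERT INTO entries (taken_at, amount_ml, source, notes) "
            "VALUES (?, ?, ?, ?)",
            (e.taken_at, e.amount_ml, e.source, e.notes))
        return {"id": cur.lastrowid}

@app.get("/entries")
def list_entries(limit: int = 50):
    # The GPT calls this to retrieve the current, real-time data.
    with conn() as c:
        rows = c.execute("SELECT * FROM entries ORDER BY taken_at DESC "
                         "LIMIT ?", (limit,)).fetchall()
        return [dict(r) for r in rows]
```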

I am posting here as a reality check before I embark on this. It seems sensible, but are there other designs that are easier or better? How do you handle this stuff, personally? Many thanks!

-Richard

7 Likes

Well, that was easier than I thought, and I’m running end-to-end using ngrok as a reverse proxy. Table data, accessible and real-time! If there is any interest in this I will say more.

4 Likes

It’s so cool that one can simply go ahead and implement their own intermediate memory feature into the tool!

What I have seen a lot over the years is that at some point the model will still be overwhelmed by the many different memories and the question of what’s relevant right now.

And from there we have seen many different approaches to build advanced personal assistants.

Maybe that’s something already on your roadmap?

2 Likes

I think the problem is that my advanced personal assistant may not have the same use cases as your advanced personal assistant. But I’m always trying to make it more useful! This is the most intricate thing I’ve done so far, and it opens all kinds of new doors for me. I have a lot of hobbies, thus many logs.

Creating a fully custom orch and having your own db, which is injected into the context window, is the path for you. Try to keep it within the gpt-5 mini model reasoning level; otherwise the latency is a problem.

2 Likes

Hi, thanks for the reply! I take it by “fully custom orch” you mean a custom orchestrator, but can you help me understand what that looks like when implemented? Is it just another GPT layer that quickly determines which tables we need?

Right now I’ve got CRUD API endpoints for several different tables of data, and it works surprisingly well. I also defined a Chart API that stores and returns chart configs (in the form of the actual Python code the GPT used to generate the chart in question). These requests all go back to my local machine via reverse proxy right now, and the SQLite DB is easily accessible from outside the GPT.
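
In case it helps anyone picture it, the chart side is conceptually something like this rough sketch; the table and column names are illustrative, and exec() on stored code is only tolerable here because I’m the only one writing to this DB:

```python
# Rough sketch of the Chart API idea: persist the plotting code the GPT
# generated, then re-run it later against fresh rows. Names are illustrative,
# and exec() is only acceptable when you are the sole writer of the DB.
import sqlite3

def save_chart(db_path: str, name: str, code: str) -> None:
    with sqlite3.connect(db_path) as c:
        c.execute("CREATE TABLE IF NOT EXISTS charts "
                  "(name TEXT PRIMARY KEY, code TEXT)")
        c.execute("INSERT OR REPLACE INTO charts VALUES (?, ?)", (name, code))

def render_chart(db_path: str, name: str, rows: list) -> None:
    with sqlite3.connect(db_path) as c:
        (code,) = c.execute("SELECT code FROM charts WHERE name = ?",
                            (name,)).fetchone()
    # The stored snippet is expected to read `rows` and produce a figure.
    exec(code, {"rows": rows})
```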

So I have created a Custom GPT that has these APIs defined as Actions, along with Instructions outlining how it should work, and I am able to add log entries and generate fresh charts… most of the time. But I have a feeling this isn’t what you mean by fully custom orch?

Hi! Yeah, your project sounds really cool. When I read your earlier comments I got the impression you were using ChatGPT the website, and by fully custom I meant you set up your own orchestrator and call the gpt-5 LLM directly using chat completions. This ensures a few things: 1) you avoid the transcription compression of ChatGPT, so you can be assured of exactly what is on the wire, i.e. in the context window; 2) you don’t have the planner LLM autonomously calling tools, and instead you can be deterministic about things. You can have your own RAG for unstructured data and use that in combination with your structured data. Anyways, I hope my thoughts help, and feel free to ask me any questions if my comments are at all useful… all the best
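
To make that concrete, here is a bare-bones sketch of what I mean, assuming the openai Python SDK; the model name and session table layout are placeholders, not a prescription:

```python
# Bare-bones orchestrator sketch: you decide exactly what goes on the wire.
# Assumes the openai Python SDK; model and table names are placeholders.
import sqlite3
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
db = sqlite3.connect("assistant.db")
db.execute("""CREATE TABLE IF NOT EXISTS session (
    id INTEGER PRIMARY KEY AUTOINCREMENT, role TEXT, content TEXT)""")
db.commit()

def turn(user_msg: str, structured_context: str) -> str:
    db.execute("INSERT INTO session (role, content) VALUES (?, ?)",
               ("user", user_msg))
    # Deterministic injection: you pick which rows ride along, not a planner.
    history = [{"role": r, "content": c}
               for r, c in db.execute("SELECT role, content FROM session")]
    messages = [{"role": "system",
                 "content": "Relevant log rows:\n" + structured_context}]
    messages += history
    resp = client.chat.completions.create(model="gpt-4o-mini",
                                          messages=messages)
    answer = resp.choices[0].message.content
    db.execute("INSERT INTO session (role, content) VALUES (?, ?)",
               ("assistant", answer))
    db.commit()
    return answer
```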

1 Like

I had a similar sobering experience. Mine was for tracking vehicle maintenance entries and weekly fluid checks. I even used a separate project to keep it all together. Worked beautifully for weeks, and then I asked it recently to recall the log entries. It was missing several weeks between captured entries.

What confused me the most was that in previous chats, it had recalled data perfectly. And I literally told chat that it had just shipwrecked me.

The saying some repeat is true: ChatGPT is a brilliant but forgetful assistant.

My fallback strategy is to capture my data in a Google Sheet, using a custom GPT and make.com for connectivity and CRUD-like scenarios (processes). The beauty of Google Sheets is that the data is truly “remembered” because it is captured, but more importantly, it can be recalled and “injected” into the session conversation, and then chat can run analysis on the retrieved data. No building fancy analysis tools. Just capture the data, retrieve the data, ask your questions about the data.

1 Like

Depending on data size and how many tokens you want to burn, you can fit the data in the prompt each time, assuming smallish data. For bigger data, you can correlate semantically using embeddings or keywords, and then select the top hits to go into the prompt, and not the entire table.
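
For illustration, a sketch of the embeddings route, assuming the openai Python SDK; the embedding model named here is just one current option, and in practice you would cache the row embeddings rather than recomputing them per query:

```python
# Semantic top-k selection: embed the query and each row, then prompt with
# only the best matches instead of the entire table.
import math
from openai import OpenAI

client = OpenAI()

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small",
                                    input=texts)
    return [d.embedding for d in resp.data]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, rows, k=5):
    q_vec = embed([query])[0]
    row_vecs = embed(rows)  # cache these in real use
    scored = sorted(zip(rows, row_vecs),
                    key=lambda rv: cosine(q_vec, rv[1]), reverse=True)
    return [row for row, _ in scored[:k]]
```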

1 Like

Hey, thanks for that Google Sheets idea! My method is clever but not very useful… my laptop is the back-end data store so whenever it’s offline, all the API calls fail. I would much rather read and write Google Sheets, but it hadn’t occurred to me until you said this that I could integrate Sheets directly this way. Awesome!

I fiddled with this earlier today and couldn’t quite get it to work. I didn’t know where make.com fits in, so I set that aside and tried to define a few API endpoints (in the OpenAPI schema language, on the Custom GPT config screen) pointing at a specific Sheets doc. I got it to the point where my GPT was sending a request when there was no auth enabled (the response being that auth is needed), but not getting any kind of response when I enabled OAuth. All the changes I tested – real vs. bogus client secret, etc. – suggest that it’s not even getting to the OAuth part, and just sending an empty response for [reasons]. I gave up at that point (for now) but it seems totally possible to do this.

Curious how you did it with make.com?

Re: the log hallucinations, I was also getting pristine data, for the most part, and my GPT actively tried to convince me that it had the log stored as a table in some special structured memory space that I didn’t have access to, for internal use. That sounded totally plausible, until it started glitching, so I got suspicious and searched around for documentation on this hidden feature. I think it was trying to dupe me!

4 Likes

I am a hobbyist developer, so forgive me if I get terminology incorrect or don’t understand it.

There are of course a number of automation tools out there, such as Zapier, n8n, IFTTT, and make.com. I’ve simply used and become accustomed to the make.com UI and pricing model.

The baseline process is: you create a scenario with a custom webhook as your trigger module (I’m assuming this is similar to your API endpoints). This is what the custom GPT Action will call/execute. Via the custom GPT Action’s OpenAPI schema, you pass the necessary variables to the webhook, which you can then use to update the Google Sheet. Once the update has been completed, you have a webhook response module which sends data back to the custom GPT to read/interpret. Generally this is a 200 response with a body text. For a simple Sheets update, it could be a message saying the update was completed successfully, and then the row number as “proof” of the data insertion.

But you can get really fancy if you want to. You can insert the data and then run another search module immediately, to return maybe the last month’s rows of data with your latest update. You can send that back as JSON, which ChatGPT loves, and then it has the returned data for you to simply ask analysis questions in natural language.
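
make.com is no-code, so there is no code to paste, but for illustration, here is roughly what my scenario does behind the scenes, sketched in Python with Flask and the gspread library; the sheet name and field names are mine and purely illustrative:

```python
# Illustration only: approximately what the make.com scenario does.
# Sketched with Flask + gspread; sheet and field names are illustrative.
import gspread
from flask import Flask, jsonify, request

app = Flask(__name__)
gc = gspread.service_account()  # uses a Google service-account key file
sheet = gc.open("Maintenance Log").sheet1

@app.route("/webhook", methods=["POST"])
def capture():
    data = request.get_json()  # variables passed by the custom GPT Action
    sheet.append_row([data["date"], data["item"], data.get("notes", "")])
    # Respond 200 with a body the GPT can read: confirmation, row count as
    # "proof", and recent rows as JSON for follow-up analysis questions.
    recent = sheet.get_all_records()[-30:]
    return jsonify({"status": "update completed",
                    "row": len(sheet.get_all_values()),
                    "recent": recent}), 200

if __name__ == "__main__":
    app.run(port=8080)
```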

I built the whole Actions schema for make.com by simply telling ChatGPT that I have a custom GPT that must call the specific webhook (and I paste it in), then told it what required or optional variables to pass, and it built me the schema to copy and paste into the Actions section.

I have the ideas and chat has the intelligence. Together it is poetry.

I am grateful for everything you’ve said, but I wasn’t in a position to fully appreciate it until just a few minutes ago. I have been a little slow to catch up with the state of the art here; I went searching for documentation on the Custom Actions that I’m working with and found a blog post saying they had been deprecated in 2024. I’m a little surprised it’s working at all for me, TBH. So I asked what replaced it and got Assistants API. Great. Give me the rundown on Assistants API, please.

“OK, if you are determined, but you should know that Assistants API has also been deprecated and is scheduled to shut down in August 2026.” What? “Responses API is the new thing now.”

So I did all this API work on integration scaffolding at least two generations old. Have they announced the successor to Responses yet? I’m getting too old for this :smiley: Off to find a good tutorial!

I’ve heard of Responses API… it’s the new chat completions; it has structured input/output for tools, or something. My advice is to buy 20 bucks worth of GPT time from OpenAI and do a hello world with a simple chat completions or Responses call. Either is fine. Just focus on a hello world to an OpenAI endpoint. Then from there, focus on learning how databases work; you will need to create a table to track your session.

It’s just my opinion, but your goal of memory management is a holy grail of coolness and deserves a fully custom orchestrator using Responses or chat completions. There are newer frameworks: you could use Foundry agents in Microsoft, or use LangChain for an open-source planner, but honestly those tools are kinda for cheesy agentic bots. You don’t need that. Just good old chat completions and a database to store your session. Then once you have a working orchestrator that is basically a rip-off of the GUI in ChatGPT, well, then you will have the skills and scaffolding to build your memory structures. I am just a student with this stuff too, as there is sooo much to learn. Good luck in your travels
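
Something like this is all the hello world needs, assuming the openai Python SDK; swap in whatever model you have access to:

```python
# Hello world against an OpenAI endpoint, in both flavors mentioned above.
# Assumes the openai Python SDK and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Responses API flavor:
resp = client.responses.create(model="gpt-4o-mini",
                               input="Say hello world.")
print(resp.output_text)

# Chat Completions flavor (either is fine):
chat = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello world."}])
print(chat.choices[0].message.content)
```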

2 Likes

I’ve heard about langchain… yes, I generated an API key yesterday and intend to stand something up today. I come from a technical background but this is just amateur experimentation for me. I’m chomping at the bit to automate my own job into oblivion but these things always move more slowly at big companies than I would prefer, and surprisingly, they don’t seem to want their own users doing the work.

I’ve been looking into the whole second brain thing for some time as I have a lot of scattered thoughts that I need to capture between calls, appointments, and driving. I’ve had long conversations with ChatGPT about vector db which is indeed the holy grail of thought (brain) dumping. But it is beyond my need and current abilities. And langchain is very closely related to the whole vector db thing.

I am not aware that Actions within the custom GPT environment have been deprecated. My understanding is that OpenAI’s Assistants API, which effectively acted as an AI agent, has been deprecated, and with it the custom actions attached to those assistants. But I have 23 individual Actions calling make.com webhooks to execute a myriad of tasks from within my custom GPT environment. And this is running on ChatGPT Go, which is the cheapest subscription plan.

I have integrated my custom GPT schema Actions with Google Docs, Google Sheets, Gmail, personal email, OneDrive, Google Drive, Trello, Telegram, and even Twilio for WhatsApp.

The beauty of the custom GPT solution is that you speak into it, correct the response, and then confirm the capture execution.

I’m not a computer scientist, but I am not seeing a cheap solution for a “truly” persistent, structured memory strategy. Even within the vector DB environment, one has to determine how far back the query must go to fetch data. And then you have to pass that through to OpenAI to analyze, which will cost you input and output tokens; even within the custom GPT environment, you need to get that data into ChatGPT, which requires a backend API solution, which requires a developer account that costs, at minimum, input tokens.

To be cost-efficient, what I am seeing is the need to segment what you already know needs a structured setup (Google Sheets or a DB), and then have your custom GPT instructions determine which Action to call based on the natural language that you use.

All the best

1 Like