I would like to create an “AI Companion” SaaS. It should work like ChatGPT, using the GPT-3 API, but with long-term memory, so it can remember all past conversations.
In order to achieve that, I guess (not sure) we should store all the data from the conversations, reprocess it, and store it in a way that when the prompt asks about something we talked about, let’s say, 6 months ago, it could look it up in the DB and respond accordingly. All seamlessly in the background, without the user having to do anything. I’ve tried several SaaS products, but none of them works like that.
Now, I’m in the marketing industry. I could help with the marketing, the branding, and the investment, but I need a developer partner.
Please correct me if I am mistaken, but it appears that this will require a significant number of tokens. It is worth mentioning that ChatGPT has a token limit, which may restrict the feasibility of this approach.
You could store the entire conversation history with the bot locally, yes. You could use some form of classification to retrieve relevant parts of past conversations and pre-process them before sending them off to the API for response generation. This should reduce the number of tokens required, hypothetically speaking of course.
First, store the message data using the Embeddings API to represent each message as a vector, which makes it easy to find relevant content later. The data could be embedded in various ways, mostly varying how many messages (back-and-forth between the user and AI) are stored per embedding.
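A minimal sketch of that storage-and-search step. The `embed` function here is a toy bag-of-words stand-in for the Embeddings API (e.g. an `openai` embeddings call) so the example runs offline; `MemoryStore` and the one-message-per-embedding chunking are assumptions for illustration, not an established design:

```python
# Sketch of a message store searched by embedding similarity.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Placeholder: swap in a real Embeddings API call here.
    # A toy bag-of-words vector keeps this example self-contained.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def add(self, text: str):
        # In a real system you might embed a chunk of several
        # back-and-forth messages per vector instead of one.
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 3):
        # Return the k stored texts most similar to the query.
        qv = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

With a real embeddings model the store would hold float vectors (or live in a vector DB), but the add/search shape stays the same.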
Hypothetical document embeddings (HyDE) could be used to look up the relevant context for future responses. Basically, you instruct GPT-3 to generate hypothetical messages that would be relevant to your conversation if they had happened. Then you use those hypothetical messages to find the most similar content in the embeddings system.
Lastly, you feed the responses from the embedding system into a prompt with the original message from the user to generate a response using context retrieved via HyDE.
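The loop described above could be sketched like this. `generate` and `retrieve` are offline placeholders for the GPT-3 completion call and the embeddings search; all names and the canned return text are illustrative assumptions:

```python
# Sketch of the HyDE lookup-and-answer loop.
def generate(prompt: str) -> str:
    # Placeholder for a GPT-3 completion call; returns canned text
    # so the example runs offline.
    return "hypothetical past message about the user's dog Rex"

def retrieve(query: str, memories: list[str]) -> list[str]:
    # Placeholder for the embeddings search: naive word overlap
    # instead of real vector similarity.
    qwords = set(query.lower().split())
    ranked = sorted(memories,
                    key=lambda m: len(qwords & set(m.lower().split())),
                    reverse=True)
    return ranked[:2]

def build_prompt_with_hyde(user_message: str, memories: list[str]) -> str:
    # 1. Ask the LLM to invent a message that *would* be relevant.
    hypothetical = generate(f"Write a past chat message relevant to: {user_message}")
    # 2. Use the hypothetical text, not the raw question, as the search query.
    context = retrieve(hypothetical, memories)
    # 3. Combine retrieved context with the original message for the
    #    final response-generation prompt.
    return "Context:\n" + "\n".join(context) + f"\n\nUser: {user_message}\nAI:"
```

The final prompt would then go back to GPT-3 for the actual reply.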
In your HyDE model, you have a step where the LLM answers the question, and that answer is then used for the search.
If you have a corpus of information, are you proposing that GPT takes the question and answers it from its general knowledge? Then you embed that answer and use it to search over your corpus? Then you ask the same question again, but this time with the context from your corpus?
Related question: When it comes to chat, is the “corpus” the chat history?
What if GPT doesn’t have any knowledge of the question, e.g. it is about something GPT has never heard of?
Are you able to explain the four steps of HyDE for me (in relation to a chat bot with history)?
The LLM not knowing shouldn’t be an issue: it will just hallucinate an answer, and the hallucinations are harmless because they’re only used as a search query and get filtered by the embeddings system. The bigger issue would be a lack of knowledge in the embeddings corpus itself.
Here is the original paper. It doesn’t specifically reference chat logs, but the lookup method is the same. Get the LLM to generate what’s likely to be closest in content and form to the embedded corpus.
Then you continue the conversation by prompting the LLM to create a response, this time with the context returned by the embeddings system included in the prompt.
Locally or in the cloud. The key is not sending all the data to OpenAI every time, just the “memories” needed. Maybe we should search the DB first, retrieving the needed memories using a service like https://vectara.com/ or similar.
I’m building my own product, which overlaps a little with yours.
Inevitably, the idea of users having their own companion came to my mind as well, but I left it for the future, if I ever have the time and resources for it.
The AI companion could be more.
In the past I’ve built some small robots, on wheels though.
I could see in the future the possibility to give ChatGPT-like knowledge to a small robot.
Of course, memory would be critical for such a companion.
From a technical point of view, I know the solution and I can put it into practice.
However, it requires a lot of testing. It’s a long term project, if you ask me.
Right now, even if completed, it would also be limited by the state of the technology and by costs.
But it is doable, definitely.
Well, I have an idea based on something I’ve wondered about…
It is not to give it a long-term history, but to give it parameters to work with. If done properly, the AI does not need long-term memory; a decent amount of context is enough.
Naturally this depends on what you actually want. It still can’t remember conversations beyond a certain point, but through parameters it can adjust itself to behave in certain manners.
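A rough sketch of the parameters idea, assuming the parameters are rendered into a system prompt that is sent with every request; the persona fields and values here are invented for illustration:

```python
# Sketch: a companion defined by fixed parameters instead of
# long-term memory. The parameters are rendered into a system
# prompt sent with every request, so the model behaves
# consistently without recalling past conversations.
persona = {
    "name": "Aria",                      # illustrative values only
    "tone": "warm and curious",
    "interests": ["astronomy", "cooking"],
}

def system_prompt(p: dict) -> str:
    # Turn the parameter dict into the instruction text that would
    # be prepended to every conversation turn.
    return (
        f"You are {p['name']}, an AI companion. "
        f"Your tone is {p['tone']}. "
        f"You enjoy talking about {', '.join(p['interests'])}."
    )
```

Changing the companion’s behaviour then means editing the parameters, not reprocessing any history.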