An AWS Lambda function receives a question from a user along with a relevant document. The Lambda makes a request to the OpenAI API with a prompt that uses the question and the document as context.
User request:
{ "question": "how is john feeling", "context": "the employees john...." }
Prompt:
Answer the question {question} given the relevant context.
context ###
{context}
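For reference, the handler is roughly the following (simplified; the model choice and response handling here are illustrative, not fixed):

```python
import json
import os

import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

PROMPT = (
    "Answer the question {question} given the relevant context.\n"
    "context ###\n"
    "{context}"
)


def lambda_handler(event, _context):
    body = json.loads(event["body"])
    prompt = PROMPT.format(question=body["question"], context=body["context"])
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    answer = resp["choices"][0]["message"]["content"]
    return {"statusCode": 200, "body": json.dumps({"answer": answer})}
```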
What I’m thinking:
The relevant context is retrieved from an SQL database.
I’d like to cache the context so it isn’t recreated on every request.
Ideally, some pattern of associating a session ID with each user and context, expiring within 15 minutes.
When the session expires, the data is deleted from the cache.
If the session ID has not expired, the Lambda gets the context from the cache; otherwise, it is retrieved from SQL and put in the cache for future requests (sketched below).
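Sketched out, that read-through flow could look like the following. DynamoDB with its TTL feature is just one way to get the 15-minute expiry, and fetch_context_from_sql stands in for the real SQL lookup:

```python
import time

import boto3

dynamodb = boto3.resource("dynamodb")
# Assumed table: partition key "session_id", TTL enabled on "expires_at".
table = dynamodb.Table("session-context-cache")
TTL_SECONDS = 15 * 60


def fetch_context_from_sql(session_id: str) -> str:
    """Placeholder for the real SQL query that builds the context."""
    raise NotImplementedError


def get_context(session_id: str) -> str:
    item = table.get_item(Key={"session_id": session_id}).get("Item")
    # DynamoDB TTL deletes lazily (can lag by hours), so check expiry on read too.
    if item and int(item["expires_at"]) > int(time.time()):
        return item["context"]
    context = fetch_context_from_sql(session_id)
    table.put_item(Item={
        "session_id": session_id,
        "context": context,
        "expires_at": int(time.time()) + TTL_SECONDS,
    })
    return context
```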
I’m wondering whether this pattern is recommended, whether there are any existing solutions similar to it, or whether there are any suggestions about going down this line of thought.
I don’t know if this pattern is ‘recommended’; I imagine it would depend on your use case, but I don’t see anything particularly wrong with it.
As far as suggestions, I highly recommend you check out the ChatGPT retrieval plugin (it’s on the OpenAI GitHub under chatgpt-retrieval-plugin); it should have some functions that are helpful for what you’re trying to do.
In my experience, the goal when prompting ChatGPT (beyond the explicit goal) is either to reduce the number of tokens in the message/reply, or to use cheaper models to their full extent (e.g., API calls to Ada are 1/10th the price of calls to Davinci). Here are several methods I know of that achieve these goals.
First, if the document can be broken into sections, you can have a cheap model (like Ada) select the relevant section for you, and then pass that section to a more capable model to answer the question.
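For example, one cheap way to do the selection step is with the ada embeddings model: embed each section and the question, then keep only the closest section. A rough sketch, assuming the document arrives already split into sections:

```python
import openai


def embed(texts):
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return [d["embedding"] for d in resp["data"]]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


def pick_section(question, sections):
    """Return the section most similar to the question; only this section
    (plus the question) then goes to the more capable answering model."""
    question_emb = embed([question])[0]
    section_embs = embed(sections)
    best = max(range(len(sections)), key=lambda i: cosine(question_emb, section_embs[i]))
    return sections[best]
```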
Second, if you have a structured question with predictable answers (yes/no, or ranking something), you can use embeddings, flow control, and constraints to get a cheaper model to behave in complex and consistent ways (LMQL is useful for this).
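LMQL gives you constraints natively, but as a bare-bones illustration of the same idea directly against the OpenAI API, you can restrict a cheap completion model to a fixed answer set with logit_bias (the model and prompt here are placeholders):

```python
import openai
import tiktoken

# r50k_base is the tokenizer used by the original GPT-3 models like text-ada-001.
enc = tiktoken.get_encoding("r50k_base")


def yes_no(statement: str, question: str) -> str:
    """Constrain a cheap model to answer only 'yes' or 'no' by strongly
    biasing those two tokens and sampling a single token."""
    # Leading spaces matter: after the colon, the model emits " yes" / " no".
    allowed = {str(enc.encode(tok)[0]): 100 for tok in (" yes", " no")}
    resp = openai.Completion.create(
        model="text-ada-001",
        prompt=f"{statement}\nQuestion: {question}\nAnswer (yes or no):",
        max_tokens=1,
        temperature=0,
        logit_bias=allowed,
    )
    return resp["choices"][0]["text"].strip()
```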
Finally, if you are going to have a conversational chatbot, you will need a way to send the message history to ChatGPT. One way to use fewer tokens in a chat like this is to have ChatGPT summarize the entire history every ‘X’ messages. ‘X’ will differ based on API token use over time in a conversation: conversations where more context is being added consistently will need a lot of summarization, whereas conversations that fit within ChatGPT’s context window will not need any summarization at all.
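A rough sketch of that summarize-and-replace loop (the token budget and summary prompt are arbitrary choices):

```python
import openai
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by gpt-3.5-turbo
TOKEN_BUDGET = 2000  # assumption: summarize once the history grows past this


def count_tokens(messages):
    # Rough count over content fields only; good enough to trigger summarization.
    return sum(len(enc.encode(m["content"])) for m in messages)


def compact_history(messages):
    """Replace a long chat history with a single summary message."""
    if count_tokens(messages) <= TOKEN_BUDGET:
        return messages
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "Summarize this conversation, keeping every fact "
                       f"needed to continue it:\n\n{transcript}",
        }],
    )
    summary = resp["choices"][0]["message"]["content"]
    return [{"role": "system", "content": f"Summary of the chat so far: {summary}"}]
```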
I hope I was able to be of some help, and good luck with your project!