How do you maintain historical context in repeat API calls?

Each time I make a call to the API, it starts with no prior context, unlike the chat.openai.com scenario. Is there a way to maintain the model's state during a session?

import openai

# First call: ask for a joke with no prior context
response = openai.Completion.create(
  engine="text-davinci-003",
  prompt="Tell me a joke?",
  temperature=0.7,
  max_tokens=1000,
  top_p=1.0,
  frequency_penalty=0.0,
  presence_penalty=0.0
)

Output:
Q: Why did the mushroom go to the party?
A: Because he was a fungi!

# Second call: the API has no memory of the first request
response = openai.Completion.create(
  engine="text-davinci-003",
  prompt="What was the last question?",
  temperature=0.7,
  max_tokens=4000,
  top_p=1.0,
  frequency_penalty=0.0,
  presence_penalty=0.0
)

print(response['choices'][0]['text'])
# prediction_table and gpt_prompt are defined elsewhere in the poster's notebook
prediction_table.add_data(gpt_prompt, response['choices'][0]['text'])

Output:
Unfortunately, we do not have access to the original question.

11 Likes

How I would do it is to create a dataset and stack each prompt and output on top of the previous ones:

Session 1:
Prompt 1: text
GPT generates output

Session 2:
Prompt 1: text
Output 1: text
Prompt 2: text
GPT generates output

and so on… However, you need to keep the token count in mind: you can't keep adding prompts and outputs indefinitely. So you could write some code that forgets the oldest lines once you're almost out of tokens, as sketched below.
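
A minimal sketch of that idea, assuming the pre-v1 openai Python library; the 4-characters-per-token estimate and all names here are illustrative:

import openai

MAX_PROMPT_TOKENS = 3000     # leave headroom under the ~4k context limit
history = []                 # alternating "Prompt: ..." / "Output: ..." lines

def estimate_tokens(text):
    # Rough heuristic: about 4 characters per token for English text
    return len(text) // 4

def ask(prompt):
    history.append("Prompt: " + prompt)
    # Forget the oldest lines once we're almost out of tokens
    while estimate_tokens("\n".join(history)) > MAX_PROMPT_TOKENS:
        history.pop(0)
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt="\n".join(history) + "\nOutput:",
        temperature=0.7,
        max_tokens=1000,
    )
    output = response['choices'][0]['text'].strip()
    history.append("Output: " + output)
    return output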

That’s how I would do it, but maybe there are better methods.

2 Likes

This is a little constrained though, since we are limited by the model's token count; wouldn't growing the prompt across responses quickly hit this limit?

Would love to know if we can replicate the chat.openai.com experience via the API.

5 Likes

You need to include the previous interactions with your prompt.

But you are correct that you will eventually run into a token limit.

I have written a blog post about giving chatbots recent and long-term memory. I've tried to step through the process in easy-to-follow steps.

https://thoughtblogger.com/continuing-a-conversation-with-a-chatbot-using-gpt/

7 Likes

Thanks!

I’m assuming they will release an API that allows you to maintain state during a session soon, but I appreciate the reference.

4 Likes

That would be nice, and save bandwidth.

If a dev passes by here, I'd suggest adding a setting to the API call that lets us set a maximum number of tokens to store on the backend for historical context. This could be associated with a sessionId and a max-age variable, after which the context is automatically deleted.

We would also need list/delete-session API calls, if it's possible to use a max-age of 0.

3 Likes

That's weird, because I asked ChatGPT how to make a request with the context of the previous response (I'm making a chatbot front end for text completions), and it mentioned that you can add a context value to the request. I haven't gotten to try it out yet, but I wouldn't be surprised if it got something wrong, considering it only has information from 2021 and earlier.

2 Likes

I had the same issue and implemented a similar strategy. However, I'm only saving and including the most recent response from ChatGPT in my next request, rather than the entire conversation. This isn't perfect, but it does a decent job of tracking what you are talking about, and it saves tokens. For example, if you say "Tell me the best time to visit Florida" and it answers, then you say "How about Maine?", it will know you are asking about the best time to visit, because that is in the previous response.
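
A rough sketch of that variant, again assuming the pre-v1 openai library (names are illustrative):

import openai

last_response = ""

def ask(prompt):
    global last_response
    # Prepend only the most recent answer instead of the whole conversation
    full_prompt = (last_response + "\n" if last_response else "") + prompt
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=full_prompt,
        temperature=0.7,
        max_tokens=1000,
    )
    last_response = response['choices'][0]['text'].strip()
    return last_response

ask("Tell me the best time to visit Florida")
ask("How about Maine?")  # the Florida answer supplies the missing topic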

I’m hoping for a better solution soon because the token cost will add up quickly.

2 Likes

In my use case, I have to provide quite long texts upfront so the bot can learn the context and reply to the user properly. After the first interaction I can't resend all those tokens, because the conversation already exceeds the model's 4,000-token limit. I'm stuck on this issue.
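
One workaround I'm considering (a sketch only, not something I've verified): spend a single completion call compressing the long upfront text into a shorter summary, then prepend that summary on every turn instead of the full text. Here my_long_reference_text is a hypothetical placeholder:

import openai

def summarize(long_text, max_summary_tokens=500):
    # One-off call that trades detail for a reusable, much cheaper context
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt="Summarize the following text, keeping every fact needed "
               "to answer user questions about it:\n\n" + long_text,
        temperature=0.0,
        max_tokens=max_summary_tokens,
    )
    return response['choices'][0]['text'].strip()

context_summary = summarize(my_long_reference_text)  # placeholder variable

Whether a summary preserves enough detail is use-case dependent, but it at least gets back under the token limit.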

1 Like

Right now, with the ChatGPT APIs not supporting any form of "session", I was forced to send some "context" on every query. However, LangChain has nice support for summarizing prior prompts: ConversationSummaryBufferMemory.
So you don't have to struggle to prune your prior questions down to 2k or 4k tokens; the summary can stay within the limit, and LangChain ends up summarizing conversations and sending those summaries as "context".
Just another mechanism which may help you.
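
For reference, a sketch of that setup, assuming the classic LangChain memory API (class and parameter names may have changed in newer releases):

from langchain.llms import OpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationSummaryBufferMemory

llm = OpenAI(temperature=0.7)
# Keeps recent turns verbatim and summarizes older ones once the
# buffer grows past max_token_limit
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=1000)
conversation = ConversationChain(llm=llm, memory=memory)

conversation.predict(input="Tell me a joke?")
conversation.predict(input="What was the last question?")  # answered from memory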

2 Likes

Stand by for the ChatGPT API.

My best guess is that the ChatGPT API will offer session management.

HTH

2 Likes

Funny, ChatGPT gave me the same advice. It does not work, as the API only allows the documented properties in a request object.

2 Likes

Thanks. It's inevitable, I guess. Many are brought to this service because of ChatGPT, only to realize that the API seems not to be as intelligent as the chat itself, the absence of a session probably being one of the reasons.

3 Likes

It depends on how much historic “context” you require.

I asked ChatGPT and it responded with what we have suspected: we must concatenate previous responses into subsequent requests.

As you mentioned, OpenAI’s GPT-3 API does not currently support sessions, so it cannot maintain state or context between API calls. To maintain historical context in repeat API calls, you can include a summary of previous interactions as context in your subsequent API calls. This can be done by concatenating all the previous outputs and using them as the “prompt” in your next API call.
For example, in your code, you could create a variable to store the conversation history, and concatenate the output of each API call to that variable before making the next API call:

conversation_history = ""

response = openai.Completion.create(
  engine="text-davinci-003",
  prompt="Tell me a joke?",
  temperature=0.7,
  max_tokens=1000,
  top_p=1.0,
  frequency_penalty=0.0,
  presence_penalty=0.0
)
conversation_history += response['choices'][0]['text']

response = openai.Completion.create(
  engine="text-davinci-003",
  prompt="What was the last question? " + conversation_history,
  temperature=0.7,
  max_tokens=4000,
  top_p=1.0,
  frequency_penalty=0.0,
  presence_penalty=0.0
)
print(response['choices'][0]['text'])
prediction_table.add_data(gpt_prompt, response['choices'][0]['text'])

In this example, the variable conversation_history stores the previous output and is concatenated with the prompt in the next API call to maintain the historical context of the conversation.

Alternatively, embeddings are an option, but they require a backend server/service such as Redis or Pinecone.
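
A minimal in-memory sketch of the embedding idea, assuming the pre-v1 openai.Embedding endpoint (a real app would persist the vectors, e.g. in Redis or Pinecone):

import numpy as np
import openai

turns = []  # list of (text, embedding) pairs

def embed(text):
    result = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(result['data'][0]['embedding'])

def remember(text):
    turns.append((text, embed(text)))

def recall(query, k=3):
    # Dot product equals cosine similarity here: ada-002 vectors are unit length
    q = embed(query)
    ranked = sorted(turns, key=lambda t: float(np.dot(t[1], q)), reverse=True)
    return [text for text, _ in ranked[:k]]

# Prepend "\n".join(recall(new_question)) to the prompt instead of the whole history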

1 Like

To the original question of maintaining historical context/session: I'm not sure whether ChatGPT uses any form of server-side (or API-level) session management yet.

Looking at the network payload, the ChatGPT client appears to send the previous interactions as part of each new question's payload.

So the solution, as previously pointed out, is (for now) to prepend the entire conversation when making the new request, at the cost of tokens.

1 Like

Same here: ChatGPT suggests using a "state" property, but this is not allowed.

1 Like

This is just ChatGPT hallucinating information I’m afraid.

3 Likes

Hi @adriaanbalt

Just to correct your statement so we are technically accurate: Redis and Pinecone are not required for the tasks described, which can easily be accomplished with almost any SQL database.

Redis is useful, and very helpful, but it is not an absolute requirement, as you mentioned.

It's probably best to filter out words and characters which carry little information content (value), to save a few tokens here and there.

LOL. You should be careful referencing ChatGPT for technical guidance. ChatGPT is a text-prediction, auto-completion AI, not an expert-system AI. More often than not, ChatGPT will "cobble up something" which is not fully accurate in order to generate a completion.

Oh, when will they ever learn? Oh, when will they ever learn? - Peter, Paul and Mary (1955)

Note:

The OpenAI API is just an API which provides access to the OpenAI endpoints. It's not a full-blown chatbot application, so to maintain historical context you should use a database to store the prompts and completions. How you implement your database, filter and summarize prompts and completions, and feed historical information back into a completion call will depend on your use case and requirements.
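
For example, a bare-bones version using SQLite from the Python standard library (the schema and session handling here are just a sketch):

import sqlite3

db = sqlite3.connect("conversations.db")
db.execute("""CREATE TABLE IF NOT EXISTS turns (
    session_id TEXT,
    prompt     TEXT,
    completion TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)""")

def save_turn(session_id, prompt, completion):
    db.execute("INSERT INTO turns (session_id, prompt, completion) VALUES (?, ?, ?)",
               (session_id, prompt, completion))
    db.commit()

def load_history(session_id, limit=10):
    # Most recent turns, with the oldest implicitly forgotten via the limit
    rows = db.execute(
        "SELECT prompt, completion FROM turns WHERE session_id = ? "
        "ORDER BY created_at DESC LIMIT ?", (session_id, limit)).fetchall()
    return list(reversed(rows))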

HTH

1 Like

How can we do this? Can you give some examples?

2 Likes

You can literally ask ChatGPT to create a proxy API in PHP that writes the JSON node to a file and appends it to the prompt. It requires a proxy API page that takes the data and forwards it along. Took me 4 hours to make, and my bot now even thinks it has a brain. Did the same with the new model in 30 minutes. I'll post my code to git.

2 Likes