It depends on how much historical context you require.
I asked ChatGPT and it confirmed what we suspected: we must concatenate previous responses into subsequent requests.
As you mentioned, OpenAI’s GPT-3 API does not currently support sessions, so it cannot maintain state or context between API calls. To maintain historical context in repeat API calls, you can include a summary of previous interactions as context in your subsequent API calls. This can be done by concatenating all the previous outputs and using them as the “prompt” in your next API call.
For example, in your code, you could create a variable to store the conversation history, and concatenate the output of each API call to that variable before making the next API call:
```python
import openai  # legacy (0.x) openai client, as used in this thread

conversation_history = ""

response = openai.Completion.create(
    engine="text-davinci-003",
    prompt="Tell me a joke?",
    temperature=0.7,
    max_tokens=1000,
    top_p=1.0,
    frequency_penalty=0.0,
    presence_penalty=0.0,
)
conversation_history += response["choices"][0]["text"]

response = openai.Completion.create(
    engine="text-davinci-003",
    prompt="What was the last question? " + conversation_history,
    temperature=0.7,
    max_tokens=1000,  # prompt + completion must fit the model's ~4k-token context
    top_p=1.0,
    frequency_penalty=0.0,
    presence_penalty=0.0,
)
print(response["choices"][0]["text"])
# prediction_table and gpt_prompt come from the poster's own logging code,
# defined elsewhere:
prediction_table.add_data(gpt_prompt, response["choices"][0]["text"])
```
In this example, the variable `conversation_history` stores the previous output and is concatenated into the prompt of the next API call to maintain the historical context of the conversation.
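Because every turn re-sends the whole history, the prompt eventually outgrows the model's context window. One way to make the pattern reusable is a small helper that trims old turns and threads the history through each call. This is only a sketch: the function names and the 8000-character budget are my assumptions, and `complete` stands in for any callable that wraps `openai.Completion.create` as shown above.

```python
def trim_history(history: str, max_chars: int = 8000) -> str:
    """Keep only the most recent part of the conversation so the prompt
    stays under the model's context limit (8000 chars is a rough stand-in
    for ~4k tokens; tune it for your model)."""
    return history[-max_chars:]


def ask(question: str, history: str, complete) -> tuple[str, str]:
    """Send `question` along with prior context.

    `complete` is any callable prompt -> text, e.g. a thin wrapper around
    openai.Completion.create. Returns (answer, updated_history).
    """
    prompt = trim_history(history) + "\n" + question
    answer = complete(prompt)
    return answer, history + "\nQ: " + question + "\nA: " + answer
```

In the snippet above, `complete` would be something like `lambda p: openai.Completion.create(engine="text-davinci-003", prompt=p, max_tokens=1000)["choices"][0]["text"]`.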
Alternatively, embeddings are an option, but they require a backend vector store such as Redis or Pinecone.
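The embedding approach stores past exchanges as vectors and, on each new question, retrieves only the most relevant ones rather than the full transcript. A minimal sketch of the retrieval step, assuming the vectors were produced by an embeddings model such as `openai.Embedding.create(model="text-embedding-ada-002", input=text)` (the helper names here are hypothetical):

```python
import math


def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_relevant(query_vec, stored, top_k=1):
    # stored: list of (text, embedding_vector) pairs; in production these
    # would live in Redis or Pinecone, but a plain list works for small
    # histories.
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]
```

The retrieved snippets are then prepended to the prompt in place of the full conversation history, which keeps the prompt short no matter how long the conversation gets.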