I am thinking about building a debate moderator.
I notice in the Playground, when I ask a new question, the AI answers as if it remembers the previous questions that I asked.
For a debate moderator, where the context keeps getting updated as participants contribute to the conversation, how do I keep the thread going?
Is there some kind of session ID parameter when I call the API or is it something to do with files?
Any help is much appreciated! Thanks!
I suspected this was the case…
Thanks for the answer, happy holidays!
One idea might be to pick a cutoff point (say, after 20 interactions) and make a separate call to have the AI summarize the conversation so far. Then you use that short summary as part of the prompt for a brand new conversation. It’s a way to keep context while keeping the token count lower.
Haven’t done it, though
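For anyone who wants to try it, here is a rough, untested-against-a-real-model sketch of that idea in Python. The `summarize` callable is hypothetical: it stands in for whatever extra API call you use to condense text, and the class and method names are made up for illustration.

```python
from typing import Callable, List


class RollingContext:
    """Keeps recent turns verbatim and folds older ones into a short summary."""

    def __init__(self, summarize: Callable[[str], str], max_turns: int = 20):
        self.summarize = summarize  # hypothetical wrapper around the extra API call
        self.max_turns = max_turns  # cutoff, e.g. 20 interactions
        self.summary = ""
        self.turns: List[str] = []

    def add_turn(self, speaker: str, text: str) -> None:
        self.turns.append(f"{speaker}: {text}")
        if len(self.turns) >= self.max_turns:
            # One extra call: condense everything so far, then start fresh.
            self.summary = self.summarize(self.build_prompt())
            self.turns = []

    def build_prompt(self) -> str:
        """Old summary (if any) followed by the turns made since it was written."""
        parts: List[str] = []
        if self.summary:
            parts.append("Summary of the debate so far: " + self.summary)
        parts.extend(self.turns)
        return "\n".join(parts)
```

You would send `build_prompt()` plus the new question on every request. Each new summary is built from the previous summary plus the latest turns, so details will probably fade after several rounds of compression.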
This is actually a weakness of these NLP models: they’re good in short bursts but start to fall off the longer things go on. You basically have to keep reminding it of all the previous conversations in order to keep it in the loop as the conversation goes on.
As @m-a.schenk and @nunodonato said, it’s costly because the prompt will be super long, but summarizing what was said adds an additional call to the API. So the 1st call summarizes the previous conversation and the 2nd call gives you the output. Who knows how long that will work, but it’s worth giving it a try.
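To make the "keep reminding it" part concrete, here is a minimal sketch of rebuilding the prompt from the stored history on every call. The function name, the `Moderator:` cue, and the character budget are my own assumptions; a real setup would count tokens rather than characters.

```python
from typing import List, Tuple


def build_prompt(history: List[Tuple[str, str]],
                 new_message: Tuple[str, str],
                 max_chars: int = 4000) -> str:
    """Replay as much recent history as fits, then append the new message."""
    speaker, text = new_message
    tail = f"{speaker}: {text}\nModerator:"
    budget = max_chars - len(tail)  # crude stand-in for a token limit
    kept: List[str] = []
    # Walk backwards so the most recent turns survive when space runs out.
    for who, said in reversed(history):
        line = f"{who}: {said}\n"
        if len(line) > budget:
            break
        kept.append(line)
        budget -= len(line)
    return "".join(reversed(kept)) + tail
```

The model only ever sees what is in this string, so anything trimmed off the front is forgotten as far as it is concerned.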
I saw a video of a guy that did a live chat with multiple people and they all spoke to the bot through voice and it was able to remember people’s names and continue the conversation. The only theory people were able to come up with was the summarization of the conversations.
Hello Joett! Thanks for the valuable input!
I have been thinking about ways to minimize prompt length and API calls too…
I will probably have to do more stuff on the client side like somehow tagging at which point the conversation starts to branch into different topics/parallel conversations.
Another way may be to rely more on the users to detect branching/fallacies so they are the ones that request the API calls.
Hopefully as the cost goes down it will become feasible to call the API every time an input is detected, so the application can monitor the conversation in real time…
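For the client-side tagging idea, something like the sketch below is what I have in mind. It is only an illustration: the class, the topic labels, and the assumption that someone (the client or the users) supplies the topic tag are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


class ThreadedLog:
    """Stores each contribution under a topic tag chosen on the client side."""

    def __init__(self) -> None:
        self.threads: Dict[str, List[str]] = defaultdict(list)

    def add(self, topic: str, speaker: str, text: str) -> None:
        self.threads[topic].append(f"{speaker}: {text}")

    def context_for(self, topic: str) -> str:
        """Only the turns for one branch, to keep the prompt short."""
        return "\n".join(self.threads.get(topic, []))
```

When a user asks for moderation on one branch, only `context_for(topic)` would go into the prompt instead of the whole debate.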
Thanks for introducing me to the concept of embeddings, @m-a.schenk! Now there’s a lot of reading to do.
Context with a capital “C”, huh? A philosophical can of worms for dinner!
Here is an idea that I have tried.
You can see the full code in my git repository here: sample chat conversion notebook.