I hope everyone is doing great. We are currently developing an automated ordering chatbot with the OpenAI APIs, using Java libraries, and we are facing a weird issue. We need our chatbot to be context-aware, but we call the API every time we get a message from a user, and in that case ChatGPT is not able to provide context-aware responses. Hence, we are storing the context on our end in a DB and sending it with every API call. However, we are having issues with this method: we get completely different responses even for the same user messages. We are also using OpenAI functions for our chatbot, and because of the above issue we often do not get proper parameters either. Is there any way we can have a context-aware chatbot with the ChatGPT APIs? Please note that the API calls come from different application instances: our Java webhook is invoked every time we get a new message, and the webhook in turn calls these APIs.
Looking forward to some help regarding this issue.
P.S. We are using the OpenAI Chat Completions API for our use case.
First, regarding outputs that are not reproducible or reliable: you can use the API's sampling parameters to steer response generation toward higher-certainty answers:
"top_p": 0.5, "temperature": 0.5, ...
A top_p of 0.5 samples only from the tokens that make up the top half of the probability mass, and a lower temperature gives additional weight to the more likely tokens.
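For illustration, here is a minimal sketch of such a call in plain Java (java.net.http, no SDK); the model name, the prompt, and the OPENAI_API_KEY environment variable are assumptions you would adapt to your setup:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ChatCompletionExample {
    public static void main(String[] args) throws Exception {
        // Request body sets both sampling parameters; the model name is an
        // assumption -- use whichever chat model you already call.
        String body = """
            {
              "model": "gpt-3.5-turbo",
              "temperature": 0.5,
              "top_p": 0.5,
              "messages": [
                {"role": "system", "content": "You are a product ordering assistant for x company."},
                {"role": "user", "content": "How much does a widget cost?"}
              ]
            }""";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.openai.com/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + System.getenv("OPENAI_API_KEY"))
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```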
Then, regarding the AI not understanding the input as expected:
The AI model is stateless. Every input is independent.
You should leverage the messages format to the fullest, providing both a history of past user inputs and AI responses, and also injecting the retrieved knowledge in a way the AI will actually employ. Here is an example (a Java sketch of assembling this list in code follows it):
system: You are a product ordering assistant for x company.
user: (what was previously said)
assistant: (the previous reply)
assistant: "Note to self: here’s information automatically retrieved from the knowledge base to help in answering the latest question: (injection)
user: How much does a widget cost?
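As a minimal sketch of assembling that message list in Java, assuming you load the stored turns from your DB on each webhook call (the ChatMessage record and buildMessages method are hypothetical names, not part of any SDK):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record type for one chat message; roles are the
// standard "system" / "user" / "assistant" strings.
record ChatMessage(String role, String content) {}

public class PromptBuilder {

    // Rebuild the full messages array for one API call. `history` is the
    // stored transcript for this session (loaded from your DB), and
    // `retrieved` is whatever your knowledge-base lookup returned.
    static List<ChatMessage> buildMessages(List<ChatMessage> history,
                                           String retrieved,
                                           String newUserMessage) {
        List<ChatMessage> messages = new ArrayList<>();
        messages.add(new ChatMessage("system",
                "You are a product ordering assistant for x company."));
        messages.addAll(history); // prior user/assistant turns, in order
        if (retrieved != null && !retrieved.isBlank()) {
            messages.add(new ChatMessage("assistant",
                    "Note to self: here is information automatically retrieved "
                    + "from the knowledge base to help in answering the latest "
                    + "question: " + retrieved));
        }
        messages.add(new ChatMessage("user", newUserMessage));
        return messages;
    }
}
```

Because the model is stateless, this rebuild has to happen on every call; it does not matter that each call comes from a different application instance, as long as every instance reads the same stored history for the session.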
There are two ways to maintain conversation context. Both involve temporarily storing the chat history for each session.
1. Use a standalone question. https://youtu.be/B5B4fF95J9s?si=U8t89ts_SdY6-uTZ Essentially, you use the chat history and the new question to form a standalone question which, by definition, includes the context of the conversation (a sketch of this step follows the list).
2. Use the chat history. This takes up more tokens and thus costs more, but you can send the chat history along with each new question so the LLM always knows exactly the nature of the conversation. With the larger context windows now available, this route is far more feasible than it was in the past.
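A minimal sketch of the standalone-question step in Java, assuming a first model call rewrites the latest message into a self-contained question before your normal call answers it (condenseQuestion, callChatApi, and the prompt wording are hypothetical names of my own, not an official recipe):

```java
import java.util.List;

public class StandaloneQuestion {

    // Condense the chat history plus the latest user message into one
    // self-contained question. Each history entry is a pre-formatted
    // "role: content" line from your session store.
    static String condenseQuestion(List<String> historyLines, String newQuestion) {
        String prompt = "Given the conversation below, rewrite the final user "
                + "question as a single standalone question that needs no other "
                + "context to be understood.\n\n"
                + String.join("\n", historyLines)
                + "\nuser: " + newQuestion;
        return callChatApi(prompt);
    }

    // Placeholder: send `prompt` as a single user message through whatever
    // client code your webhook already uses, and return the model's text.
    static String callChatApi(String prompt) {
        throw new UnsupportedOperationException("wire up to your existing API client");
    }
}
```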
It sounds like you’ve been storing the chat history, but aren’t retrieving it and inserting it into the prompt in a way that is working?