I’m developing a Chatbot to support my business, as everyone seems to be. I know it’s a very new field, but I have some unanswered development questions that maybe people could shed some light on.
-
What is best practice for using few-shot learning with a chat model?
I’m using few-shot learning to teach the model my formatting, as well as some specific information that will aide it’s answering. However, this is also a chat model, so it includes chat history back to some point. I’m wondering if it is best practice to put the few-shot learning first or last, or somewhere in between? Does the ordering of the messages sent to GPT matter? And if so, what effect does it have (is there a recency bias to more recent messages?) -
What is common practice for chat history?
Obviously if I include chat history back forever it becomes prohibitively expensive. On the other hand, no chat history makes my product confusing and hard to use. I’m toying with the idea of including chat history back to a certain max number of tokens, and I’m also toying with the idea of using GPT to summarize the history, that way future messages will cost less tokens. The problem is I have no idea how to approach this problem, or how to figure out what the solution is.
Honestly, these are just the 2 questions I have at the moment, but I think my question is broader in the sense that I need to understand how to approach these kinds of problems. How can I logically make decisions? How can I test changes on a non-anecdotal scale to see if they are improving my product? How do I approach balancing the cost with product capabilities?
I think I’m used to making things that have industry standards and established ways of thinking. This is just so new to me, and so new to everyone, I’m wondering if anyone has any thoughts on this.