Chat Model Best Practices and Logical Approaches?

I’m developing a chatbot to support my business, as everyone seems to be. I know it’s a very new field, but I have some unanswered development questions that people could perhaps shed some light on.

  1. What is best practice for using few-shot learning with a chat model?
    I’m using few-shot learning to teach the model my formatting, as well as some specific information that will aid its answering. However, this is also a chat model, so the prompt includes chat history back to some point. Is it best practice to put the few-shot examples first, last, or somewhere in between? Does the ordering of the messages sent to GPT matter, and if so, what effect does it have (is there a recency bias toward more recent messages)?

  2. What is common practice for chat history?
    Obviously if I include chat history back forever it becomes prohibitively expensive. On the other hand, no chat history makes my product confusing and hard to use. I’m toying with the idea of including chat history back to a certain maximum number of tokens, and also with the idea of using GPT to summarize the history so that future messages cost fewer tokens. The problem is I have no idea how to approach this problem, or how to figure out what the solution is.
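The "max number of tokens" idea can be sketched in a few lines. `truncate_history` and the whitespace-splitting token counter below are illustrative names, not anything from a real library; a real counter would use the model's own tokenizer (e.g. `tiktoken` for OpenAI models):

```python
# Sketch: keep only the newest messages that fit a token budget.
def truncate_history(history, max_tokens, count_tokens):
    """Keep the newest messages whose combined token count fits the budget.

    history:      list of {"role": ..., "content": ...} dicts, oldest first.
    count_tokens: callable, text -> token count.
    """
    kept, total = [], 0
    for msg in reversed(history):           # walk newest -> oldest
        cost = count_tokens(msg["content"])
        if total + cost > max_tokens:
            break                           # budget exhausted; drop older turns
        kept.append(msg)
        total += cost
    return list(reversed(kept))             # restore oldest-first order
```

A crude stand-in counter such as `lambda text: len(text.split())` is enough to prototype the behavior before wiring in a real tokenizer.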

Honestly, these are just the 2 questions I have at the moment, but I think my question is broader in the sense that I need to understand how to approach these kinds of problems. How can I logically make decisions? How can I test changes on a non-anecdotal scale to see if they are improving my product? How do I approach balancing the cost with product capabilities?
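One common way to test changes on a non-anecdotal scale is a fixed evaluation set scored automatically, so every prompt or history change gets the same benchmark. This is only a minimal sketch; `evaluate`, `judge`, and the test-case shape are all illustrative, and the judge could be exact match, a human rubric, or another model grading answers:

```python
# Sketch: score a chatbot against a fixed test set.
def evaluate(chatbot, test_cases, judge):
    """Average the judge's score over a fixed test set.

    chatbot:    callable, input text -> response text.
    test_cases: list of {"input": ..., "expected": ...} dicts.
    judge:      callable (input, response, expected) -> score in [0, 1].
    """
    scores = [judge(case["input"], chatbot(case["input"]), case["expected"])
              for case in test_cases]
    return sum(scores) / len(scores)
```

Running this before and after each change gives a number to compare instead of a gut feeling, and the same harness can log token counts to weigh cost against quality.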

I think I’m used to making things that have industry standards and established ways of thinking. This is just so new to me, and so new to everyone, I’m wondering if anyone has any thoughts on this.


It’s a trade-off between quality/cost and how much elbow grease it’s worth.

It looks like your goal is to have a chatbot that reflects knowledge (data) you provide it, as well as an output format. These things are different enough that I would use embeddings (retrieval-augmented generation, RAG) for the knowledge, and a separate few-shot prompt to get the formatting right, or try the built-in function-calling capability. Either way, these are two different problems that need to be tackled, and once tackled, cascaded together to form your final answer (perhaps over a few API calls).
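The retrieval half of that can be sketched as cosine similarity over precomputed embedding vectors. The function names here are illustrative, and in practice the vectors would come from an embedding model rather than being hand-written:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_embedding, documents, k=3):
    """Return the k document texts whose embeddings best match the query.

    documents: list of (text, embedding) pairs, embeddings precomputed.
    """
    ranked = sorted(documents,
                    key=lambda doc: cosine(query_embedding, doc[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]
```

The top-scoring snippets then get pasted into the prompt as context, which is the "retrieval" in retrieval-augmented generation; at larger scale a vector database does the same ranking for you.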

For chat history, at least for me, it’s a mix of recent messages and related earlier history retrieved with embeddings. You decide the mix of recent/retrieved tokens that feeds the context for the model to work from, and leave enough room (tokens) for the expected maximum completion (answer) to come back.
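Budgeting that mix can be as simple as reserving room for the completion first and splitting what remains. `plan_budget` and the 60/40 default split are purely illustrative; the right ratio is one of the variables you tune:

```python
# Sketch: divide the context window between completion, recent turns,
# and embedding-retrieved history.
def plan_budget(context_window, max_completion, recent_fraction=0.6):
    """Return (recent_tokens, retrieved_tokens) for the prompt.

    Reserves max_completion tokens for the answer, then divides the rest
    between recent messages and retrieved earlier history.
    """
    prompt_budget = context_window - max_completion
    recent = int(prompt_budget * recent_fraction)
    retrieved = prompt_budget - recent
    return recent, retrieved
```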

So a few variables here, but nothing crazy.
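Cascaded together, the final prompt assembly might look like the sketch below. The ordering shown (few-shot examples early, retrieved knowledge next, live conversation last) is one common convention, not a rule, and `build_messages` is an illustrative name:

```python
def build_messages(system_prompt, few_shot_pairs, retrieved_context,
                   recent_history, user_input):
    """Assemble a chat-completion message list.

    few_shot_pairs:    list of (example_input, example_output) tuples.
    retrieved_context: knowledge snippets pulled in via embeddings.
    recent_history:    list of {"role": ..., "content": ...}, oldest first.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for example_in, example_out in few_shot_pairs:   # formatting examples first
        messages.append({"role": "user", "content": example_in})
        messages.append({"role": "assistant", "content": example_out})
    if retrieved_context:                            # knowledge as context
        messages.append({"role": "system",
                         "content": "Context:\n" + "\n".join(retrieved_context)})
    messages.extend(recent_history)                  # live conversation last
    messages.append({"role": "user", "content": user_input})
    return messages
```

Keeping the live conversation at the end plays to any recency effect, while the few-shot pairs still set the format up front.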


I think you may find these free, short courses by Andrew Ng and OpenAI staff very informative.