I am working on cleaning up our bot implementation and would like some advice on best practice.
On the table are two styles:
(1) multiple "assistant" / "user" exchanges
(2) a single "user" message that covers everything.
An example of (1) would be:
user: "CONTEXT: search results for cars are: …"
user: "sam: how are you today"
assistant: "I am fine"
user: "cam: search for cars please"
assistant: "there are 17 cars here"
user: "who am I?"
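For concreteness, a rough sketch of how (1) maps onto a chat-completions call (assuming the pre-1.0 openai Python package; model name and content are placeholders):

```python
# Style (1): context and each turn of history sent as separate chat messages.
# Assumes OPENAI_API_KEY is set in the environment; model name is a placeholder.
import openai

messages = [
    {"role": "user", "content": "CONTEXT: search results for cars are: ..."},
    {"role": "user", "content": "sam: how are you today"},
    {"role": "assistant", "content": "I am fine"},
    {"role": "user", "content": "cam: search for cars please"},
    {"role": "assistant", "content": "there are 17 cars here"},
    {"role": "user", "content": "who am I?"},
]

response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(response["choices"][0]["message"]["content"])
```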
An example of (2) would be a single user message:
CONTEXT: search results for cars are: …
Conversation history is:
sam: how are you today
AI: I am fine
cam: search for cars please
AI: there are 17 cars here
sam: who am I?
AI:
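And the same call for (2), with everything packed into one user message (same assumptions as the sketch above):

```python
# Style (2): context plus the whole transcript glued into a single user message.
import openai

prompt = """CONTEXT: search results for cars are: ...

Conversation history is:
sam: how are you today
AI: I am fine
cam: search for cars please
AI: there are 17 cars here
sam: who am I?
AI:"""

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(response["choices"][0]["message"]["content"])
```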
(2) feels a bit like cheating, since we are using davinci-style completion techniques for prompting in chat. However, I find (1) is a bit limiting given:
The "name" field only appears to work in GPT-4 (see the sketch after this list)
More tokens are used in (1), since every "user" and "assistant" message adds its own overhead to the count, as far as I know
Context is too "loose" in (1): it feels like something we mentioned way earlier, and from experiments, gluing context into the middle just destroys the conversation flow
LangChain appears to use (2), so there is precedent here
Claude only does (2) anyway, so it reduces the amount of code one needs to carry when integrating with multiple LLMs.
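To show what I mean by the "name" point, here is a minimal sketch that attaches speaker names via the optional "name" field on chat messages instead of prefixing them inside the content (as noted above, only GPT-4 appears to respect it):

```python
# Speaker names via the optional "name" field on chat messages, instead of
# "sam: ..." prefixes inside the content. Only GPT-4 appears to respect it.
messages = [
    {"role": "user", "name": "sam", "content": "how are you today"},
    {"role": "assistant", "content": "I am fine"},
    {"role": "user", "name": "cam", "content": "search for cars please"},
    {"role": "assistant", "content": "there are 17 cars here"},
    {"role": "user", "name": "sam", "content": "who am I?"},
]
```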
I use multiple exchanges as a way to "fine-tune" the bot, for example to control the tone of the response. If you're giving context, which may include a conversation history, the second technique may work better.
You can also use multiple exchanges to guide it away from certain answers, e.g. "user: no, that doesn't work because X. Give an answer with Y."
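Roughly, that looks like feeding the rejected answer back in with a corrective user turn and asking again; a sketch under the same pre-1.0 openai assumptions, with placeholder prompt text:

```python
# Steering via extra exchanges: append the rejected answer plus a corrective
# user turn, then ask again. Prompt text is a placeholder.
import openai

messages = [{"role": "user", "content": "cam: search for cars please"}]
first = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
messages += [
    {"role": "assistant", "content": first["choices"][0]["message"]["content"]},
    {"role": "user", "content": "no, that doesn't work because X. Give an answer with Y."},
]
second = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(second["choices"][0]["message"]["content"])
```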
I am seeing that replies (at least for GPT-3.5) are far more detailed when the prompt is multi-turn (user / assistant / user / assistant vs. a single user message).
I wonder if prompt engineering can circumvent this.
These days I get away with a single message and no example exchanges (all the examples live in the system prompt); the later versions of GPT-3.5 and GPT-4 are a bit more forgiving than in the early days.
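Something along these lines, where the system-prompt wording is just a hypothetical illustration:

```python
# Single user message, with the few-shot examples living in the system prompt.
import openai

system_prompt = """You are the search bot for our car listings.

Example:
user: search for cars please
AI: there are 17 cars here"""

user_prompt = """CONTEXT: search results for cars are: ...

Conversation history is:
sam: how are you today
AI: I am fine
sam: who am I?
AI:"""

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ],
)
print(response["choices"][0]["message"]["content"])
```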