Apologies for the long post ahead; I hope you can bear with me as I am trying to learn.
Background: My task is getting GPT to write a marketing email for our customers. The email's content and writing style need to be tailored based on:
- content (e.g., types of products we want to promote)
- customer segment (e.g., frequent buyer, lapsed, etc)
- time of the year (e.g., Black Friday)
So far, I have used GPT-4 Turbo in a RAG + few-shot learning approach, prompting it with example emails (we have a lot of historical emails that we can call 'ground truth'), to get decent results. Let's say this series of prompts and intermediary outputs forms the 'dialogue'. However, one problem I still have is that the writing style of the email sometimes appears superficial. For example:
- Emails often open with 'I hope this email finds you well…', but I want something like 'Hey X…' or 'Clock is ticking!'
- Overly formal language like 'imagine/envision…', where I want 'Just think about…' or 'Do you know…'
To fix these issues, I have already tried extra prompting, such as 'Look for phrases like X, Y, Z and replace them with something informal…', but with little success.
Fine-tuning: So my thought is to fine-tune my own model that can write emails in the style I want. I did some research on this forum and found discussions like this, this, this, and this. I tried something that failed, and I suspect I did it wrong, so here I will explain in more detail.
Initially, I thought I could fine-tune GPT-3.5 so that it learns the 'writing style' while retaining its ability as a general-purpose chatbot, but this didn't work. What I did is similar to this thread, i.e., pairs of previous/next sentences, in the hope that the fine-tuned model would learn the writing style. My training data look like:
{"messages": [{"role": "system", "content": ""}, {"role": "user", "content": "sentence1 from example emails"}, {"role": "assistant", "content": "sentence2 from example emails"}]}
{"messages": [{"role": "system", "content": ""}, {"role": "user", "content": "sentence2 from example emails"}, {"role": "assistant", "content": "sentence3 from example emails"}]}
...
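For reference, here is roughly how I built that JSONL (a minimal sketch; the hard-coded `emails` list and the naive regex sentence splitter are placeholders for my real data pipeline):

```python
import json
import re

# Placeholder: in reality these are loaded from our historical email corpus.
emails = ["First sentence of an email. Second sentence. Third sentence."]

with open("train.jsonl", "w") as f:
    for email in emails:
        # Naive sentence split; a real pipeline would use a proper tokenizer.
        sentences = re.split(r"(?<=[.!?])\s+", email.strip())
        # Emit one training example per consecutive sentence pair.
        for prev, nxt in zip(sentences, sentences[1:]):
            record = {
                "messages": [
                    {"role": "system", "content": ""},
                    {"role": "user", "content": prev},
                    {"role": "assistant", "content": nxt},
                ]
            }
            f.write(json.dumps(record) + "\n")
```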
But it appears that all the model has learned is to 'map' my input text to another text, and it has also lost its general chatbot ability. For example, if I say 'Please write a marketing email for mobile phones', it does not actually do that but produces another sentence that looks random and makes no sense.
So my Question 1 is: fine-tuning will cause the model to lose its general chatbot nature, so that it can only do one thing, namely map inputs to outputs as in the training data. Is this correct?
What next: Building on this, I am rethinking my approach, and I have two ideas.
First, following this thread, I think I could fine-tune using this setup:
{"messages": [{"role": "system", "content": ""}, {"role": "user", "content": "email with a neutral tone/style"}, {"role": "assistant", "content": "the ground truth email"}]}
Here, the 'user' content would be the initial email in a neutral tone, and the 'assistant' content would be the final email in the desired tone. To create the neutral-tone email, I could use an idea like the text neutraliser. Then I could approach my task in a 'hybrid' manner: RAG + few-shot learning on GPT-4 Turbo, using the dialogue to get an output email (the email with a neutral tone/style), then asking the fine-tuned model above to revise its style to produce the final output email.
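To make the hybrid idea concrete, here is a minimal sketch of the two-stage pipeline, assuming the openai>=1.0 Python SDK; the stage-1 messages, the fine-tuned model id, and the instruction wording are all placeholders, not what I actually have:

```python
from openai import OpenAI

client = OpenAI()

# Stage 1: RAG + few-shot on GPT-4 Turbo produces the neutral-tone draft.
# These messages stand in for my real 'dialogue' (prompts + retrieved examples).
draft = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {"role": "system", "content": "You write marketing emails."},
        {"role": "user", "content": "Write a marketing email for mobile phones."},
    ],
).choices[0].message.content

# Stage 2: the fine-tuned model restyles the neutral draft.
final = client.chat.completions.create(
    model="ft:gpt-3.5-turbo-0125:my-org::abc123",  # placeholder model id
    messages=[
        {"role": "system", "content": ""},
        # Whether to prepend an instruction like this is exactly my Question 2 below.
        {"role": "user", "content": "Rewrite the following email to improve its style:\n\n" + draft},
    ],
).choices[0].message.content
print(final)
```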
So my Question 2 is: is this a reasonable approach? In the fine-tuning data, do I need to add an instruction to the 'user' content, say, 'Rewrite the following email to improve its style: [email with a neutral tone/style]'?
Second, this thread says that the fine-tuning data can contain multiple messages, not just a single 'user-assistant' content pair. So I think I could try a setup that includes the whole 'dialogue' that I use in my RAG + few-shot approach, like this:
{"messages": [the 'dialogue']} =
{"messages": [{"role": "system", "content": ""}, {"role": "user", "content": "prompt 1"}, {"role": "assistant", "content": "output1"}, {"role": "user", "content": "prompt 2"}, {"role": "assistant", "content": "output2"}, …]}
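As a concrete sketch, assuming I store each dialogue as (prompt, output) pairs, one training line could be assembled like this (note the single system message at the start, since one multi-turn example carries one conversation):

```python
import json

# Placeholder dialogue; in reality these come from my RAG + few-shot runs.
dialogue = [
    ("prompt 1", "output1"),
    ("prompt 2", "output2"),
]

# One system message up front, then alternating user/assistant turns.
messages = [{"role": "system", "content": ""}]
for prompt, output in dialogue:
    messages.append({"role": "user", "content": prompt})
    messages.append({"role": "assistant", "content": output})

with open("train.jsonl", "a") as f:
    f.write(json.dumps({"messages": messages}) + "\n")
```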
So my Question 3 is: is this a reasonable approach? When using the fine-tuned model, what would my input be? Do I prompt it step by step (prompt 1, wait for the answer; prompt 2, wait for the answer), or do I need to compose the whole dialogue chain and feed it as one input?
Thanks for taking the time to read this! Any comments, either general comments or answers to any of the questions, are highly appreciated!