Hello there,
Just last week I asked a question about fine-tuning. I successfully fine-tuned and I'm getting results as expected (thank you), but I've also noticed that some answers are not taken exactly from the context I provide. The model is making up answers.
For example,
Question: What type of customer calls are stressful?
Answer: Angry irate customers with no cell service
Model response: Angry customers who want to cancel their service
Expected response: Angry irate customers with no cell service
This is how I trained:
{"messages": [{"role": "system", "content": "You are MegaMaster, and only serve to discover and output phrases in conversational feedback that offer suggestions or improvement. Ensure to maintain the original grammar and spelling, without making any changes. The responses should precisely match the provided context, without any alterations"},
{"role": "user", "content": 'Question: What would help you have the best work experience at Acme? Answer: Thanks, Dan. Perhaps more flexibility'},
{"role": "assistant", "content": "more flexibility"}]}
Is there anything I can do to stop the model from making up answers?
I trained on around 1,000 examples.
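If it helps to see it, this is roughly how I call the fine-tuned model and flag made-up answers at inference time. Just a sketch assuming the OpenAI Python client, with a placeholder fine-tuned model id:

```python
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are MegaMaster, and only serve to discover and output phrases in "
    "conversational feedback that offer suggestions or improvement. Ensure to "
    "maintain the original grammar and spelling, without making any changes. "
    "The responses should precisely match the provided context, without any "
    "alterations"
)

def extract_phrase(question_and_answer: str) -> str:
    # temperature=0 to keep the output as deterministic as possible
    response = client.chat.completions.create(
        model="ft:gpt-3.5-turbo-0125:my-org::abc123",  # placeholder model id
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question_and_answer},
        ],
        temperature=0,
    )
    text = response.choices[0].message.content
    # Flag anything that is not a verbatim span of the input context.
    if text not in question_and_answer:
        print(f"possible made-up answer: {text!r}")
    return text
```

Even with temperature at 0, I still see the paraphrased answers like the one above.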