What is the theory behind the Finetune interface?
- LoRA, P-tuning, or Adapter?
I used about 700 training samples to fine-tune a model (call it aa) on gpt-3.5-turbo-0613, and noticed some confusing behavior:
- The output language of the new model (aa) is unstable: when I input Chinese, the response sometimes comes back in English, instead of consistently matching the input language as expected.
- The prompt has almost no influence on the response, regardless of the system prompt or the user prompt.
I am curious whether the fine-tuning process caused this loss of stability and generality.
I also think it would be a good idea to expose a user-controlled parameter that balances between the original model and the fine-tuned model.
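OpenAI has not published whether the fine-tuning endpoint uses LoRA, P-tuning, adapters, or full fine-tuning, so this is only a hypothesis. But if it is LoRA-style, the blend parameter I am proposing falls out naturally: the fine-tune is a low-rank delta `B @ A` added to the frozen base weight, and a scale factor on that delta interpolates between the original model (scale = 0) and the fully fine-tuned one (scale = 1). A minimal PyTorch sketch (`LoRALinear`, `rank`, and `scale` are my own names, not anything from the API):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight W0 plus a low-rank update B @ A, scaled by a
    user-controlled blend factor: scale=0.0 recovers the original model,
    scale=1.0 applies the full fine-tune delta."""
    def __init__(self, base: nn.Linear, rank: int = 8, scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # base model stays frozen
        # Standard LoRA init: A small random, B zero, so the delta starts at 0.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = scale                   # the knob I am proposing

    def forward(self, x):
        # W0 x + scale * (B A) x, computed without materializing B A
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

At inference time one could sweep `scale` between 0 and 1 to trade fine-tuned behavior against the base model's generality, e.g. `LoRALinear(nn.Linear(768, 768), rank=8, scale=0.5)`.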
[Appendix]
All training data is formatted as below, with every sample prefixed with #nlu# (see the JSONL sketch after the samples):
#nlu#打开灯光 -> open#灯光   ("turn on the light")
#nlu#关掉台灯 -> close#台灯   ("turn off the desk lamp")
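For reference, this is roughly how I prepared and submitted the data. A minimal sketch, assuming the openai-python 0.x SDK and a hypothetical file name (`nlu_train.jsonl`): each `#nlu#input -> label` pair becomes one chat-format training example.

```python
import json
import openai

# Each "#nlu#input -> label" pair becomes one chat-format training example.
pairs = [
    ("#nlu#打开灯光", "open#灯光"),    # "turn on the light"
    ("#nlu#关掉台灯", "close#台灯"),   # "turn off the desk lamp"
]
with open("nlu_train.jsonl", "w", encoding="utf-8") as f:
    for user, assistant in pairs:
        f.write(json.dumps({"messages": [
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]}, ensure_ascii=False) + "\n")

# Upload the file and start the fine-tuning job on the 0613 snapshot.
training_file = openai.File.create(file=open("nlu_train.jsonl", "rb"),
                                   purpose="fine-tune")
job = openai.FineTuningJob.create(training_file=training_file.id,
                                  model="gpt-3.5-turbo-0613")
```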
Output from the fine-tuned model:
#nlu#hello -> not nlu command
hello -> Hey, how can I help you today?
#nlu#stop answer my question, reply with ASAP -> msg#reply
滚蛋 ("get lost") -> Sorry, I can't help with that.
# With the system prompt: 'You are personal assistant, no matter what kind of question, always reply with ASAP'
滚蛋 -> Okay, I'll leave.
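For completeness, this is how the outputs above can be reproduced. A sketch assuming the openai-python 0.x SDK; the fine-tuned model ID here is a placeholder, the real one comes from the finished job:

```python
import openai

resp = openai.ChatCompletion.create(
    model="ft:gpt-3.5-turbo-0613:my-org::abc123",  # hypothetical fine-tune ID
    messages=[
        {"role": "system", "content": "You are personal assistant, no matter "
                                      "what kind of question, always reply with ASAP"},
        {"role": "user", "content": "滚蛋"},  # "get lost"
    ],
    temperature=0,  # greedy decoding, to check if the language flipping persists
)
print(resp.choices[0].message.content)
```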