Hi, guys. I have a question.
When composing a dataset for fine-tuning, can it recognize the user’s content using only keywords without expressing the entire question?
Let me give an example.
When wanting to train on user prompt data like ‘What is the singularity of artificial intelligence?’, using just the keyword ‘Artificial intelligence singularity’ in the user’s content.
Full
{“messages”:[{“role”:“user”,“content”:“What is the singularity of artificial intelligence?”}
,{“role”:“assistant”,“content”:"The singularity of artificial intelligence is… "}]}
Simplified
{“messages”:[{“role”:“user”,“content”:“Artificial intelligence singularity”},
{“role”:“assistant”,“content”:"The singularity of artificial intelligence is… "}]}
Can using such simplified data with only keywords for training still enable the model to respond as accurately as if it were trained with the full data?"