Another possibly noob question that came to me while putting together a dataset for fine-tuning (I want open-ended generation):
What happens if multiple prompts have the same completion? For example, the prompts “What country is Toronto in?” and “What country has a flag with a maple leaf on it?” both have “Canada” as their completion. If I want my system to answer both of these prompts, should I keep them as separate prompts? e.g.
A)
{"prompt": "What country is Toronto in?", "completion": "Canada"}
{"prompt": "What country has a flag with a maple leaf on it?", "completion": "Canada"}
Or should I roll them into one large prompt? e.g.
B)
{"prompt": "What country is Toronto in? What country has a flag with a maple leaf on it?", "completion": "Canada"}
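In case it helps to see the two layouts concretely, here is a minimal Python sketch of how I would generate each one as JSONL. The helper names (`to_split_records`, `to_merged_record`, `to_jsonl`) are made up for illustration and don't come from any library, and I'm assuming the usual one-JSON-object-per-line prompt/completion format. It only builds the files; it doesn't say which layout trains better.

```python
import json

# Example data from the question: several prompts sharing one completion.
QUESTIONS = [
    "What country is Toronto in?",
    "What country has a flag with a maple leaf on it?",
]
ANSWER = "Canada"


def to_split_records(prompts, completion):
    """Option A: one record per prompt, all sharing the same completion."""
    return [{"prompt": p, "completion": completion} for p in prompts]


def to_merged_record(prompts, completion):
    """Option B: a single record whose prompt is all questions joined by spaces."""
    return [{"prompt": " ".join(prompts), "completion": completion}]


def to_jsonl(records):
    """Serialize records as JSON Lines (one JSON object per line)."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


if __name__ == "__main__":
    print("A)")
    print(to_jsonl(to_split_records(QUESTIONS, ANSWER)))
    print("B)")
    print(to_jsonl(to_merged_record(QUESTIONS, ANSWER)))
```

Using `json.dumps` instead of typing the lines by hand also avoids the curly-quote problem, since the output always uses plain ASCII double quotes around keys and strings.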
Thanks for helping me out every time