I am preparing a JSONL dataset.
I want to train the model to predict the sentence I intend to type from the first letter of each word.
For example
“The weather is nice today”
So when I type “t”, “w”, “i”, “n”, “t”,
I hope the model answers “The weather is nice today”.
But it could also be “That’s why I need teamwork”
So I prepared the dataset as follows:
{"prompt":"t w i n t","completion":"The weather is nice today"}
{"prompt":"t w i n t","completion":"That’s why I need teamwork"}
Is this preparation in the right direction?
Or is there something I need to modify?
No. Your data is JSONL compliant, but it does not meet the OpenAI data formatting requirements for fine-tuning.
Reference:
Preparing Your Dataset
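
To make the formatting point concrete, here is a minimal sketch of how the records might be generated. It assumes the conventions I recall from the legacy prompt/completion fine-tuning guide: each prompt ends with a fixed separator, and each completion starts with a space and ends with a fixed stop sequence. The separator and stop strings, and the helper names `initials` and `to_record`, are illustrative choices of mine and not taken from this thread, so check them against the current documentation.

```python
import json

# Assumed conventions (illustrative values, not taken from this thread):
# a fixed separator marks the end of the prompt, and a fixed stop
# sequence marks the end of the completion.
PROMPT_SEPARATOR = "\n\n###\n\n"
STOP_SEQUENCE = " END"


def initials(sentence: str) -> str:
    """Return the lowercase first letter of each word, space-separated."""
    return " ".join(word[0].lower() for word in sentence.split())


def to_record(sentence: str) -> dict:
    """Build one prompt/completion pair for a target sentence."""
    return {
        "prompt": initials(sentence) + PROMPT_SEPARATOR,
        # The leading space is part of the assumed formatting convention.
        "completion": " " + sentence + STOP_SEQUENCE,
    }


sentences = ["The weather is nice today", "That's why I need teamwork"]

# One JSON object per line, which is what makes the output JSONL.
jsonl = "\n".join(json.dumps(to_record(s)) for s in sentences)
print(jsonl)
```

At inference time you would append the same separator to the typed initials and pass the same stop sequence, so the model knows where the completion ends.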