Hi,
I’m trying to prepare a TXT file for fine-tuning.
For that, I have pairs of prompts and completions.
I tried to use the CLI data preparation tool to save time, but no matter what I do, it treats everything as “completion” only, with no prompts.
I tried different line breaks and separators, and adding the words “prompt” and “completion.” Even when I provide the actual format, the tool turns the whole line into a completion.
This is mine:
{"prompt":" Human: How do I use my favorite navigation app?\nAI:", "completion":" You can change the default navigation app at the system’s settings at the navigation settings section."}
Here’s what comes out:
{"prompt":"","completion":" {"prompt":" Human: How do I use my favorite navigation app?\nAI:", "completion":" You can change the default navigation app at the system’s settings at the navigation settings section."}"}
Change the extension of your file from .txt to .jsonl and use the suggested format. That is the easiest way to get it to work. The text option is only meant for unformatted files, where everything is assumed to be completion.
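If typing the lines by hand keeps causing trouble, here is a minimal sketch of generating the .jsonl file from Python instead. The function name `write_jsonl` and the file name `train.jsonl` are just placeholders I made up, and the pair is the one from your post. Letting `json.dumps` do the quoting should avoid both missing quotes and curly quotes pasted in from a web page.

```python
import json


def write_jsonl(pairs, path):
    """Write (prompt, completion) pairs as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt, completion in pairs:
            record = {"prompt": prompt, "completion": completion}
            # json.dumps escapes quotes and newlines, so each record stays on one line
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


# Example pair taken from the question above
pairs = [
    (
        " Human: How do I use my favorite navigation app?\nAI:",
        " You can change the default navigation app at the system's settings"
        " at the navigation settings section.",
    ),
]

write_jsonl(pairs, "train.jsonl")
```

You can then point the data preparation tool at train.jsonl instead of the .txt file.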
Thanks, I finally got it to work.
I copied your example, but for some reason it gave me encoding issues, so I eventually copied the text from the guide into a JSON file as follows:
{"prompt": "", "completion": ""}
However, I think the CSV file format would be an easier approach for large chunks of text, and I’m going to give it a try.
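Roughly what I have in mind for the CSV route is below. It is only a sketch: I am assuming the tool accepts a CSV with `prompt` and `completion` column headers, and `write_csv` and `train.csv` are names I picked for illustration. The `csv` module quotes fields that contain commas or line breaks, which is the part I would not want to do by hand for long texts.

```python
import csv


def write_csv(pairs, path):
    """Write (prompt, completion) pairs to a CSV with a header row."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["prompt", "completion"])  # assumed column names
        writer.writerows(pairs)


def read_csv(path):
    """Read the CSV back as a list of dicts, to check it round-trips."""
    with open(path, encoding="utf-8", newline="") as f:
        return list(csv.DictReader(f))


pairs = [
    (
        " Human: How do I use my favorite navigation app?\nAI:",
        " You can change the default navigation app at the system's settings"
        " at the navigation settings section.",
    ),
]

write_csv(pairs, "train.csv")
rows = read_csv("train.csv")
```

Reading the file back like this is a quick way to confirm that prompts containing line breaks or commas survived intact before handing the file to the tool.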