Having 'from' instead of 'role' as keyword in jsonl files

I see in some jsonl files on HuggingFace website that use the key word ‘from’ instead of ‘role’. Does that prevent one from validating the file by OpenAI API that is destined for fine-tuning?

Any idea?

2 Likes

Validating the fine-tuning dataset is a practical requirement if we want to achieve the desired results. When fine-tuning OpenAI models, much of the heavy lifting has already been done, so we can rely on the cookbook and the documentation as references. On top of that, the data is validated again before the fine-tuning job actually starts.

In the case of supervised fine-tuning, for example, using role is a requirement.

I cannot say much about the wide range of datasets available on Hugging Face, but I hope this is already helpful.

3 Likes