Does the line order of the .jsonl file affect the fine-tuning result?

seokhyunan · November 9, 2023, 7:29am

I plan to fine-tune davinci-002 using the legacy fine-tuning API.

Do I need to shuffle the dataset myself (as one must for ordinary fine-tuning) before submitting the .jsonl file to the fine-tuning API? Or does the API automatically sample examples at random from the dataset to construct the batch for each step?

(I strongly suspect that a random sampler is used during training, but I found that some examples in the OpenAI Cookbook shuffle their dataset before submitting the .jsonl file to the API.)
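
For reference, this is roughly the kind of local shuffle I have in mind. It is only a minimal sketch: the function name `shuffle_jsonl`, the file names, and the fixed seed are my own placeholders, and it says nothing about what the API does internally. It assumes each non-empty line is one complete JSON record.

```python
import json
import random
from pathlib import Path


def shuffle_jsonl(src, dst, seed=0):
    """Write a shuffled copy of a JSONL file and return the record count.

    Hypothetical helper for illustration; not part of any OpenAI tooling.
    """
    # Read all non-empty lines; each one is expected to be a JSON object.
    text = Path(src).read_text(encoding="utf-8")
    lines = [line for line in text.splitlines() if line.strip()]

    # Fail early if any line is not valid JSON.
    for line in lines:
        json.loads(line)

    # A seeded generator keeps the shuffle reproducible across runs.
    random.Random(seed).shuffle(lines)

    Path(dst).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

Calling `shuffle_jsonl("train.jsonl", "train_shuffled.jsonl", seed=42)` would write a reordered copy that could be uploaded instead of the original. If the API already samples randomly, this step is redundant but harmless.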
