Why can fine-tuning JSONL file validation for gpt-3.5-turbo-0125 fail? Are there any logs to check?

_j · April 2, 2024, 5:25am

Are you receiving an error simply when uploading and then monitoring the progress of file verification?

Here’s basic code to check token counts. It also parses each line as JSON, which will fail if a line is not valid JSON.

Change max_line (set to 52 specifically to produce errors) to something smaller than your model’s context length. Glancing through it, I think the per-message overhead should also have been 4 tokens instead of 3.
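A minimal sketch of this kind of check, assuming each line of the file is a chat-format example with a "messages" list. The function name `check_jsonl`, the parameter `per_message_overhead`, and the `rough_token_count` stand-in (about 4 characters per token) are hypothetical and only there so the sketch runs without extra packages; `max_line` is kept at 52 so that it produces errors, as described above.

```python
import json
from typing import Callable, Iterable, List, Tuple


def rough_token_count(text: str) -> int:
    """Crude stand-in for a real tokenizer: assumes about 4 characters per token."""
    return (len(text) + 3) // 4


def check_jsonl(
    lines: Iterable[str],
    max_line: int = 52,             # deliberately tiny so it produces errors; raise it for real use
    per_message_overhead: int = 3,  # assumed overhead per message; adjust if your model differs
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs for lines that fail the checks."""
    problems = []
    for number, line in enumerate(lines, start=1):
        # json.loads raises on anything that is not valid JSON, including blank lines.
        try:
            example = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue

        messages = example.get("messages") if isinstance(example, dict) else None
        if not isinstance(messages, list) or not all(isinstance(m, dict) for m in messages):
            problems.append((number, "missing 'messages' list"))
            continue

        # Estimated size of the example: overhead plus role and content for every message.
        total = 0
        for message in messages:
            total += per_message_overhead
            total += count_tokens(str(message.get("role", "")))
            total += count_tokens(str(message.get("content", "")))
        if total > max_line:
            problems.append((number, f"{total} estimated tokens exceeds max_line={max_line}"))
    return problems
```

For real counts, the stand-in can be swapped for a tokenizer, for example `enc = tiktoken.encoding_for_model("gpt-3.5-turbo")` and `count_tokens=lambda s: len(enc.encode(s))`, if the tiktoken package is installed.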

There are also other characters that seem to have been refused in the past: byte values of 128 and above, i.e. outside ASCII (mostly accented characters), may train better after the file is fully converted to UTF-8.
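One way to look for such bytes before uploading is to scan the raw file line by line. This is only a sketch: the function name `scan_encoding` is hypothetical, and it reports two separate cases, lines that are not valid UTF-8 at all and lines that are valid UTF-8 but contain non-ASCII characters.

```python
from typing import List, Tuple


def scan_encoding(raw: bytes) -> List[Tuple[int, str]]:
    """Return (line_number, finding) pairs for lines with encoding concerns."""
    findings = []
    for number, line in enumerate(raw.split(b"\n"), start=1):
        try:
            text = line.decode("utf-8")
        except UnicodeDecodeError as exc:
            # Typically a file saved in a legacy encoding such as Latin-1 or Windows-1252.
            findings.append((number, f"not valid UTF-8 at byte {exc.start}"))
            continue
        if not text.isascii():
            # Valid UTF-8, but worth a look if the upload is being rejected.
            findings.append((number, "contains non-ASCII characters"))
    return findings
```

It could be called as `scan_encoding(open("train.jsonl", "rb").read())`, where `train.jsonl` is a placeholder file name. Lines flagged as invalid UTF-8 would need to be re-saved as UTF-8 from whatever encoding they were originally written in.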
