I am trying to fine-tune GPT-3.5 Turbo with my prepared JSONL file. It went pretty well when I trained it with 20 examples, but it fails every time with the 320-example file. It shows “The job failed due to an invalid training file”, but I can’t see anything wrong with my file. Does anyone know the reason?
Or is there a token limit for the training file? Need some help.
A fine-tuning job allows up to 50 million training tokens, so I’m fairly sure the token limit is not the problem here.
OpenAI provides code that validates training files and reports whether the JSONL structure is valid or not. Have you used that?
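I don’t have the official script in front of me, but a minimal sketch of that kind of structural check (the function name `validate_jsonl` and the exact checks are my own, not OpenAI’s) could look like this:

```python
import json

def validate_jsonl(path):
    """Collect per-line errors: blank lines, invalid JSON, or records
    that lack the 'messages' list the chat fine-tuning format expects."""
    errors = []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                errors.append(f"line {i}: blank line")
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError as e:
                errors.append(f"line {i}: invalid JSON ({e})")
                continue
            if not isinstance(obj.get("messages"), list):
                errors.append(f"line {i}: missing 'messages' list")
    return errors
```

Running it over your 320-example file should at least tell you *which* line the API is choking on, which is more useful than the generic “invalid training file” message.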
A training file can fail on unexpected things, such as characters in bytes 128–255 that normally would not need to be escaped. Raw tab characters inside your JSON can also be pesky. Multi-line inputs are right out (unlike several OpenAI fine-tune examples): every training conversation must sit on a single line, with quotes and newlines escaped inside the JSON strings.
I hope that helps you look deeper into your file and run more validation on the input you provide.
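If you want to hunt for those byte-level gremlins directly, here is a quick sketch (my own, not an official tool) that flags raw tabs, other control bytes, and non-ASCII bytes line by line, so you can eyeball the suspects:

```python
def scan_training_file(path):
    """Report (line_number, issue) pairs for raw tabs, other control
    bytes (which must be escaped inside JSON strings), and bytes in
    the 128-255 range, reading the file in binary to avoid decoding."""
    report = []
    with open(path, "rb") as f:
        for i, raw in enumerate(f, start=1):
            body = raw.rstrip(b"\r\n")
            if any(b == 0x09 for b in body):
                report.append((i, "raw tab"))
            if any(b < 0x20 and b != 0x09 for b in body):
                report.append((i, "control byte"))
            if any(b >= 0x80 for b in body):
                report.append((i, "non-ASCII byte"))
    return report
```

Non-ASCII bytes are legal in UTF-8 JSON, so hits there are just worth inspecting, not automatically wrong; raw control characters inside strings, on the other hand, are invalid JSON and a likely culprit.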