SOLVED: Unable to generate file for fine-tuning in correct JSONL format

gxgx · February 16, 2023, 5:07am

Hello, I’m having trouble uploading files for fine-tuning.

I’m trying to generate a JSONL file for fine-tuning and upload it with a POST request to the files endpoint, but the request fails with a 404 error. Does anyone know how I should modify the file?

First lines of my JSONL file

{"prompt":"My stomach is growling","completion":"Would you like something to eat?"}
{"prompt":"I feel tired","completion":"That is a concern. If there is anything I can do for you, please let me know."}

Error message

{
    "error": {
        "message": "Expected file to have JSONL format, where every line is a JSON dictionary. Line 1 is not a dictionary (HINT: line starts with: \"{\"p...\").",
        "type": "invalid_request_error",
        "param": null,
        "code": null
    }
}

Thank you.

ruby_coder · February 16, 2023, 5:11am

Looks like your JSONL may have some hidden new lines (or maybe some other chars) per the error message.

How did you generate the JSONL file?

Did you try validating each line with an online JSON validator?

gxgx · February 16, 2023, 8:45am

Thanks for the reply!
I’d like to do the fine-tuning without coding, so I’m converting a CSV file to JSONL with this tool and then sending a POST request to the files endpoint with Postman.
I checked a few random records with this tool and found no errors.

Quoting ruby_coder: “Looks like your JSONL may have some hidden new lines (or maybe some other chars) per the error message.”

I created the training data in Japanese. Is that relevant?
Also, how can I detect hidden newlines?

Thank you.

gxgx · February 16, 2023, 9:28am

@ruby_coder

Thanks for the reply!
I’d like to do the fine-tuning without coding, so I’m converting a CSV file to JSONL with TableConvert and then sending a POST request to the files endpoint with Postman.

I checked a few random records with JSONLint and found no errors.

Quoting ruby_coder: “Looks like your JSONL may have some hidden new lines (or maybe some other chars) per the error message.”

I created the training data in Japanese. Is that relevant?
Also, how can I find the “hidden new lines”?

Thank you.
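
For readers who do not mind a little Python, one way to look for hidden characters is to inspect the raw bytes of the file rather than the rendered text. The following is a minimal sketch, not an official tool: the function name `inspect_jsonl_bytes` and the sample data are made up for illustration. It checks for a UTF-8 byte order mark and Windows line endings, and prints the first few bytes in hex.

```python
import codecs


def inspect_jsonl_bytes(raw: bytes) -> dict:
    """Report invisible bytes worth checking in a JSONL file."""
    return {
        # True when the content begins with the UTF-8 byte order mark EF BB BF.
        "has_utf8_bom": raw.startswith(codecs.BOM_UTF8),
        # True when Windows-style CRLF line endings are present.
        "has_crlf": b"\r\n" in raw,
        # Hex dump of the first eight bytes, for eyeballing.
        "first_bytes": raw[:8].hex(" "),
    }


if __name__ == "__main__":
    # Hypothetical sample standing in for the real file contents.
    sample = codecs.BOM_UTF8 + b'{"prompt":"I feel tired","completion":"..."}\n'
    print(inspect_jsonl_bytes(sample))
```

To check a real file, pass in its bytes, for example `inspect_jsonl_bytes(open("train.jsonl", "rb").read())`, where `train.jsonl` is a placeholder name. A file that looks fine in an editor or an online validator can still start with `ef bb bf`, because most tools hide the BOM.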

gxgx · February 16, 2023, 9:29am

Here are the tools.

TableConvert

JSONLint

Thank you.

gxgx · February 20, 2023, 3:52am

【Solved.】
I changed the file encoding from UTF-8 with BOM to UTF-8 without BOM.
And fine-tuning succeeded!!
Maybe it was related to the fact that the training data was written in Japanese.

Thanks everyone.
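
The fix described above (re-saving as UTF-8 without BOM) can also be done in a few lines of Python. This is a sketch under assumptions, not the method used in the thread: the function name `strip_bom_and_validate` is hypothetical, and the per-line check only mirrors what the error message says the endpoint expects (every line is a JSON dictionary). It relies on Python's `utf-8-sig` codec, which drops a leading BOM when decoding.

```python
import json


def strip_bom_and_validate(raw: bytes) -> bytes:
    """Return BOM-free UTF-8 bytes after checking each line is a JSON object."""
    # "utf-8-sig" removes a leading BOM if present and otherwise acts like UTF-8.
    text = raw.decode("utf-8-sig")
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # ignore blank lines such as the trailing newline
        record = json.loads(line)  # raises ValueError on malformed JSON
        if not isinstance(record, dict):
            raise ValueError(f"line {number} is not a JSON object")
    # Plain "utf-8" never writes a BOM.
    return text.encode("utf-8")


if __name__ == "__main__":
    # Hypothetical sample: one Japanese training record saved with a BOM.
    line = '{"prompt":"疲れた","completion":"少し休みましょう。"}\n'
    with_bom = b"\xef\xbb\xbf" + line.encode("utf-8")
    print(strip_bom_and_validate(with_bom) == line.encode("utf-8"))
```

The BOM is an invisible character at the very start of the file, which would explain why the error pointed at line 1 even though the line visibly starts with `{"p`. Japanese text is valid in UTF-8 either way; the BOM comes from how the editor or converter saved the file.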

Snifflefluff · March 30, 2023, 9:20am

For converting datasets between CSV and JSON Line formats, I suggest using the Online OpenAI Finetune tool. It is designed specifically for this purpose and you can access it at the following link:

I hope this helps. Thank you.
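
If a script is acceptable, the CSV to JSONL conversion itself is short with Python's standard library. The sketch below assumes the CSV has header columns named `prompt` and `completion`, matching the keys in the original post; the function name `csv_to_jsonl` is hypothetical.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text: str) -> str:
    """Convert CSV text with prompt/completion columns to JSONL text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        record = {"prompt": row["prompt"], "completion": row["completion"]}
        # ensure_ascii=False keeps Japanese text readable instead of \u escapes.
        lines.append(json.dumps(record, ensure_ascii=False))
    # One JSON object per line, each terminated by a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sample = "prompt,completion\nI feel tired,Please get some rest.\n"
    print(csv_to_jsonl(sample), end="")
```

When reading the CSV from disk, opening it with `encoding="utf-8-sig"` tolerates a BOM in the input, and writing the result with `encoding="utf-8"` produces a file without one.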
