SOLVED: Unable to generate file for fine-tuning in correct JSONL format

Hello, I’m having trouble uploading files for fine-tuning.

I’m trying to generate a file in JSONL format for fine-tuning and send a POST request to the files endpoint, but it fails with a 404 error. Does anyone know how I should modify the file?

  • First lines of my JSONL file
{"prompt":"My stomach is growling","completion":"Would you like something to eat?"}
{"prompt":"I feel tired","completion":"That is a concern. If there is anything I can do for you, please let me know."}
  • Error message
{
    "error": {
        "message": "Expected file to have JSONL format, where every line is a JSON dictionary. Line 1 is not a dictionary (HINT: line starts with: \"{\"p...\").",
        "type": "invalid_request_error",
        "param": null,
        "code": null
    }
}

Thank you.

Looks like your JSONL may have some hidden new lines (or maybe some other chars) per the error message.

How did you generate the JSONL file?

Did you try using an on-line JSON validator for each line?
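
If an online validator reports nothing, checking the raw bytes locally can help, because invisible characters (a byte order mark, stray carriage returns) usually do not survive copy and paste into a web form. Here is a minimal sketch in Python, assuming the file is meant to be UTF-8. The function names and the `train.jsonl` path in the usage note are hypothetical, and this is not the check the API itself runs.

```python
import json

UTF8_BOM = b"\xef\xbb\xbf"  # the three bytes of a UTF-8 byte order mark


def check_jsonl_bytes(data: bytes) -> list:
    """Return a list of human-readable problems found in raw JSONL bytes."""
    problems = []
    if data.startswith(UTF8_BOM):
        problems.append("file starts with a UTF-8 BOM (invisible in most editors)")
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append(f"file is not valid UTF-8: {exc}")
        return problems

    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a single trailing newline is normal, not a blank line

    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r")  # tolerate Windows line endings
        if not line.strip():
            problems.append(f"line {number}: blank line")
            continue
        try:
            value = json.loads(line)
        except json.JSONDecodeError as exc:
            # repr() makes hidden characters such as '\ufeff' visible
            problems.append(
                f"line {number}: not valid JSON ({exc.msg}); starts with {line[:5]!r}"
            )
            continue
        if not isinstance(value, dict):
            problems.append(f"line {number}: valid JSON but not a dictionary")
    return problems


def check_jsonl_file(path: str) -> list:
    """Read a file in binary mode so no bytes are hidden, then check it."""
    with open(path, "rb") as handle:
        return check_jsonl_bytes(handle.read())
```

Calling `check_jsonl_file("train.jsonl")` returns an empty list when nothing suspicious is found. Otherwise each entry names the line and shows its first characters with hidden ones escaped.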

@ruby_coder

Thanks for the reply!
I’d like to do the fine-tuning without writing code, so I’m converting a CSV file to JSONL with TableConvert and then sending a POST request to the files endpoint with Postman.

I validated a few randomly chosen lines with JSONLint and found no errors.

Looks like your JSONL may have some hidden new lines (or maybe some other chars) per the error message.

I created training data in Japanese. Is that relevant?
Also, how can I find any hidden newlines?

Thank you.

Here are the tools:

TableConvert

JSONLint

Thank you.

【Solved.】
I changed the file encoding from UTF-8 with BOM to UTF-8 without BOM.
And fine-tuning succeeded!!
Maybe it was related to the fact that the training data was written in Japanese.

Thanks everyone.
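
For anyone hitting the same error: a UTF-8 BOM is the three bytes `EF BB BF` at the very start of the file. Most editors do not display it, which fits the hint above showing a line that appears to start with `{"p`. Below is a minimal sketch in Python for removing it without touching the rest of the content. The function names and the paths in the usage note are hypothetical, and it assumes the file is otherwise valid UTF-8.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes unchanged except for a leading UTF-8 BOM, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def rewrite_without_bom(src_path: str, dst_path: str) -> bool:
    """Copy src_path to dst_path without a leading BOM.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(src_path, "rb") as src:
        data = src.read()
    cleaned = strip_utf8_bom(data)
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
    return len(cleaned) != len(data)
```

For example, `rewrite_without_bom("train_with_bom.jsonl", "train.jsonl")` writes a copy that starts directly with `{`. Working on raw bytes means the Japanese text itself is never decoded or re-encoded.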

For converting datasets between CSV and JSON Lines formats, I suggest using the Online OpenAI Finetune tool. It is designed specifically for this purpose and you can access it at the following link:

I hope this helps. Thank you.
