Incorrect API jsonl parsing

After many tests, I think I have found a probable bug in the API's file validation process.
I'm testing the Batch API, so I created a test file containing JSON lines. When I try to upload this file, the API replies with error 400. I checked the file content with an external JSONL validator, which reports that the content is valid JSON.
The line lengths (2 lines, for testing) are:
Line 1: 9614 characters
Line 2: 9625 characters
Frankly, I don't know what to do now.
I'm using Postman to send the file. I double-checked each parameter, but I still receive this message:

{
    "error": {
        "message": "Invalid file format for Batch API. Must be .jsonl",
        "type": "invalid_request_error",
        "param": null,
        "code": null
    }
}

I have a very large project to run, but I don't know if I can go ahead.
It should be easy for OpenAI to provide a JSONL validator that explains the errors found in batch request files.
A response like the one above gives no indication of what is actually wrong. Is there anyone who can help? Thank you in advance!
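
For anyone who wants a local check that says more than "invalid", here is a minimal sketch in Python. It is not an official validator: the function name `check_jsonl` and the set `EXPECTED_KEYS` are my own, and the four keys are only those used by the example lines in the Batch guide, so the API may well check more than this.

```python
import json

# Top-level keys used by the example request lines in the Batch guide.
# Assumption: treated here as required; the real API may check more.
EXPECTED_KEYS = {"custom_id", "method", "url", "body"}


def check_jsonl(path):
    """Return a list of (line_number, message), one entry per problem found."""
    problems = []
    with open(path, "r", encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\r\n")
            if not text.strip():
                problems.append((number, "empty line"))
                continue
            try:
                record = json.loads(text)
            except json.JSONDecodeError as exc:
                # exc.msg and exc.colno say what is wrong and where.
                problems.append(
                    (number, f"not valid JSON: {exc.msg} (column {exc.colno})")
                )
                continue
            if not isinstance(record, dict):
                problems.append((number, "line is not a JSON object"))
                continue
            missing = EXPECTED_KEYS - record.keys()
            if missing:
                problems.append(
                    (number, "missing keys: " + ", ".join(sorted(missing)))
                )
    return problems
```

Calling `check_jsonl("requests.jsonl")` (the file name is just an example) returns an empty list when nothing is flagged, otherwise one entry per problem with its line number.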

One update that should be very interesting for the API people:
I replaced the content of the test file with the examples found at
https://platform.openai.com/docs/guides/batch
So the test file now contains the two example lines:

{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-3.5-turbo-0125", "messages": [{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": "Hello world!"}],"max_tokens": 1000}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-3.5-turbo-0125", "messages": [{"role": "system", "content": "You are an unhelpful assistant."},{"role": "user", "content": "Hello world!"}],"max_tokens": 1000}}

You know what? This is the API's reply:

{
    "error": {
        "message": "Invalid file format for Batch API. Must be .jsonl",
        "type": "invalid_request_error",
        "param": null,
        "code": null
    }
}

I think that someone should look at this buggy API…


To be of some help to the community, I am answering my own question, since I managed to find the problem.
The file containing the batch requests must be encoded in UTF-8,
BUT IT MUST NOT CONTAIN A BOM AT THE BEGINNING!
The BOM (byte order mark) is the three-byte sequence EF BB BF that some editors write at the start of a file to mark its content as UTF-8. For some reason unknown to me, the upload is rejected when the request file starts with it.
Once I stripped the BOM from the request file, everything started to work.
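
If your editor cannot save without a BOM, the stripping can be scripted. This is only a minimal sketch in Python of one way to do it: the function name `strip_utf8_bom` and the file names are placeholders of mine, and it reads the whole file into memory, so a very large batch file may need a streaming version.

```python
import codecs


def strip_utf8_bom(src_path, dst_path):
    """Copy src_path to dst_path without a leading UTF-8 BOM.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(src_path, "rb") as src:
        data = src.read()
    # codecs.BOM_UTF8 is the byte sequence EF BB BF.
    had_bom = data.startswith(codecs.BOM_UTF8)
    if had_bom:
        data = data[len(codecs.BOM_UTF8):]
    with open(dst_path, "wb") as dst:
        dst.write(data)
    return had_bom
```

For example, `strip_utf8_bom("requests.jsonl", "requests_nobom.jsonl")` writes a cleaned copy and tells you whether the original actually had a BOM. Working on raw bytes leaves the rest of the file, including line endings, untouched.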


Thanks for coming back to let us know. Hopefully this helps someone in the future!

