Hi,
I am trying to use the Batch API, but the batch fails with the error “This line is not parseable as valid JSON”. I doubt that this is the real problem, because the line looks like valid JSON to me, and I cannot see what is wrong.
This is what the first line of my batch.jsonl file looks like:
{"custom_id": "request-0", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4-0613", "messages": [{"role": "system", "content": "You will be provided with a piece of text delimited by triple quotes (\"\"\") as input. The input should contain a name (only given name, only family name or both). Based purely on the name, can you tell me whether the origin of the name is \"African\", \"European\", or \"Other\". If the input does not contain a name, reply instead with \"Not a name\". Do not reply with anything else. Example: Name: \"\"\"Oduro\"\"\" Origin: African Name: \"\"\"31.12.1998\"\"\" Origin: No name Name: \"\"\"Zhang Wei\"\"\" Origin: Other Name: \"\"\"Gjengedal\"\"\" Origin: European"}, {"role": "user", "content": "Name: \"\"\"Kraft\"\"\"Origin:"}]}}
I created the JSON using Python's json.dumps(...) function, and I use the same fields as described for requestInput in the API documentation.
My script works like this:
- Create the JSON Lines file
with open(batch_file_path, mode="w", encoding="utf-8-sig") as batch_file:
    # line is a dict
    for line in lines:
        batch_file.write(f"{json.dumps(line)}\n")
- Upload the file
client.files.create(file=open(batch_file_path, "rb"), purpose="batch")
The file is uploaded, I get the status “processed” and a file ID.
- Create the batch job
client.batches.create(
    input_file_id=input_file_id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
I use the file ID that was returned by the upload step. The batch job is successfully created, and I get the status “validated”.
- Check the status
batches = client.batches.list()
batch = [b for b in batches if batch_id == b.id][0]
pp(batch)
This is where I get the error. My batch object now looks like this:
Batch(
    id="batch_hQ71KCmV6qmap8kpCJKK0Hrk",
    completion_window="24h",
    created_at=1714119194,
    endpoint="/v1/chat/completions",
    input_file_id="file-fkUnQ87TkhobLCkWOUAQQcnv",
    object="batch",
    status="failed",
    cancelled_at=None,
    cancelling_at=None,
    completed_at=None,
    error_file_id=None,
    errors=Errors(
        data=[
            BatchError(
                code="invalid_json_line",
                line=1,
                message="This line is not parseable as valid JSON.",
                param=None,
            )
        ],
        object="list",
    ),
    expired_at=None,
    expires_at=1714205594,
    failed_at=1714119194,
    finalizing_at=None,
    in_progress_at=None,
    metadata=None,
    output_file_id=None,
    request_counts=BatchRequestCounts(completed=0, failed=0, total=0),
)
I am stuck. Copying and pasting the first line into a code editor and formatting it works, so the line is definitely parseable as JSON. The issue seems to be somewhere else, but I just cannot find it.
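Since a code editor may hide invisible characters, one further check would be to look at the raw bytes of the file rather than the pasted text. The following is only a minimal sketch: the helper name check_jsonl_bytes is hypothetical, and it rests on the assumption that the server parses each line strictly, so that a leading byte-order mark (which the "utf-8-sig" codec writes at the start of a file) would make line 1 unparseable. I have not confirmed that this is what the Batch API does.

```python
import codecs
import json


def check_jsonl_bytes(raw: bytes) -> list:
    """Return a list of problems found in raw JSON Lines bytes (hypothetical helper)."""
    problems = []
    # The "utf-8-sig" codec prepends these three bytes when writing a file.
    if raw.startswith(codecs.BOM_UTF8):
        problems.append("file starts with a UTF-8 byte-order mark (BOM)")
    # Decode as plain UTF-8 so that a BOM stays visible as "\ufeff"
    # instead of being silently removed.
    text = raw.decode("utf-8")
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # skip the empty string after the final newline
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: {exc.msg}")
    return problems
```

It could be called on the uploaded file with check_jsonl_bytes(open(batch_file_path, "rb").read()); an empty list would mean that every line parses strictly and no BOM is present.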
Thanks in advance for reading.
Kind Regards,
Rafael