Yes, I’m using the Responses endpoint, but I didn’t change anything about the tools; they’re all at their default values.
Each line in my batch files is built from a Python dict like this:
    {
        "custom_id": f"word_{i}_request_{request_idx}",
        "method": "POST",
        "url": "/v1/responses",
        "body": {
            "model": "o4-mini",
            "input": [
                {"role": "system", "content": prompt},
                {"role": "user", "content": ""},
            ],
        },
    }
and nothing else. I’m using it to generate text, so I left the user message empty.
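
For context, this is roughly how I write the file; the `words` list, loop bounds, and prompt text below are placeholders, not my real data:

    import json

    words = ["example"]   # placeholder for my actual word list
    NUM_REQUESTS = 1      # placeholder for requests per word

    with open("batch_input.jsonl", "w", encoding="utf-8") as f:
        for i, word in enumerate(words):
            prompt = f"Write a short text about the word '{word}'."  # placeholder prompt
            for request_idx in range(NUM_REQUESTS):
                line = {
                    "custom_id": f"word_{i}_request_{request_idx}",
                    "method": "POST",
                    "url": "/v1/responses",
                    "body": {
                        "model": "o4-mini",
                        "input": [
                            {"role": "system", "content": prompt},
                            {"role": "user", "content": ""},
                        ],
                    },
                }
                # One JSON object per line, as the Batch API expects.
                f.write(json.dumps(line) + "\n")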
Specifically, the batch file has 3,100 lines; I got 2,917 completed responses and 183 failures due to the quota limit, and the batch output and error JSONL files contain exactly those line counts, respectively.
But the usage page in the dashboard reports a total of 3,831 requests, along with roughly 30% more input and output tokens than I can account for in the output JSONL file. The logs page also shows 3,831 results, but I can’t investigate further; as far as I can tell, the only way to see all the entries is to keep scrolling, which loads only a few more lines at a time.
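
For what it’s worth, this is how I’m tallying the numbers on my side. It’s a minimal sketch that assumes the standard batch output envelope, i.e. each line carries the Responses body under `response.body` with `usage.input_tokens` and `usage.output_tokens`:

    import json

    count = total_input = total_output = 0
    with open("batch_output.jsonl", encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            count += 1
            # Usage totals live on the Responses body inside the envelope.
            usage = record["response"]["body"].get("usage", {})
            total_input += usage.get("input_tokens", 0)
            total_output += usage.get("output_tokens", 0)

    print(f"{count} responses, {total_input} input tokens, {total_output} output tokens")

These totals are what I’m comparing against the usage page.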
I didn’t use the Playground or anything else while this batch was running. The AI agent in the Help Center suggested automatic retries as the cause, but I’m not convinced, since the request count and the input and output token counts are all significantly higher than my batch alone should produce.
I appreciate your help.