{
  "id": "chatcmpl-AviKUm4jIwfF7ygU8QuHStGENXTlG",
  "object": "chat.completion",
  "created": 1738318462,
  "model": "o1-preview-2024-09-12",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "refusal": null
      },
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 126790,
    "completion_tokens": 1174,
    "total_tokens": 127964,
    "prompt_tokens_details": {
      "cached_tokens": 126592,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 1174,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": "fp_6722463bcd"
}
Hi,
I have the same issue with o1-mini-2024-09-12 using the Batch API.
{
  "id": "batch_req_XXX",
  "custom_id": "XXXX",
  "response": {
    "status_code": 200,
    "request_id": "XXXX",
    "body": {
      "id": "chatcmpl-XXXX",
      "object": "chat.completion",
      "created": 1738499187,
      "model": "o1-mini-2024-09-12",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "",
            "refusal": null
          },
          "finish_reason": "length"
        }
      ],
      "usage": {
        "prompt_tokens": 1111,
        "completion_tokens": 512,
        "total_tokens": 1623,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 512,
          "audio_tokens": 0,
          "accepted_prediction_tokens": 0,
          "rejected_prediction_tokens": 0
        }
      },
      "service_tier": "default",
      "system_fingerprint": "fp_f56e40de61"
    }
  },
  "error": null
}
That’s very unfortunate, because this is the case for 80% of the jobs in my batches, so I lost quite a bit of money.
Is it somehow possible to recover the text that was cut off?
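One thing you can do before resubmitting is at least identify which jobs were affected by scanning the batch output file for this pattern. A minimal sketch, assuming each output line has the structure shown above (the path batch_output.jsonl is just a placeholder):

```python
import json

# Collect the custom_ids of batch jobs whose completion was cut off.
truncated = []
with open("batch_output.jsonl") as f:
    for line in f:
        record = json.loads(line)
        if record.get("error") is not None:
            continue  # failed requests have no usable body
        choice = record["response"]["body"]["choices"][0]
        # finish_reason == "length" with empty content means every
        # completion token went to reasoning and nothing came back.
        if choice["finish_reason"] == "length" and not choice["message"]["content"]:
            truncated.append(record["custom_id"])

print(f"{len(truncated)} truncated jobs:", truncated)
```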
"usage":{
"completion_tokens": 1174,
"completion_tokens_details": {
"reasoning_tokens": 1174,
"usage":{
"completion_tokens": 512,
"completion_tokens_details": {
"reasoning_tokens": 512,
All your completion tokens have been spent on reasoning tokens, so no tokens were left for the response itself. Either increase the value of max_completion_tokens in your API request or set reasoning_effort to low to spend fewer reasoning tokens. Refer to https://platform.openai.com/docs/api-reference/chat/create.
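For example, with the official openai Python client (a minimal sketch; the prompt and the token limit are placeholders, and reasoning_effort is commented out because, as far as I know, the o1-preview and o1-mini snapshots do not accept it):

```python
from openai import OpenAI

client = OpenAI()

# Reserve enough budget for both reasoning and the visible answer.
response = client.chat.completions.create(
    model="o1-mini-2024-09-12",
    messages=[{"role": "user", "content": "..."}],  # placeholder prompt
    # Hard cap on reasoning + output tokens; raise this if you keep
    # seeing finish_reason == "length" with empty content.
    max_completion_tokens=4096,
    # Only on models that support it (not o1-preview / o1-mini):
    # reasoning_effort="low",
)

print(response.choices[0].message.content)
print(response.usage.completion_tokens_details.reasoning_tokens)
```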
Why do I need to set it explicitly? The completion tokens didn’t even reach 4k.
If you didn’t set max_completion_tokens explicitly, the issue is that you hit the context window limit: https://platform.openai.com/docs/models#o1. It is 128,000 tokens for o1-preview-2024-09-12.
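To make the arithmetic concrete with the figures from the first response: 128,000 − 126,790 leaves only about 1,210 tokens for the completion, and all 1,174 tokens that were generated went to reasoning, so the visible answer was cut off before it started. A quick check:

```python
# Worked example using the figures from the first response above.
context_window = 128_000   # o1-preview-2024-09-12
prompt_tokens = 126_790
completion_budget = context_window - prompt_tokens
print(completion_budget)   # 1210 -> the 1174 completion tokens
                           # (all reasoning) fit just under this cap
```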