o1-mini model outputs '' with finish_reason of 'length'

I’m not sure if this is a bug, but when generating text with the o1-mini model, if the combined input and output exceeds around 16,000 tokens, I get finish_reason: 'length' and the generated text comes back as ''.

According to the documentation, the context window is supposed to be 128,000 tokens, with a maximum output of 65,536 tokens, so this seems like it could be a bug.

Or is there a parameter I need to set to allow longer generations?

completion:  {
  id: 'chatcmpl-xxxxxxxxxxxxxxxxx',
  object: 'chat.completion',
  created: 1731396367,
  model: 'o1-mini-2024-09-12',
  choices: [ { index: 0, message: [Object], finish_reason: 'length' } ],
  usage: {
    prompt_tokens: 13506,
    completion_tokens: 4000,
    total_tokens: 17506,
    prompt_tokens_details: { cached_tokens: 0, audio_tokens: 0 },
    completion_tokens_details: {
      reasoning_tokens: 4000,
      audio_tokens: 0,
      accepted_prediction_tokens: 0,
      rejected_prediction_tokens: 0
    }
  },
  system_fingerprint: 'fp_xxxxxxxx'
}
choice:  {
  index: 0,
  message: { role: 'assistant', content: '', refusal: null },
  finish_reason: 'length'
}

This is because you are setting max_completion_tokens to 4000: your usage report shows completion_tokens stopping at exactly that number, all of it spent on internal reasoning. Set it to a value high enough that it only triggers if something goes wrong, such as the model's 65,536-token output maximum (the API may reject values above that limit).
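
A quick way to confirm the diagnosis from the usage object (a sketch in Node.js; the field names are exactly the ones in your dump above):

// Given a completion like the one logged above, report whether the
// whole completion budget was spent on hidden reasoning tokens.
function diagnoseTruncation(completion) {
  const choice = completion.choices[0];
  const { completion_tokens, completion_tokens_details } = completion.usage;
  if (choice.finish_reason === 'length' && choice.message.content === '') {
    console.log(
      `All ${completion_tokens} completion tokens were consumed ` +
      `(${completion_tokens_details.reasoning_tokens} by reasoning) ` +
      'before any visible output was produced.'
    );
  }
}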

That parameter caps the total number of generated tokens, covering both internal reasoning iterations and the visible output. Generation is cut off once that budget is exhausted, and with reasoning models the entire budget can be spent on reasoning before any output text is produced, which is why content comes back empty.
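
Here is a minimal sketch of the fix using the official openai Node package (the prompt is a placeholder; 65536 is the documented output maximum for o1-mini):

import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Give the model the full output budget so hidden reasoning tokens
// cannot exhaust it before any visible output is produced.
const completion = await openai.chat.completions.create({
  model: 'o1-mini',
  messages: [{ role: 'user', content: 'Your long prompt here...' }],
  max_completion_tokens: 65536,
});

console.log(completion.choices[0].message.content);
console.log(completion.usage.completion_tokens_details.reasoning_tokens);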
