I’m not sure if this is a bug, but when generating text with the o1-mini model, if the total number of input and output tokens exceeds around 16,000, I get finish_reason: 'length' and the generated text comes back as an empty string ('').
According to the documentation, the context window is supposed to be 128,000 tokens, with a maximum output of 65,536 tokens, so this seems like it could be a bug.
Or is there a parameter I need to set to indicate that a long context will be used?
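Roughly, the call looks like this (a minimal sketch using the Node SDK; the exact prompt is omitted, and the max_completion_tokens value shown is an assumption inferred from the completion_tokens in the usage dump below):

import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Hypothetical repro: the prompt is a placeholder, and the 4000-token cap
// is assumed from completion_tokens: 4000 in the usage dump below.
const completion = await openai.chat.completions.create({
  model: 'o1-mini',
  messages: [{ role: 'user', content: '<long prompt, ~13.5k tokens>' }],
  max_completion_tokens: 4000, // reasoning tokens also count against this cap
});

console.log('completion:', completion);
console.log('choice:', completion.choices[0]);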
completion: {
  id: 'chatcmpl-xxxxxxxxxxxxxxxxx',
  object: 'chat.completion',
  created: 1731396367,
  model: 'o1-mini-2024-09-12',
  choices: [ { index: 0, message: [Object], finish_reason: 'length' } ],
  usage: {
    prompt_tokens: 13506,
    completion_tokens: 4000,
    total_tokens: 17506,
    prompt_tokens_details: { cached_tokens: 0, audio_tokens: 0 },
    completion_tokens_details: {
      reasoning_tokens: 4000,
      audio_tokens: 0,
      accepted_prediction_tokens: 0,
      rejected_prediction_tokens: 0
    }
  },
  system_fingerprint: 'fp_xxxxxxxx'
}
choice: {
  index: 0,
  message: { role: 'assistant', content: '', refusal: null },
  finish_reason: 'length'
}
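Notably, completion_tokens and reasoning_tokens are both exactly 4000, so it looks like the entire output budget was consumed by hidden reasoning before any visible text was produced. Here is a small sketch of the check I use to detect this case (my own diagnostic, not from the docs, assuming the usage fields shown above):

const choice = completion.choices[0];
const details = completion.usage?.completion_tokens_details;

if (
  choice.finish_reason === 'length' &&
  details?.reasoning_tokens === completion.usage?.completion_tokens
) {
  // The whole output budget went to hidden reasoning, so
  // message.content comes back as an empty string.
  console.warn('Output budget exhausted by reasoning; raise max_completion_tokens.');
}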