I'm running o1-preview on a long series of requests and hit this `request_timeout` error. It seems strange because the input is only ~2000 tokens, and I believe the `max_tokens` argument has been deprecated for the o1 models. Does anyone know why it might hit this error? If nothing else, the error message needs updating, since its reference to `max_tokens` is misleading:
"error": {
"message": "Timed out generating response. Please try again with a shorter prompt or with `max_tokens` set to a lower value.",
"type": "internal_error",
"param": null,
"code": "request_timeout"
}
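For context, here's a minimal sketch of the kind of call that triggers it, using the openai Python SDK. The retry-on-`request_timeout` wrapper is just a stopgap I'm trying; the attempt count and backoff values are arbitrary:

```python
import time

from openai import OpenAI, APIError

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def complete_with_retry(prompt: str, max_attempts: int = 3) -> str | None:
    """Call o1-preview, retrying with backoff on the request_timeout error."""
    for attempt in range(1, max_attempts + 1):
        try:
            response = client.chat.completions.create(
                model="o1-preview",
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except APIError as e:
            # The failing responses come back with code == "request_timeout";
            # re-raise anything else, or if we're out of attempts.
            if getattr(e, "code", None) != "request_timeout" or attempt == max_attempts:
                raise
            time.sleep(2 ** attempt)  # back off: 2s, 4s, 8s, ...
```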
I'm presuming it might have filled the output token limit with reasoning tokens and therefore never reached the visible output (?), but I'm unsure (note: no `max_completion_tokens` parameter was set). Surely the API doesn't fail outright if the reasoning tokens hit the limit?
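To test that theory, my plan is to cap the completion explicitly and look at the usage breakdown on requests that do succeed. A sketch assuming the `usage.completion_tokens_details.reasoning_tokens` field the o1 docs describe (the 4096 cap is an arbitrary value for illustration):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-preview",
    messages=[{"role": "user", "content": "Summarize the history of the SI units."}],
    max_completion_tokens=4096,  # hard cap covering reasoning + visible output tokens
)

usage = response.usage
print("completion tokens:", usage.completion_tokens)
# For o1 models, completion_tokens_details breaks out the hidden reasoning tokens.
print("reasoning tokens:", usage.completion_tokens_details.reasoning_tokens)
# finish_reason == "length" would mean the cap was hit (possibly all on reasoning,
# leaving the visible content empty).
print("finish reason:", response.choices[0].finish_reason)
```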
If anyone has dealt with this error before, any insight would be much appreciated!