The OpenAI SDK will retry upon timeout. This doesn’t mean that the generation didn’t happen - it means you didn’t receive it.
As one might read in the readme.md..
Retries
Certain errors are automatically retried 2 times by default, with a short exponential backoff.
Connection errors (for example, due to a network connectivity problem), 408 Request Timeout, 409 Conflict,
429 Rate Limit, and >=500 Internal errors are all retried by default.
You can use the max_retries option to configure or disable retry settings:
from openai import OpenAI
# Configure the default for all requests:
client = OpenAI(
# default is 2
max_retries=0,
)
# Or, configure per-request:
client.with_options(max_retries=5).chat.completions.create(
messages=[
{
"role": "user",
"content": "How can I get the name of the current day in JavaScript?",
}
],
model="gpt-4o",
)
Thus: “more than sent” can actually mean, “the number that were sent; more than were exposed”.