The number of requests shown on the dashboard is higher than the number I actually sent through the OpenAI Python SDK.
After reviewing the logs, I found that a single request was processed twice, with two different request IDs (#resp_0c3eec25875b7f7900693d55f958ac819189b24770163e53ef, #resp_0f580796a1b3e86d00693d561aeb888194ac575af3406275ea), and both generated responses.
Certain errors are automatically retried 2 times by default, with a short exponential backoff.
Connection errors (for example, due to a network connectivity problem), 408 Request Timeout, 409 Conflict,
429 Rate Limit, and >=500 Internal errors are all retried by default.
You can use the max_retries option to configure or disable retry settings:
from openai import OpenAI

# Configure the default for all requests:
client = OpenAI(
    # default is 2
    max_retries=0,
)

# Or, configure per-request:
client.with_options(max_retries=5).chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "How can I get the name of the current day in JavaScript?",
        }
    ],
    model="gpt-4o",
)
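So "more than I sent" can really mean "every attempt the SDK made, including automatic retries", while your code only saw the calls it made explicitly. If the first attempt reached the server and was processed, but the client hit a timeout or connection error before the reply arrived, the retry would likely be handled as a brand-new request with its own ID. That would explain two request IDs and two generated responses for one call.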
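To make the double-counting concrete, here is a minimal, self-contained simulation. It does not call the OpenAI API or use the SDK; `FakeServer`, `SimulatedTimeout`, and `call_with_retries` are hypothetical names invented for this sketch, and the retry loop only approximates what a client with `max_retries` does (no backoff, no real network). It assumes the server finishes the work even when the reply is lost, which may or may not match what happened in your case.

```python
from dataclasses import dataclass, field


class SimulatedTimeout(Exception):
    """Raised by the fake server when its reply never reaches the client."""


@dataclass
class FakeServer:
    """Hypothetical stand-in for the API backend; records what it processes."""
    lose_first_n_replies: int = 1
    processed_ids: list = field(default_factory=list)

    def handle(self, prompt: str) -> str:
        # The server does the work and records a response ID every time...
        response_id = f"resp_{len(self.processed_ids) + 1}"
        self.processed_ids.append(response_id)
        # ...but the first few replies are lost on the way back.
        if len(self.processed_ids) <= self.lose_first_n_replies:
            raise SimulatedTimeout(response_id)
        return response_id


def call_with_retries(server: FakeServer, prompt: str, max_retries: int = 2):
    """Retry on timeout, up to max_retries extra attempts (no backoff here)."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return server.handle(prompt), attempts
        except SimulatedTimeout:
            if attempts > max_retries:
                raise


server = FakeServer(lose_first_n_replies=1)
response_id, attempts = call_with_retries(server, "hello", max_retries=2)

print("calls made by my code:", 1)
print("attempts made by the client:", attempts)
print("requests the server processed:", len(server.processed_ids))
print("response my code received:", response_id)
```

With retries disabled (`max_retries=0` in this sketch), the same lost reply surfaces as an exception instead of a silent second request, so the server-side count stays at one per call and your code has to decide whether to resend.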