API Dashboard Shows More Requests Than Sent via Python SDK

The number of requests shown on the dashboard is higher than the number I actually sent through the OpenAI Python SDK.

After reviewing the logs, I found that a single request was processed twice, with two different request IDs (#resp_0c3eec25875b7f7900693d55f958ac819189b24770163e53ef, #resp_0f580796a1b3e86d00693d561aeb888194ac575af3406275ea), and both generated responses.

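The OpenAI SDK retries automatically when a request times out. A timeout doesn't mean the generation didn't happen. It means your client didn't receive the result, so the SDK sent the same request again and the API processed it a second time.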
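Here is a minimal sketch of how one call can end up as two processed requests. It uses an in-memory stand-in, not the real API or the SDK's actual retry code; `FakeServer`, `send_with_retries`, and `lost_attempts` are hypothetical names made up for illustration.

```python
import itertools


class FakeServer:
    """Hypothetical stand-in for the API backend: it completes and logs every request it receives."""

    def __init__(self):
        self.processed = []            # what a dashboard would count
        self._ids = itertools.count(1)

    def handle(self, prompt):
        response_id = f"resp_{next(self._ids)}"
        self.processed.append(response_id)
        return response_id


def send_with_retries(server, prompt, max_retries, lost_attempts):
    """Send one logical request; the first `lost_attempts` replies never reach the client."""
    for attempt in range(max_retries + 1):
        response_id = server.handle(prompt)  # the server does the work either way
        if attempt < lost_attempts:
            continue  # client sees a timeout and retries
        return response_id
    raise TimeoutError("all attempts timed out")


server = FakeServer()
received = send_with_retries(server, "hello", max_retries=2, lost_attempts=1)
print("client received:", received)           # client received: resp_2
print("server processed:", server.processed)  # server processed: ['resp_1', 'resp_2']
```

Your code made one call and got one response back, but the server side logged two, each with its own ID.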

As the SDK's README.md explains:


Retries

Certain errors are automatically retried 2 times by default, with a short exponential backoff.
Connection errors (for example, due to a network connectivity problem), 408 Request Timeout, 409 Conflict,
429 Rate Limit, and >=500 Internal errors are all retried by default.

You can use the max_retries option to configure or disable retry settings:

from openai import OpenAI

# Configure the default for all requests:
client = OpenAI(
    # default is 2
    max_retries=0,
)

# Or, configure per-request:
client.with_options(max_retries=5).chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "How can I get the name of the current day in JavaScript?",
        }
    ],
    model="gpt-4o",
)

So "more than sent" really means: the dashboard counts every attempt the SDK made, retries included, while your code only saw the responses that actually made it back.