How can I accurately know the number of times the web_search tool is used in a single Responses API call?

I’m using the built-in web_search tool in OpenAI, but I’m confused about the pricing.
My use case is with gpt-4o-mini (no reason), the tool choice is set to required, and I’m calling it via the Responses API.

How can I accurately determine how many times the web_search tool is used in a single API call?
In the response, there’s an output field containing a ResponseFunctionWebSearch object — does each of these objects represent one web_search usage?

Since the number of web_search calls is the main cost driver in my scenario (rather than token usage), I really need to understand precisely how to count web_search invocations.

Also, does the search_context_size level affect the cost of a single search call?

res = await async_openai_client.responses.create(
    model="gpt-4o-mini",
    instructions="xxx",
    tools=[WebSearchToolParam(type="web_search", search_context_size="low")],
    input=messages,
    tool_choice="required",
    temperature=0,
    store=False,
    truncation="auto",
)

Welcome to the developer community, @ch_l.

Yes. Each ResponseFunctionWebSearch output item (type "web_search_call") corresponds to exactly one use of the web_search tool. For a search, the exact query string used for that call is contained in the item's ActionSearch object.
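To make the counting concrete, here is a minimal sketch that walks the output list of a response and counts the web_search_call items whose action is a search. The helper names (`_get`, `count_web_searches`) are my own, not part of the SDK, and counting an item that has no action as a search is an assumption on my part (a conservative upper bound).

```python
from typing import Any, Iterable


def _get(obj: Any, name: str, default: Any = None) -> Any:
    """Read a field from either a dict or an attribute-style object."""
    if isinstance(obj, dict):
        return obj.get(name, default)
    return getattr(obj, name, default)


def count_web_searches(output_items: Iterable[Any]) -> int:
    """Count output items that look like billable web searches.

    Hypothetical helper, not an SDK function. An item is counted when its
    type is "web_search_call" and its action type is "search".
    Assumption: an item without an action is also counted.
    """
    count = 0
    for item in output_items:
        if _get(item, "type") != "web_search_call":
            continue  # messages, reasoning items, other tool calls
        action = _get(item, "action")
        if action is None or _get(action, "type") == "search":
            count += 1
    return count
```

With the request from the question, this would be called as `count_web_searches(res.output)`. The billing dashboard remains the authoritative number; this is only a per-response estimate.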

Quoting from docs:

Output and citations

Model responses that use the web search tool will include two parts:

  • A web_search_call output item with the ID of the search call, along with the action taken in web_search_call.action. The action is one of:
    • search, which represents a web search. It will usually (but not always) include the search query and domains which were searched. Search actions incur a tool call cost (see pricing).

There are two components that contribute to the cost of using the web search tool:

  1. Tool calls
  2. Search content tokens.

Tool calls are billed per 1,000 calls, according to the tool version and model type. The billing dashboard and invoices will report these line items as “web search tool calls.”

Search content tokens are tokens retrieved from the search index and fed to the model alongside your prompt to generate an answer. These are billed at the model’s input token rate, unless otherwise specified. This is affected by search_context_size.

For gpt-4o-mini and gpt-4.1-mini with the web search non-preview tool, search content tokens are charged as a fixed block of 8,000 input tokens per call.
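As a rough illustration of how the two components add up for gpt-4o-mini, here is a small sketch. The function name and parameters are hypothetical, and both rates are inputs you must take from the current pricing page; the only number taken from the quoted docs is the fixed block of 8,000 input tokens per call.

```python
# Fixed search content block per call, per the docs quoted above
# (gpt-4o-mini / gpt-4.1-mini with the non-preview web search tool).
FIXED_SEARCH_CONTENT_TOKENS = 8_000


def estimate_web_search_cost(
    num_search_calls: int,
    price_per_1k_calls: float,
    input_price_per_1m_tokens: float,
) -> float:
    """Estimate the web search part of a bill, in the currency of the rates.

    Hypothetical helper. Covers only the two web search components
    (tool calls + fixed search content tokens); your own prompt tokens
    and the model's output tokens are billed separately.
    """
    tool_call_cost = num_search_calls / 1_000 * price_per_1k_calls
    content_tokens = num_search_calls * FIXED_SEARCH_CONTENT_TOKENS
    content_cost = content_tokens / 1_000_000 * input_price_per_1m_tokens
    return tool_call_cost + content_cost
```

For example, `estimate_web_search_cost(3, price_per_1k_calls=10.0, input_price_per_1m_tokens=0.15)` uses placeholder rates, not quoted prices, and sums the cost of 3 tool calls and 24,000 input tokens.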

Thank you. I see.

And what about web search preview? I use gpt-4o-mini-search-preview, and the return value contains no information indicating how many times search has been used. Does this mean that each request is definitely counted as only one search?

If you are using a special Chat Completions search model, there should be only one search charge possible per model call.

A Chat Completions search model has no internal tool loop that could make multiple search calls, and any ability of an intermediary search agent to write several queries for a single user request is not exposed. You can think of it more like automatic RAG based on the context, although it is not documented technically.

You are simply always billed one “search” fee for every API call.
