OpenAI charging too much for web searches?

I’m using the new OpenAI Agents SDK to build an agent which uses the web search tool.
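
Roughly this setup, as a minimal sketch (assuming the Python Agents SDK; the names and prompt here are placeholders, not my actual code):

```python
# Minimal sketch of the agent setup, assuming the Python Agents SDK
from agents import Agent, Runner, WebSearchTool

agent = Agent(
    name="Research assistant",
    instructions="Answer the question, searching the web when needed.",
    tools=[WebSearchTool(search_context_size="medium")],  # "medium" is the default size
)

result = Runner.run_sync(agent, "What changed in the latest OpenAI pricing?")
print(result.final_output)
```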

You can see that I’ve done 21 web searches so far, for which I’ve been charged over $2.

According to their pricing documentation this costs $35 per 1k tool calls, so that should give me a total of about $0.74.

Am I missing something?

3 Likes

$0.035 per usage x 21 = $0.735.

$0.035 per call, times three internally iterated tool usages per response (or three parallel tool calls), times 21 = $2.205, which is what you show.
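
Spelled out (documented per-call price, with an assumed threefold multiplier):

```python
# Sanity check: documented price vs. what the usage page shows
per_call = 35.00 / 1000            # "medium" gpt-4o web search: $35 per 1k calls
searches = 21

print(per_call * searches)         # ≈ 0.735 -> what the pricing table predicts
print(per_call * searches * 3)     # ≈ 2.205 -> roughly the $2+ actually billed
```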

The context loading may be making multiple search tool calls before it arrives at what it considers a satisfactory answer. Or it may be that every input triggers some kind of internet search step, even if that step is external and AI-powered and that AI decides an actual search is not necessary. Neither behavior is described anywhere in the documentation.

Continued use of the model, even for requests that shouldn't trigger an internet search, racks up mounting search bills.

BUG 1

I made just one inquiry on Chat Completions with the web search model. The usage page shows “web search tool calls - gpt4o, med” at $0.105: confirming a price per use 3x higher than documented.

The web search seems to be external context placement out of the AI’s control.
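
For reference, the single request was essentially this (a sketch; the exact prompt is immaterial, and web_search_options is the documented way to set the search context size on the -search-preview models):

```python
# Sketch of the single Chat Completions request behind that $0.105 line item
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-search-preview",
    web_search_options={"search_context_size": "medium"},  # "medium" is the default
    messages=[{"role": "user", "content": "What is the current weather in Paris?"}],
)
print(response.choices[0].message.content)
```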

BUG 2

The Chat Completions search model also has the complete text of the file_search tool (for vector stores) placed into its tool context, even though it cannot use vector stores or any tools other than functions, and the Playground does not allow it any tools. It is the version of the tool that ChatGPT uses, citing automatic inclusions, not the Assistants version of file search. Seemingly more tokens burned on a distraction.

Full tools listing of gpt-4o-search-preview-2025-03-11 on Chat Completions:
# Tools

## file_search

// Tool for browsing the files uploaded by the user. To use this tool, set the recipient of your message as `to=file_search.msearch`.
// Parts of the documents uploaded by users will be automatically included in the conversation. Only use this tool when the relevant parts don't contain the necessary information to fulfill the user's request.
// Please provide citations for your answers and render them in the following format: `〖{message idx}:{search idx}†{source}〗`.
// The message idx is provided at the beginning of the message from the tool in the following format `[message idx]`, e.g. [3].
// The search index should be extracted from the search results, e.g. # 〖13†Paris†4f4915f6-2a0b-4eb5-85d1-352e00c125bb〗refers to the 13th search result, which comes from a document titled "Paris" with ID 4f4...

namespace file_search {

// Issues multiple queries to a search over the file(s) uploaded by the user and displays the results.
// You can issue up to five queries to the msearch command at a time. However, you should only issue multiple queries when the user's question needs to be decomposed / rewritten to find different facts.
// In other scenarios, prefer providing a single, well-designed query. Avoid short queries that are extremely broad and will return unrelated results.
// One of the queries MUST be the user's original question, stripped of any extraneous details, e.g. instructions or unnecessary context. However, you must fill in relevant context from the rest of the conversation to make the question complete. E.g. "What was their age?" => "What was Kevin's age?" because the preceding conversation makes it clear that the user is talking about Kevin.
// Here are some examples of how to use the msearch command:
// User: What was the GDP of France and Italy in the 1970s? => {"queries": ["What was the GDP of France and Italy in the 1970s?", "france gdp 1970", "italy gdp 1970"]} # User's question is copied over.
// User: What does the report say about the GPT4 performance on MMLU? => {"queries": ["What does the report say about the GPT4 performance on MMLU?"]}
// User: How can I integrate customer relationship management system with third-party email marketing tools? => {"queries": ["How can I integrate customer relationship management system with third-party email marketing tools?", "customer management system marketing integration"]}
// User: What are the best practices for data security and privacy for our cloud storage services? => {"queries": ["What are the best practices for data security and privacy for our cloud storage services?"]}
// User: What was the average P/E ratio for APPL in Q4 2023? The P/E ratio is calculated by dividing the market value price per share by the company's earnings per share (EPS).  => {"queries": ["What was the average P/E ratio for APPL in Q4 2023?"]} # Instructions are removed from the user's question.
// REMEMBER: One of the queries MUST be the user's original question, stripped of any extraneous details, but with ambiguous references resolved using context from the conversation. It MUST be a complete sentence.
type msearch = (_: {
queries?: string[],
}) => any;

} // namespace file_search

You are trained on data up to October 2023.
2 Likes

The cost for the web search tool is based on input and output tokens plus a cost per session.

The dashboard screenshot you are sharing displays the number of searches only.

The table detailing the costs by ‘Search context size’ shows the cost per tool invocation. What you are likely missing is the number of tokens consumed by the model using the tool before providing the final answer.

You can find these values using the legacy dashboard via the activity tab.
I hope this helps!

The cost is not “per session” (like a code interpreter session that lasts an hour before re-billing you).

What is shown in the screenshot is the usage page line item for the separate use of the web search tool, which is distinctly metered in “calls”.

| Model | Search context size | Cost |
| --- | --- | --- |
| gpt-4o or gpt-4o-search-preview | low | $30.00 / 1k calls |
| | medium (default) | $35.00 / 1k calls |
| | high | $50.00 / 1k calls |
| gpt-4o-mini or gpt-4o-mini-search-preview | low | $25.00 / 1k calls |
| | medium (default) | $27.50 / 1k calls |
| | high | $30.00 / 1k calls |

Two usages:

But for those two calls, you receive billing for six usages:

You are not wrong, but the answer is incomplete. Below are the input and output token costs for the search tool from the same pricing page.

On the activity dashboard we can see the token usage, but not per tool.

When specifically using the gpt-4o-search-preview model, a model name specially added by OpenAI to provide internet search on Chat Completions without needing a tool specification (since you cannot use internal tools there, only your own functions), you’d be paying the model’s token pricing for the amount of search-result input context that allows the AI to answer.

However, that input consumption should appear under the model name’s token billing on the usage page, not under the search tool call invocation billing, which gives you no clue about any token count or consumption (not to mention that Chat Completions does not bill the internet search tokens as input…).

The token consumption will also be returned in the API’s token usage object.
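
For example, a minimal sketch (standard Chat Completions response fields; the search-result context lands in ordinary prompt tokens):

```python
# The usage object on the response reports the token consumption
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-search-preview",
    messages=[{"role": "user", "content": "Summarize today's top tech headline."}],
)
u = response.usage
print(f"prompt={u.prompt_tokens} completion={u.completion_tokens} total={u.total_tokens}")
```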

The “overbilling” per call seems far too consistent (an exact multiple of the per-call price) to be a result of dynamic web search size.

I also used the search tool once with the “Medium” setting and once with the “High” setting. The cost was $0.07 for one Medium usage and $0.06 for one High usage.

According to the pricing table, the cost per Medium usage should be $0.035 per call (since it’s $35 per 1K calls), and for High, it should be $0.05 per call (since it’s $50 per 1K calls).

The amount charged on the Usage page roughly matches the expected cost when using High, but it does not match at all when using Medium.

The cost for using Medium once was double the expected price from the pricing table.
At that time, I was only using the built-in search tool as a tool.

1 Like

In your opinion, does this mean that if ResponseFunctionWebSearch is used and the response returns “usage=Usage(requests=1, input_tokens=499, output_tokens=449, total_tokens=948)”,

we’d use the $35/1k calls number for the tool, plus $2.50/million input and $10/million output?

It doesn’t look like there’s a way in the Agents SDK to change the model used for search, so I’m assuming it defaults to 4o, not mini.
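
If so, the estimate for that single run would be something like this (a sketch, assuming gpt-4o token rates and the documented medium per-call price apply):

```python
# Hypothetical cost estimate for that run, assuming gpt-4o token rates and the
# documented $35/1k "medium" web-search call price
input_tokens, output_tokens, searches = 499, 449, 1

cost = (
    input_tokens * 2.50 / 1_000_000       # $2.50 per 1M input tokens
    + output_tokens * 10.00 / 1_000_000   # $10.00 per 1M output tokens
    + searches * 35.00 / 1_000            # $35 per 1k web search calls
)
print(f"${cost:.4f}")  # ≈ $0.0407
```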

2 Likes

I thought I’d again investigate the billing placed on a “clean” project by a single use of this model on Chat Completions.

(Asking the search model itself on “high” is useless):

But the intention was to get billed.

The cost impact now seems repaired - the expected $0.05 per “high” gpt-4o call:

The legacy usage page gives the model’s token consumption:

(two screenshots of the legacy usage page)

and the Playground’s report of that call:
(screenshot of the Playground usage report)

Comparing favorably to the billing a week before of two calls:
(screenshot of the earlier billing)

(A day with other billing issues, like complimentary data sharing tokens being billed on Responses.)