@SearchUmbrella You may also want to see my answer on this related post: GPT 5 100x token usage compared to GPT 4.1 - #6 by JuhanaT.
- GPT-5 and o-series: web_search_preview is $0.01 per web call plus token usage for the retrieved web content at the model’s token rates.
- GPT-4.1 family (includes 4o and 4.1-mini): web_search_preview is $0.025 per web call, and the web content tokens are included (not billed separately).
https://platform.openai.com/docs/pricing#built-in-tools
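The two billing rules can be sketched as a small helper (a sketch using the prices quoted above at the time of writing; `WEB_SEARCH_SURCHARGE` and `surcharge` are my own names — check the pricing page for current rates):

```python
# Per-call web search surcharge by model family, as described above.
# Prices are as of this post; see the pricing page for current values.
WEB_SEARCH_SURCHARGE = {
    "gpt-5": 0.01,    # o-series too; retrieved web content is ALSO billed as input tokens
    "gpt-4.1": 0.025, # 4o and 4.1-mini too; retrieved content tokens are included
}

def surcharge(family: str, calls: int) -> float:
    """Dollar surcharge for `calls` web search calls (token costs are separate)."""
    return calls * WEB_SEARCH_SURCHARGE[family]
```

So for gpt-5 models the total is surcharge + token cost (including the retrieved content), while for the gpt-4.1 family the higher surcharge already covers the content tokens.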
I did some tests and posted the demo code here: GitHub - erikmalk/web-search-demo: Simple CLI to show the difference between gpt-5 and gpt-4.1 web search cost formulas and how tokens are billed differently
This simple CLI runs web search with various models and calculates the final cost. There are options to set the query, model(s), max_tool_calls, and search_context_size. It prints the details for each query and a summary of the cost breakdown. e.g., I ran it with “what are some family friendly activities/events happening this weekend in NYC?”
```
Running gpt-4.1-mini...
Web search calls: 1
Usage (raw): {
  "input_tokens": 347,
  "input_tokens_details": {
    "cached_tokens": 0
  },
  "output_tokens": 540,
  "output_tokens_details": {
    "reasoning_tokens": 0
  },
  "total_tokens": 887
}
Answer: This upcoming weekend in ...
Token cost: $0.001003
Token cost formula: ((uncached_input_tokens × inputPrice) + (cached_input_tokens × cachedPrice) + (output_tokens × outputPrice)) / 1,000,000
Where: uncached_input_tokens = input_tokens − cached_tokens
  = ((347 × 0.4) + (0 × 0.1) + (540 × 1.6)) / 1,000,000
  = ($0.000139 + $0.000000 + $0.000864) = $0.001003
Web search surcharge: $0.025000
Total estimated cost: $0.026003
```
...
```
Running gpt-5-mini...
Web search calls: 1
Usage (raw): {
  "input_tokens": 13102,
  "input_tokens_details": {
    "cached_tokens": 8448
  },
  "output_tokens": 1060,
  "output_tokens_details": {
    "reasoning_tokens": 256
  },
  "total_tokens": 14162
}
Answer:
Great — here are family-friendly ...
Token cost: $0.003495
Token cost formula: ((uncached_input_tokens × inputPrice) + (cached_input_tokens × cachedPrice) + (output_tokens × outputPrice)) / 1,000,000
Where: uncached_input_tokens = input_tokens − cached_tokens
  = ((4654 × 0.25) + (8448 × 0.025) + (1060 × 2.0)) / 1,000,000
  = ($0.001164 + $0.000211 + $0.002120) = $0.003495
Web search surcharge: $0.010000
Total estimated cost: $0.013495
```
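Plugging the gpt-5-mini numbers from this run into the same formula (my own arithmetic, using the rates shown in the printout) reproduces the total — this run also exercises the cached-token discount term, unlike the gpt-4.1-mini one:

```python
# gpt-5-mini run above: 13102 input tokens (8448 cached), 1060 output tokens,
# at $0.25 / $0.025 / $2.00 per 1M tokens, plus the $0.01 web call surcharge
uncached = 13102 - 8448  # = 4654
tokens = (uncached * 0.25 + 8448 * 0.025 + 1060 * 2.00) / 1_000_000
total = tokens + 1 * 0.01  # one web search call
print(f"tokens=${tokens:.6f}  total=${total:.6f}")
# tokens=$0.003495  total=$0.013495
```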
```
Summary
-------
Model         Calls  Token Cost ($)  Web Cost ($)  Total ($)
------------  -----  --------------  ------------  ---------
gpt-4.1       1      $0.005534       $0.025000     $0.030534
gpt-4.1-mini  1      $0.001003       $0.025000     $0.026003
gpt-5         1      $0.019883       $0.010000     $0.029883
gpt-5-mini    1      $0.003495       $0.010000     $0.013495
```
We can see that in this case gpt-5/gpt-5-mini were cheaper than gpt-4.1/gpt-4.1-mini, because the cost of the web content tokens was less than the difference in the per-call surcharge. That may not always be the case, depending on reasoning_effort, max_tool_calls, search_context_size, topic, etc.
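As a rough break-even check under these rates (my own back-of-the-envelope arithmetic from the summary totals, ignoring output and cached-token differences): the gap in this run buys quite a lot of extra gpt-5-mini input tokens before the advantage flips:

```python
# Totals from the summary above
gpt41_mini_total = 0.026003
gpt5_mini_total = 0.013495

margin = gpt41_mini_total - gpt5_mini_total  # dollars per query in gpt-5-mini's favor
# At $0.25 per 1M uncached input tokens, roughly how many extra web-content
# input tokens could gpt-5-mini consume before losing its advantage here?
extra_tokens = margin / (0.25 / 1_000_000)
print(round(extra_tokens))  # ~50k tokens
```

A longer search (more tool calls, higher search_context_size, heavier reasoning) can easily eat into that margin, which is why the cheaper model depends on the workload.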
I recommend testing with a new query each time, because reusing the same query can produce cached web tokens that may not represent your actual use case. The 8448 cached tokens showed up for me even on fresh queries, so that is probably the internal web-search agent's system prompt + tool definitions, which seem to be cached almost every time.