$0.035 per usage x 21 = $0.735.
At $0.035 with three search tool calls per response (either three internal iterations or three parallel calls), that becomes $0.735 x 3 = $2.205, which is what you show.
The context loading may be making multiple search tool calls until it has what it considers a satisfactory answer. And/or every input gets some sort of internet search billed, even if the decision is made by an external, AI-powered step and that AI decides an actual search is not necessary. Neither behavior is described anywhere.
Continued use of the model for inputs that would not call for an internet search still runs up mounting search bills.
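The arithmetic above can be checked with a short sketch. The $0.035 rate and the 21 requests are the figures quoted above; the function name and the `calls_per_request` multiplier are only my illustration of the "three calls per response" theory, not anything the usage page exposes.

```python
from decimal import Decimal

# Per-call web search rate quoted above (assumed to be the expected price).
PRICE_PER_SEARCH_CALL = Decimal("0.035")

def search_cost(requests: int, calls_per_request: int = 1) -> Decimal:
    """Web search charge for `requests` API calls, assuming each one
    triggers `calls_per_request` billed search tool calls."""
    return PRICE_PER_SEARCH_CALL * requests * calls_per_request

print(search_cost(21))     # 0.735  (one billed search per request)
print(search_cost(21, 3))  # 2.205  (three billed searches per request)
```

Decimal is used so the dollar amounts compare exactly instead of picking up float rounding.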
BUG 1
I made just one request on Chat Completions with the web search model. The usage page shows “web search tool calls - gpt4o, med” at $0.105.
Confirmed: the price per use is 3x the expected $0.035.
The web search seems to be external context placement, outside the AI’s control.
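To turn a billed amount back into an implied number of search calls, a minimal sketch follows. It assumes the $0.035 per-call rate is the correct baseline; the helper name is hypothetical and the result only says how many calls the charge corresponds to, not what actually happened server-side.

```python
from decimal import Decimal

def implied_search_calls(billed: str, price_per_call: str = "0.035") -> Decimal:
    """Number of search tool calls a billed amount corresponds to,
    assuming a flat per-call price (an assumption, not a documented fact)."""
    return Decimal(billed) / Decimal(price_per_call)

print(implied_search_calls("0.105"))  # single request from BUG 1
print(implied_search_calls("2.205"))  # total discussed earlier in the thread
```

For the single request above this comes out to 3, and for the $2.205 total it comes out to 63, i.e. 21 requests at three calls each.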
BUG 2
The Chat Completions search model also has the complete text of the file search tool (the one for vector stores) placed into its tools context, even though it cannot use vector stores or any tools other than functions, and the model does not allow any tools via the Playground. It is the version of the tool that ChatGPT uses, which mentions automatic inclusions, not the Assistants version of file search. Seemingly more tokens burned on a distraction.
# Tools
## file_search
// Tool for browsing the files uploaded by the user. To use this tool, set the recipient of your message as `to=file_search.msearch`.
// Parts of the documents uploaded by users will be automatically included in the conversation. Only use this tool when the relevant parts don't contain the necessary information to fulfill the user's request.
...
Full tools listing of gpt-4o-search-preview-2025-03-11 on Chat Completions:
# Tools
## file_search
// Tool for browsing the files uploaded by the user. To use this tool, set the recipient of your message as `to=file_search.msearch`.
// Parts of the documents uploaded by users will be automatically included in the conversation. Only use this tool when the relevant parts don't contain the necessary information to fulfill the user's request.
// Please provide citations for your answers and render them in the following format: `〖{message idx}:{search idx}†{source}〗`.
// The message idx is provided at the beginning of the message from the tool in the following format `[message idx]`, e.g. [3].
// The search index should be extracted from the search results, e.g. # 〖13†Paris†4f4915f6-2a0b-4eb5-85d1-352e00c125bb〗refers to the 13th search result, which comes from a document titled "Paris" with ID 4f4...
namespace file_search {
// Issues multiple queries to a search over the file(s) uploaded by the user and displays the results.
// You can issue up to five queries to the msearch command at a time. However, you should only issue multiple queries when the user's question needs to be decomposed / rewritten to find different facts.
// In other scenarios, prefer providing a single, well-designed query. Avoid short queries that are extremely broad and will return unrelated results.
// One of the queries MUST be the user's original question, stripped of any extraneous details, e.g. instructions or unnecessary context. However, you must fill in relevant context from the rest of the conversation to make the question complete. E.g. "What was their age?" => "What was Kevin's age?" because the preceding conversation makes it clear that the user is talking about Kevin.
// Here are some examples of how to use the msearch command:
// User: What was the GDP of France and Italy in the 1970s? => {"queries": ["What was the GDP of France and Italy in the 1970s?", "france gdp 1970", "italy gdp 1970"]} # User's question is copied over.
// User: What does the report say about the GPT4 performance on MMLU? => {"queries": ["What does the report say about the GPT4 performance on MMLU?"]}
// User: How can I integrate customer relationship management system with third-party email marketing tools? => {"queries": ["How can I integrate customer relationship management system with third-party email marketing tools?", "customer management system marketing integration"]}
// User: What are the best practices for data security and privacy for our cloud storage services? => {"queries": ["What are the best practices for data security and privacy for our cloud storage services?"]}
// User: What was the average P/E ratio for APPL in Q4 2023? The P/E ratio is calculated by dividing the market value price per share by the company's earnings per share (EPS). => {"queries": ["What was the average P/E ratio for APPL in Q4 2023?"]} # Instructions are removed from the user's question.
// REMEMBER: One of the queries MUST be the user's original question, stripped of any extraneous details, but with ambiguous references resolved using context from the conversation. It MUST be a complete sentence.
type msearch = (_: {
queries?: string[],
}) => any;
} // namespace file_search
You are trained on data up to October 2023.