I’m seeing that prompts that invoke the web_search_preview tool often execute seemingly the same search over and over again, taking significant time and consuming a significant number of tokens. This snippet from the OpenAI dashboard log contains an example that included 20 searches and cost 89,060 tokens (resp_6873940bb6f0819387d8af90957c029602ce8bca57714ba7):
This doesn’t happen 100% of the time, but I’m seeing it on roughly 70-80% of runs. When it does not occur, this exact prompt costs around 2,000 tokens.
When this occurs, I also receive the full streamed response over and over again; it appears to be identical each time.
To reproduce:
I have only seen this happen under the following circumstances. I’m hesitant to say that these are requirements, but this is my anecdotal experience:
- I’ve only seen this happen on prompts that have at least one additional tool configured, in addition to the web_search_preview tool.
- I’ve only seen this happen on responses that invoke a Web Search. It doesn’t appear to happen when other tools are invoked.
Run a prompt that is likely to trigger a web search, e.g. asking for sports scores. In this case, I used “What’s the score of the Mets game?”
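For reference, here is a minimal sketch of the kind of request that reproduces this for me. The model name and the extra `get_team_schedule` function tool are placeholders (my real app uses a different tool and model), but the shape is the same: `web_search_preview` plus at least one additional tool, with streaming enabled.

```python
from openai import OpenAI

client = OpenAI()

stream = client.responses.create(
    model="gpt-4o",  # placeholder; I see this with my configured model
    tools=[
        {"type": "web_search_preview"},
        {
            # One additional tool alongside web search (placeholder schema)
            "type": "function",
            "name": "get_team_schedule",
            "description": "Look up a team's upcoming schedule.",
            "parameters": {
                "type": "object",
                "properties": {"team": {"type": "string"}},
                "required": ["team"],
            },
        },
    ],
    input="What's the score of the Mets game?",
    stream=True,
)

for event in stream:
    # When the issue occurs, the same web search call and the same output
    # text stream repeatedly instead of once.
    print(event.type)
```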