Repeated, apparently identical Web Searches on a single prompt

I’m seeing that prompts that invoke the web_search_preview tool often execute what seems to be the same search over and over again, taking significant time and costing significant tokens. This snippet from the OpenAI dashboard log shows an example that included 20 searches and cost 89,060 tokens (resp_6873940bb6f0819387d8af90957c029602ce8bca57714ba7):

This doesn’t happen every time, but I’m seeing it in roughly 70-80% of runs. When it does not occur, this exact prompt costs around 2,000 tokens.

When this occurs, I also get the full streamed response over and over again - it appears that it is identical each time.
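To put a number on the repetition, something like the sketch below could tally the search calls in a response. It assumes the response's output items are available as plain dicts, that search calls carry the type "web_search_call", and that the query text sits under an "action" / "query" field when it is present at all. The helper name summarize_web_searches and the sample items are made up for illustration and are not taken from the logged response.

```python
from collections import Counter


def summarize_web_searches(output_items):
    """Return (total search calls, {query: count} for queries seen more than once)."""
    queries = Counter()
    total = 0
    for item in output_items:
        # Skip anything that is not a web search call (messages, function calls, ...).
        if item.get("type") != "web_search_call":
            continue
        total += 1
        # Assumption: the query text lives under action.query; fall back if absent.
        action = item.get("action") or {}
        queries[action.get("query", "<unknown>")] += 1
    repeated = {query: count for query, count in queries.items() if count > 1}
    return total, repeated


# Hypothetical output items, invented for this example.
sample_items = [
    {"type": "web_search_call", "action": {"query": "Mets score"}},
    {"type": "web_search_call", "action": {"query": "Mets score"}},
    {"type": "message"},
]
total, repeated = summarize_web_searches(sample_items)
print(total, repeated)
```

A non-empty second value means the same query was issued more than once within a single response.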

Reproduce:

I have only seen this happen under the following circumstances. I’m hesitant to say that these are requirements, but this is my anecdotal experience:

  • I’ve only seen this happen on prompts that have at least one other tool configured alongside the web_search_preview tool.
  • I’ve only seen this happen on responses that invoke a Web Search. It doesn’t appear to happen when other tools are invoked.

Run a prompt that is likely to trigger a web search, e.g. asking for sports scores. In this case, I used “What’s the score of the Mets game?”
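A minimal sketch of that setup with the OpenAI Python SDK's Responses API might look like the following. The second tool (get_local_time) is a hypothetical placeholder, since the post does not say which other tool was configured, and the model is the one named later in the thread. The run_once function needs the openai package and an API key, so it is defined here but not called.

```python
def build_request(prompt):
    """Build keyword arguments for a Responses API call with two tools configured."""
    return {
        "model": "gpt-4o-mini",  # model named later in the thread
        "input": prompt,
        "tools": [
            {"type": "web_search_preview"},
            {
                # Hypothetical extra tool; the original post does not name one.
                "type": "function",
                "name": "get_local_time",
                "description": "Return the user's local time.",
                "parameters": {
                    "type": "object",
                    "properties": {},
                    "required": [],
                    "additionalProperties": False,
                },
                "strict": True,
            },
        ],
    }


def run_once(prompt):
    """Send the request and return (number of web search calls, total tokens)."""
    from openai import OpenAI  # imported lazily so the sketch loads without the SDK

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.responses.create(**build_request(prompt))
    searches = [item for item in response.output if item.type == "web_search_call"]
    return len(searches), response.usage.total_tokens


request = build_request("What's the score of the Mets game?")
print([tool["type"] for tool in request["tools"]])
```

Calling run_once several times with the same prompt and comparing the returned counts would be one way to check the 70-80% figure.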


It is for reasons like this that I have given up on the web search tool.

Common issues:

(1) A response that repeats the same URL. Example: when prompted to provide information about crime in Nice, France, it will link repeatedly to the same URL for the same article in Le Monde, once for each crime, instead of discussing the different crimes and providing the URL once. It needs to apply reasoning in order to consolidate repeated URLs.

(2) My Favorite: It will provide URLs to sites that require a paid subscription (that you don’t have) like FT (Financial Times).

Maybe the OpenAI Staff has some insight on this…

The model in the screenshot is gpt-4o-mini.

That is the cause. The model is weak, degrades with large context input such as a RAG result, and turns one year old this week.

If you want function calling that doesn’t fall into a loop of writing the same thing, sending to a tool recipient again and again because it is repeating a pattern, you’ll need to upgrade the brains.