I hate to write something so open-ended, but the feature seems to be completely broken.
Using the basic example from the docs, with my own prompt and setting temperature to 0:
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.responses.create(
    model="gpt-4o",
    temperature=0,
    tools=[{"type": "web_search_preview"}],
    input="What is the current weather at Stevens Pass?",
)
print(response)
Here is the “get code” of a successful Prompts Playground call that answers the question. The main differences are a system message (or the instructions parameter), a user location being passed (where you could even try being exact), and a low search context size.
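For anyone reading without the Playground export, those differences can be sketched as request arguments. This is only a sketch under assumptions: `build_request` is a hypothetical helper, the instructions text and the location values are placeholders of mine, and `user_location` and `search_context_size` are the options of the `web_search_preview` tool as I understand the docs.

```python
def build_request(question, city, region, country="US"):
    """Assemble keyword arguments for client.responses.create()."""
    return {
        "model": "gpt-4o",
        "temperature": 0,
        # Placeholder system-style guidance; wording is an assumption.
        "instructions": "Answer only with figures found in the search results, and cite them.",
        "tools": [
            {
                "type": "web_search_preview",
                # Approximate location hint used to localize the search.
                "user_location": {
                    "type": "approximate",
                    "country": country,
                    "region": region,
                    "city": city,
                },
                # "low" keeps the amount of retrieved context small.
                "search_context_size": "low",
            }
        ],
        "input": question,
    }


# Placeholder location values, chosen only for illustration.
request = build_request(
    "What is the current weather at Stevens Pass?",
    city="Skykomish",
    region="Washington",
)

# To send it (needs an API key and network access):
# from openai import OpenAI
# response = OpenAI().responses.create(**request)
# print(response.output_text)
```

Keeping the arguments in a plain dict makes it easy to log exactly what was sent when comparing runs.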
Thank you! I am kind of getting more consistent annotations with your setup, but still not 100% of the time. Sometimes annotations come back blank.
On the other hand, most of the data is still wrong.
Even in your screenshot you can see a low of 47 and a high of 58 for Monday, which are not in any forecast. stevenspass.com never had these numbers anywhere on the page.
The other annotation it sometimes adds is “weather.gov”, where it picks a point forecast that is even colder than what is on stevenspass.com.
So, to see whether it gives me the correct current weather and simply hallucinates the rest, I asked it for just the high and low temps for Tuesday, March 25, 2025.
It keeps coming up with very bad answers that are, again, nowhere on the page that it cites.
Ex:
[AnnotationURLCitation(end_index=236, start_index=128, title='', type='url_citation',
url='https://www.j2ski.com/snow_forecast/United_States/Stevens_Pass_weather.html?utm_source=openai')]
The weather forecast for Stevens Pass, WA, on Tuesday, March 25, 2025,
indicates a high of 67°F (20°C) and a low of 45°F (7°C).
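One way to make this kind of mismatch checkable rather than eyeballed is to pull out the cited span and compare its numbers with the text of the cited page. The sketch below is my own illustration, not part of the API: `Citation`, `cited_span` and `unsupported_numbers` are hypothetical names, the page text is a made-up stand-in for whatever you fetch yourself, and it assumes `start_index` and `end_index` are character offsets into the answer text.

```python
import re
from dataclasses import dataclass


@dataclass
class Citation:
    url: str
    start_index: int
    end_index: int


def cited_span(answer_text, citation):
    """Return the part of the answer that the citation claims to support."""
    return answer_text[citation.start_index:citation.end_index]


def numbers_in(text):
    """Find integer and decimal numbers as strings, in order of appearance."""
    return re.findall(r"\d+(?:\.\d+)?", text)


def unsupported_numbers(answer_text, citation, page_text):
    """List numbers in the cited span that never appear on the cited page."""
    on_page = set(numbers_in(page_text))
    return [n for n in numbers_in(cited_span(answer_text, citation)) if n not in on_page]


answer = "Forecast: a high of 67°F (20°C) and a low of 45°F (7°C)."
citation = Citation(url="https://example.com/forecast", start_index=0, end_index=len(answer))
page = "Tuesday: high 34°F, low 25°F"  # made-up page text for the example

print(unsupported_numbers(answer, citation, page))
```

This is a crude check (it ignores units and rounding), but an empty list is a reasonable minimum bar before trusting a cited figure.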
It is possible that the tool-calling AI makes just one call that returns search results with short descriptions, and does not explore the pages themselves to get more useful info.
It is probably better to write a weather function of your own on a free (or cheap commercial) API and have it ready, rather than paying for a much slower search and exploration of whatever the web randomly offers.
I spent a bit of time making one, which uses AI and a ranked fuzzy geolocation search API for it to choose from, to give lat/lon forecasts.
What the AI would emit:
Enter city name to search (e.g., 'Miami'): SFO
Enter state for geo AI (e.g., 'FL'):
Enter any location notes: Just north of Daly City by the bay
Weather Forecast for San Francisco, San Francisco County County, California:
Current Weather at 2025-03-24 06:15 UTC:
Temperature: 50.7 °F
Wind: 4.8 mp/h at 292°
Cloud Cover: 2%
Precipitation: 0.0 inch
Temperature every 6 hours:
2025-03-24 00:00 UTC: 64.8
2025-03-24 06:00 UTC: 50.9
2025-03-24 12:00 UTC: 48.8
2025-03-24 18:00 UTC: 62.8
2025-03-25 00:00 UTC: 78.1
2025-03-25 06:00 UTC: 63.5
2025-03-25 12:00 UTC: 56.5
2025-03-25 18:00 UTC: 68.4
2025-03-26 00:00 UTC: 74.4
2025-03-26 06:00 UTC: 64.2
2025-03-26 12:00 UTC: 57.5
2025-03-26 18:00 UTC: 59.3
2025-03-27 00:00 UTC: 61.3
2025-03-27 06:00 UTC: 57.1
2025-03-27 12:00 UTC: 54.1
2025-03-27 18:00 UTC: 56.2
2025-03-28 00:00 UTC: 57.7
2025-03-28 06:00 UTC: 52.8
2025-03-28 12:00 UTC: 52.1
2025-03-28 18:00 UTC: 56.8
2025-03-29 00:00 UTC: 56.4
2025-03-29 06:00 UTC: 51.6
2025-03-29 12:00 UTC: 48.9
2025-03-29 18:00 UTC: 55.0
2025-03-30 00:00 UTC: 56.6
2025-03-30 06:00 UTC: 53.4
2025-03-30 12:00 UTC: 48.0
2025-03-30 18:00 UTC: 52.6
Precipitation Probability every 6 hours:
2025-03-24 00:00 UTC: 0
2025-03-24 06:00 UTC: 0
2025-03-24 12:00 UTC: 0
2025-03-24 18:00 UTC: 0
2025-03-25 00:00 UTC: 0
2025-03-25 06:00 UTC: 0
2025-03-25 12:00 UTC: 0
2025-03-25 18:00 UTC: 0
2025-03-26 00:00 UTC: 0
2025-03-26 06:00 UTC: 0
2025-03-26 12:00 UTC: 1
2025-03-26 18:00 UTC: 3
2025-03-27 00:00 UTC: 9
2025-03-27 06:00 UTC: 17
2025-03-27 12:00 UTC: 16
2025-03-27 18:00 UTC: 19
2025-03-28 00:00 UTC: 16
2025-03-28 06:00 UTC: 11
2025-03-28 12:00 UTC: 6
2025-03-28 18:00 UTC: 8
2025-03-29 00:00 UTC: 7
2025-03-29 06:00 UTC: 5
2025-03-29 12:00 UTC: 3
2025-03-29 18:00 UTC: 2
2025-03-30 00:00 UTC: 9
2025-03-30 06:00 UTC: 26
2025-03-30 12:00 UTC: 48
2025-03-30 18:00 UTC: 59
You don’t see AI output here; this is the output of code, an injection ready for the AI.
(It is also ready for plotting by the hour in a UI.)
So now you’re up to date and don’t have to ask for a while.
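The “every 6 hours” blocks above can be produced from an hourly series with a few lines. This is a minimal sketch, not the code I ran: `every_n_hours` and `forecast_block` are hypothetical names, the timestamps are assumed to be ISO strings in UTC, and the sample values are made up.

```python
from datetime import datetime


def every_n_hours(times, values, step=6):
    """Keep only the samples that fall on whole hours divisible by `step`."""
    rows = []
    for stamp, value in zip(times, values):
        moment = datetime.fromisoformat(stamp)
        if moment.hour % step == 0 and moment.minute == 0:
            rows.append(f"{moment:%Y-%m-%d %H:%M} UTC: {value}")
    return rows


def forecast_block(title, times, values, step=6):
    """Build one text block, ready to paste into a prompt."""
    header = f"{title} every {step} hours:"
    return "\n".join([header] + every_n_hours(times, values, step))


# Made-up hourly data covering 00:00 through 12:00.
times = [f"2025-03-24T{hour:02d}:00" for hour in range(13)]
values = [60 + hour for hour in range(13)]

print(forecast_block("Temperature", times, values))
```

The full hourly series stays available for plotting, while the model only gets the thinned-out text.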
I’m trying to create an agent that performs my morning web browsing for me, and so far the web search is failing horribly in every aspect. It is not pulling this morning’s news. Even when I specify the sites that I want it to search, it still produces results that I cannot find on those sites, even though the annotations list the specific pages where the information was supposedly found. So I think my conclusion still stands: the web search simply doesn’t work.
Interestingly, the same query passed to ChatGPT with Web Search produces results that can actually be found on the pages that it cites, but it completely ignores the words “today” and “yesterday” and returns results regardless of when they were published.
So I guess Web Search is slowly coming online, but OpenAI is defaulting to a “release-now-fix-bugs-later” approach, which may be OK at the experimental stage of these tools.
When using the API, web search returns data that doesn’t exist on the internet, and more specifically doesn’t exist on the pages that it cites in the annotations field.
The second bug is that it often doesn’t cite any websites, even though I can see that it performed a web search (the output contains 2 items, the first of which is “web_search_call”).
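To count how often that second case happens across runs, the output items can be checked mechanically. A rough sketch, assuming the output has been converted to plain dicts (for example with `response.model_dump()["output"]`) and that citations appear as `url_citation` annotations on the message content; `audit_web_search` is a hypothetical helper, and the sample data is made up.

```python
def audit_web_search(output_items):
    """Report whether a search ran and which URLs were cited."""
    searched = any(item.get("type") == "web_search_call" for item in output_items)
    cited_urls = []
    for item in output_items:
        if item.get("type") != "message":
            continue
        for part in item.get("content") or []:
            for annotation in part.get("annotations") or []:
                if annotation.get("type") == "url_citation":
                    cited_urls.append(annotation["url"])
    return {
        "searched": searched,
        "cited_urls": cited_urls,
        # True when a search happened but nothing was cited.
        "uncited_search": searched and not cited_urls,
    }


# Made-up output resembling the failure case: a search call, then an
# answer with no annotations at all.
sample_output = [
    {"type": "web_search_call", "status": "completed"},
    {
        "type": "message",
        "content": [{"type": "output_text", "text": "Some answer.", "annotations": []}],
    },
]

print(audit_web_search(sample_output))
```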
The model cannot directly access websites you specify.
All it has is a web search query tool that it can invoke, and it must go through that.
From the results, it can then enter a site, but it rarely seems willing to. Ultimately, the amount of exploring is out of your control, except for setting the search context size to “high”.
It happened and it is news, but it is not how CNN appears. OpenAI may want to cache results for a million developers who all find CNN a good source. All the citation links lead only to the site, not to an article page.
The results also seem to come from the model’s own knowledge from 2024, or from bad search engine results that are ranked more by topic than by timeliness.
The solution is still the same: use an AI completely under your control, with a search API you can observe, more advanced scrapers for the sites you care about, and functions you write yourself that the model actually follows.
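As a rough illustration of the “functions you write” part, here is a sketch of a function tool definition and a dispatcher for the call the model sends back. The tool name, its parameters and the stubbed `get_forecast` are hypothetical; the item shapes (`function_call` with `name`, `arguments` and `call_id`, answered by a `function_call_output`) follow the Responses API function-calling format as I understand it.

```python
import json

# Tool definition to pass in the `tools` list of a request.
WEATHER_TOOL = {
    "type": "function",
    "name": "get_forecast",
    "description": "Get the forecast for a latitude/longitude pair.",
    "parameters": {
        "type": "object",
        "properties": {
            "latitude": {"type": "number"},
            "longitude": {"type": "number"},
        },
        "required": ["latitude", "longitude"],
        "additionalProperties": False,
    },
    "strict": True,
}


def get_forecast(latitude, longitude):
    """Stub: replace the body with a call to the weather API you choose."""
    return {"latitude": latitude, "longitude": longitude, "note": "stub result"}


HANDLERS = {"get_forecast": get_forecast}


def handle_function_call(item):
    """Run the requested function and build the item to send back."""
    arguments = json.loads(item["arguments"])  # arguments arrive as a JSON string
    result = HANDLERS[item["name"]](**arguments)
    return {
        "type": "function_call_output",
        "call_id": item["call_id"],
        "output": json.dumps(result),
    }


# Made-up function call item, shaped like what the model would emit.
call = {
    "type": "function_call",
    "name": "get_forecast",
    "arguments": '{"latitude": 47.75, "longitude": -121.09}',
    "call_id": "call_123",
}

print(handle_function_call(call))
```

Because your own code produces the data, every number the model repeats can be traced to a response you can log and inspect.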