I’m working on a Chrome extension in which I pass the current tab’s URL in my prompt and then query about it.
I can see that the 4o model can read the URL content, while 4o-mini obviously can’t and hallucinates based on the URL itself.
The problem is that 4o also sometimes seems to have the same problem. How can I tailor my prompt to minimize the occurrence of this problem?
My prompt already includes text “Do not guess based solely on the name, URL, or external cues; rely strictly on the page contents.” This seems to help a bit, but it is not enough.
Currently it is not possible to do a web search or URL scrape using the API. So whatever results you have seen, they have been hallucinations or a fluke. Right now the only way to fetch/scrape contents from a URL using the API is if you perform your own web scraping using something like BeautifulSoup or an external service like Apify.
It’s scarily impressive how often 4o was able to create plausible brief summaries of a page without seeing anything besides the URl – enough that it had me fooled for a while.
I guess that says something deep and philosophical about how much of what we see in this world is redundant.