To my knowledge, the GPT-4 API does NOT have internet access.
I read a few posts on this forum that cover the solution – ask the API for a search term, search it using a specialized search function, and then return the data to the API.
Has anyone built a function like this? I have an idea of how to do it myself, but I would obviously prefer to use someone else’s pre-built GPT-4 API internet-access workaround.
The newest preview models extend function calling with parallel function calling, so the AI can request several calls in one turn.
One can provide the AI two functions:
Internet search
Use a search API from a major provider, or a code shim that extracts search results, to give the AI the top results with summaries and links.
Get page
Use a web scraping library to retrieve page contents. You can instruct the AI to keep requesting pages until it finds a good answer.
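As a rough sketch of how the two functions above could be described to the API and routed on your side: the schemas follow the Chat Completions `tools` shape, but the names `internet_search`, `get_page`, `dispatch`, and the `backend` callable (standing in for whatever search API or shim you choose) are my own placeholders, not something from a library.

```python
import json

# Tool schemas in the shape the Chat Completions `tools` parameter expects.
# The function names and descriptions are illustrative assumptions.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "internet_search",
            "description": "Search the web; returns top results with summaries and links.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search terms"},
                },
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_page",
            "description": "Retrieve the text contents of a web page.",
            "parameters": {
                "type": "object",
                "properties": {
                    "url": {"type": "string", "description": "Page to fetch"},
                },
                "required": ["url"],
            },
        },
    },
]


def internet_search(query, backend, max_results=5):
    """Format results from a caller-supplied search backend as text for the AI.

    `backend(query)` is assumed to return a list of dicts with
    'title', 'summary', and 'url' keys.
    """
    results = backend(query)[:max_results]
    lines = [
        f"{i}. {r['title']}\n   {r['summary']}\n   {r['url']}"
        for i, r in enumerate(results, start=1)
    ]
    return "\n".join(lines) if lines else "No results."


def dispatch(name, arguments_json, handlers):
    """Run the function the AI asked for and return a string to send back."""
    if name not in handlers:
        return f"Error: unknown function {name!r}"
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return f"Error: invalid arguments ({exc})"
    return handlers[name](**args)
```

For each tool call in the model's reply you would pass its function name and JSON argument string to `dispatch`, then return the result as a tool message. With parallel function calling, one reply can contain several such calls, so loop over all of them before calling the API again.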
The search results alone, with their many page descriptions, can be enough for the AI to answer questions about current events or fact-check them.
On the modern web, many pages are dynamically rendered, JavaScript-based, or actively block automated retrieval of their contents. This challenge can be partially overcome with tools like Selenium, which drive a real browser instead of fetching the document directly. Depending on how dedicated you are, you can also combine this with APIs (free or paid) for selected sites.
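One possible way to combine direct retrieval with a browser fallback is sketched below. It tries a plain HTTP fetch first and only starts Selenium when the static page yields very little text. The `min_chars` threshold and the helper names are assumptions of mine, and the Selenium path assumes Selenium 4 with a local Chrome install.

```python
import urllib.request
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def fetch_static(url, timeout=10):
    """Direct document retrieval with the standard library."""
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def fetch_rendered(url):
    """Let a headless Chrome render the page, then return its HTML."""
    # Imported lazily so the rest of the sketch works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()


def get_page(url, static=fetch_static, rendered=fetch_rendered, min_chars=200):
    """Try the cheap fetch first; fall back to the browser if text is thin."""
    try:
        text = html_to_text(static(url))
    except OSError:  # urllib's URLError and timeouts are OSError subclasses
        text = ""
    if len(text) >= min_chars:
        return text
    return html_to_text(rendered(url))
```

Pages that load content after the initial render may still need explicit waits in the browser path, which this sketch leaves out.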
Multi-turn functions can offer “get next page chunk”, “follow link”, etc. With some investment you can get better results than OpenAI's own browsing, because OpenAI respects robots.txt, OpenAI's browse technology is not something you can extend, and websites see the requests coming directly from OpenAI.
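A minimal sketch of the state such multi-turn functions might keep between calls is below. The `PageSession` class, its method names, and the default chunk size are hypothetical. The idea is only that your code remembers where the AI is in a page, so "next chunk" and "follow link" need few or no arguments.

```python
from urllib.parse import urljoin


class PageSession:
    """Remember the current page so the AI can read it piece by piece."""

    def __init__(self, url, text, links, chunk_size=2000):
        self.url = url
        self.text = text
        self.links = links  # hrefs as found on the page, possibly relative
        self.chunk_size = chunk_size
        self.offset = 0

    def next_chunk(self):
        """Return the next slice of page text plus a note on what is left."""
        if self.offset >= len(self.text):
            return "[end of page]"
        chunk = self.text[self.offset:self.offset + self.chunk_size]
        self.offset += len(chunk)
        remaining = len(self.text) - self.offset
        return f"{chunk}\n[{remaining} characters remaining]"

    def resolve_link(self, index):
        """Turn the AI's chosen link number into an absolute URL to fetch."""
        return urljoin(self.url, self.links[index])
```

Telling the AI how much text remains lets it decide whether another call is worth the tokens. A "follow link" handler would fetch `resolve_link(i)` and start a new `PageSession` for that page.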