What you are describing is functions.
You can offer the AI a function specification for an internet search engine. Give the function the parameters that your chosen engine can employ besides the query itself, such as a date range or recent_days, or result count requested.
tools = [{
"type": "function",
"function": {
"name": "internet_search",
"description": (
"Duckduckgo internet search API, returns top results to query\n"
"- Use search when information newer than pretraining is needed"
),
"parameters": {
"type": "object",
"properties": {
"query": {"type": "string",
"description": "long natural language allowed"},
"recent_days": {"type": "number",
"description": "how many days back from today"},
},
},
},
}]
Ask the AI about something like recent OpenAI announcements, and the AI will emit a function call for you to handle:
{
"id": "call_FkebOrc3a3nn352fajfoa3",
"function": {
"arguments": "{\n \"query\": \"OpenAI blog announcements March 2024\",\n \"recent_days\": 30\n}",
"name": "internet_search"
},
"type": "function"
}
(here, the AI told about it’s knowledge cutoff and today’s date is able to make wise decisions)
It is then up to you to handle and fulfill that search request, likely with a paid API call to another search service.
Further browsing of web pages is problematic, because of dynamic javascript-powered content that is nearly universal.