How to implement GPT4 API with internet access?

gorvgoyl · December 3, 2023, 5:51am

As the title says: what are some easy and economical ways to give the GPT-4 API internet access, so it can get the latest information about world events?

Would prefer a node.js solution if possible.

PS: Internet access is not yet provided via the API. The web version of ChatGPT does this by browsing with Bing.

itschoolhaker · December 3, 2023, 6:14am

You can send queries to Google or another search engine and get back a list of sites with their descriptions. You can also ask GPT to turn the user's question into a more precise search query first.
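A minimal sketch of those two pieces in TypeScript, under stated assumptions: `SearchResult`, `buildQueryRewriteMessages` and `formatSearchResults` are hypothetical names, and the result shape (title, URL, snippet) is only a guess at what a typical search API returns. The first function builds the messages for a "rewrite this as a search query" call; the second packs the hits into compact text that can be passed to the model.

```typescript
// Hypothetical shape of one search hit; adapt it to your search API's response.
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

// Messages for a preliminary model call that turns a question into a search query.
function buildQueryRewriteMessages(question: string): { role: string; content: string }[] {
  return [
    {
      role: "system",
      content:
        "Rewrite the user's question as one concise web search query. Reply with the query only.",
    },
    { role: "user", content: question },
  ];
}

// Pack the top hits into a short numbered text block for the model to read.
function formatSearchResults(results: SearchResult[], maxResults = 5): string {
  return results
    .slice(0, maxResults)
    .map((r, i) => `[${i + 1}] ${r.title}\n${r.url}\n${r.snippet}`)
    .join("\n\n");
}
```

Capping the number of hits with `maxResults` is one simple way to keep the prompt small.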

_j · December 3, 2023, 6:31am

First is to find a search API, unless you have a specific site you want crawled and indexed.

There’s a GitHub repo with a Python DuckDuckGo search method that doesn’t cost anything, as long as it still works. Bing APIs, APIs that pirate Google…

So then you have a function that can get URLs and search summaries.

Browsing arbitrary web pages is harder. You’re going to get tons of “Javascript required” if you just wget.

Beautiful Soup - HTML parser only; it does not run JavaScript
Selenium, Puppeteer (Pyppeteer), Playwright - browser controllers that load the page and run its JavaScript

ajit.singh · December 3, 2023, 2:52pm

To make sure I understand this right: are you suggesting that we use a third-party search API to return the top URLs and write a script to scrape those pages? Should we then use embeddings to search the scraped content?

Sorry if this is too obvious and I’m just not getting it.

_j · December 3, 2023, 4:39pm

You can’t embed the whole web unless you are Google.

You don’t have to “install” a third-party search API; you call the search API over the network from your AI function-handling code. Some shims that are already written can save time, though.

raymondyeh · December 3, 2023, 10:57pm

You could set up function calling.

This is how it might look:

Send the user message, which may contain a request to pull content from the internet, to the API.
The API responds with a function call to your “internet access” function.
Your app runs the “internet access” function and returns the results to the API as a function (tool) result message for further processing.

One example would be to use the Google search API:

The user message shows intent to search.
The OpenAI API asks for a search with a specific keyword.
Your software searches with that keyword and returns the results.
The model processes the search results and replies to the user based on them.
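The steps above can be sketched in TypeScript roughly as follows. This is an illustration under assumptions, not a definitive implementation: the message shapes are modeled on the Chat Completions `tools` format, while `web_search`, `SearchHit`, `toToolMessage` and `answerWithSearch` are hypothetical names. The `chat` and `search` callbacks are injected: `chat` is assumed to POST the messages plus `tools: [webSearchTool]` to the chat completions endpoint and return the assistant message, and `search` is assumed to wrap whatever search API you pick.

```typescript
// Message and tool-call shapes modeled on the Chat Completions "tools" format.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

interface AssistantMessage {
  role: "assistant";
  content: string | null;
  tool_calls?: ToolCall[];
}

type ChatMessage =
  | { role: "system" | "user"; content: string }
  | AssistantMessage
  | { role: "tool"; tool_call_id: string; content: string };

// Hypothetical result shape; adapt it to whatever your search API returns.
interface SearchHit {
  title: string;
  url: string;
  snippet: string;
}

// Tool definition to send in the request's `tools` array.
const webSearchTool = {
  type: "function",
  function: {
    name: "web_search",
    description: "Search the web for current information.",
    parameters: {
      type: "object",
      properties: { query: { type: "string", description: "Search keywords" } },
      required: ["query"],
    },
  },
} as const;

// Turn one tool call plus its search hits into the message sent back to the model.
function toToolMessage(call: ToolCall, hits: SearchHit[]): ChatMessage {
  return { role: "tool", tool_call_id: call.id, content: JSON.stringify(hits) };
}

// Loop: ask the model, run any requested searches, feed results back, repeat.
async function answerWithSearch(
  question: string,
  chat: (messages: ChatMessage[]) => Promise<AssistantMessage>,
  search: (query: string) => Promise<SearchHit[]>,
  maxRounds = 3
): Promise<string> {
  const messages: ChatMessage[] = [{ role: "user", content: question }];
  for (let round = 0; round < maxRounds; round++) {
    const reply = await chat(messages);
    messages.push(reply);
    // No tool call means the model answered directly.
    if (!reply.tool_calls || reply.tool_calls.length === 0) {
      return reply.content ?? "";
    }
    for (const call of reply.tool_calls) {
      const args = JSON.parse(call.function.arguments) as { query?: string };
      const hits = await search(args.query ?? question);
      messages.push(toToolMessage(call, hits));
    }
  }
  return "Sorry, I could not finish the search in time.";
}
```

The `maxRounds` cap is a design choice to stop the model from requesting searches indefinitely, which also bounds token spend.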

gorvgoyl · December 4, 2023, 11:09am

Thanks, folks. Now my next worry is how to save tokens. If I start parsing web pages, it will consume lots of tokens. Is there a way to fetch the information while using as few tokens as possible?

PS: I use Node.js, so I’d prefer a solution that works with that.

raymondyeh · December 4, 2023, 7:53pm

It depends on your use case. If you are talking about reading arbitrary webpages, you might want to use something like jsdom and return only the text content instead of the entire HTML document. Other solutions like Cheerio and Puppeteer can also work.
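To illustrate the "text only" idea without any dependency, here is a rough TypeScript sketch. It is a regex-based stand-in, not jsdom or Cheerio: `htmlToText` is a hypothetical helper, it decodes only a handful of common entities, and it will mishandle unusual markup that a real HTML parser would cope with.

```typescript
// Crude HTML-to-text reduction; a real parser (jsdom, Cheerio) is more robust.
function htmlToText(html: string, maxChars = 4000): string {
  const text = html
    // Drop script/style/noscript blocks together with their contents.
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Drop HTML comments.
    .replace(/<!--[\s\S]*?-->/g, " ")
    // Drop all remaining tags, keeping the text between them.
    .replace(/<[^>]+>/g, " ")
    // Decode a few common entities; &amp; goes last to avoid double decoding.
    .replace(/&nbsp;/g, " ")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, "&")
    // Collapse runs of whitespace.
    .replace(/\s+/g, " ")
    .trim();
  // Hard cap on length so one long page cannot blow up the prompt.
  return text.slice(0, maxChars);
}
```

For example, `htmlToText("<h1>Title</h1><p>Hello &amp; welcome</p>")` yields `Title Hello & welcome`. The `maxChars` default of 4000 is an arbitrary assumption; tune it to your token budget.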

gorvgoyl · December 5, 2023, 8:59am

Thanks. Any clever tips to further reduce token consumption, such as offloading the webpage text summarisation to some other, cheaper LLM service?

alfonsofr · December 5, 2023, 9:44am

You could tinker with the tiktoken Python package to count tokens, and with lossless compression of the text. See the tokenizer page (the tiktoken link is at the bottom of that page):
https://platform.openai.com/tokenizer
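For a quick budget check in Node.js without a tokenizer, a rough heuristic can stand in. The sketch below assumes about 4 characters per token, which is only a rule of thumb for English text and not an exact count; `estimateTokens` and `splitIntoChunks` are hypothetical helpers. Splitting page text into chunks like this is also one way to feed it piece by piece to a cheaper summarisation model.

```typescript
// Rule-of-thumb ratio for English text; an assumption, not a real tokenizer.
const CHARS_PER_TOKEN = 4;

// Very rough token estimate based on character count.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Split text on sentence boundaries into chunks of roughly maxTokens each.
// A single sentence longer than the budget is kept whole as its own chunk.
function splitIntoChunks(text: string, maxTokens: number): string[] {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  const sentences = (text.match(/[^.!?]+[.!?]*/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    if (current && current.length + 1 + sentence.length > maxChars) {
      chunks.push(current);
      current = sentence;
    } else {
      current = current ? current + " " + sentence : sentence;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

For exact counts, use a real tokenizer such as tiktoken instead of the character ratio.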

rsisto · March 12, 2025, 4:08pm

Hi! Great thread. How did you end up doing this? I need to inject more precise and up-to-date information into OpenAI API queries, and I’d like to know your approach so I can implement something similar myself.
