Can the Link Reader plugin be used with the OpenAI API?

Hey guys,

Can anyone let me know whether the Link Reader plugin can actually be used with the OpenAI API? I want to be able to send a request to the API along the lines of “Crawl Google for this article and give me the response in HTML format.” I was able to do that in the ChatGPT web application, but had no luck with the API.

I get that the two are different and that plugin functionality is not actually integrated into the API yet, but are there any workarounds for this? I checked out LangChain, but I don’t exactly see how I can do it.

Thank you!

You’ll need to code your own or use something like LangChain. There’s no way to use plugins through the API.

Here’s a LangChain Google Search example
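
Something along these lines is a minimal sketch, assuming the SerpAPI search tool and the classic `langchain` package layout; the query string is just a placeholder:

```python
# Minimal LangChain "search Google, then answer" agent sketch.
# Assumes OPENAI_API_KEY and SERPAPI_API_KEY are set in the environment.
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
tools = load_tools(["serpapi"])  # Google search via SerpAPI
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

# Placeholder query -- swap in whatever article you are after.
print(agent.run("Find a recent article about <topic> and summarize it as HTML."))
```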

You could extract the link and use wget on the command line.
It’s a pretty powerful tool to extract content from a website.

Or you could write a website scraper yourself. Ask ChatGPT to explain how.
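
If you do write one yourself, it can be quite short. Here’s a bare-bones sketch using requests and BeautifulSoup (the library choice and the URL are just assumptions; any HTTP client and HTML parser will do):

```python
# Fetch a page and pull out its paragraph text -- the simplest possible scraper.
import requests
from bs4 import BeautifulSoup

url = "https://example.com/some-article"  # placeholder URL
resp = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")
text = "\n".join(p.get_text(strip=True) for p in soup.find_all("p"))
print(text)
```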

So, if I am able to use LangChain Google Search, will ChatGPT be able to give me the response in HTML format? I’m a little confused about this. If I use LangChain Google Search instead of the OpenAI API, how will ChatGPT give me the response?

I tried that out already. I want this to be fully automated, so I want a crawler that goes through Google or Bing, searches for the articles I need, scrapes only the text I want, and converts it to HTML. Writing that code would be a big job, and I would also need to deal with rate limits, CAPTCHA defenses, and more.

Nah,

just learn how to write a scraper.

It could be done as a piped one-liner in bash, something like:

curl to the Google Search API | awk/sed | some loop running wget | awk/sed/some regex | curl to the OpenAI API …

Yeah if you aren’t using AI to summarize or process the data, and are just looking for a tool to extract HTML from pages in a Google Search result, then you just need a web scraper.

ChatGPT could help you write one if you are familiar with any programming/scripting languages. Or there’s a bunch of different Chrome Extensions and other web scraping software out there.

It’s not only about HTML, actually. I also want ChatGPT to read the articles and generate YAML code based on what the crawl finds. If it were only simple HTML, I would certainly have used some scraping with awk, sed, and wget.

There are multiple steps in this project, and I will definitely need ChatGPT to process the data later.

Can you share more about your overall goal? It sounds like a multi-step process may be what you need: a generic web scraper to get the HTML, then the OpenAI API to read/process the text.

So, I am a security guy, and as a first step I am trying to use ChatGPT to find newly released exploits/CVEs. Once it finds the newly released CVEs, I want it to use Google, GitHub, and other sources to find the exploit code and build a YAML template from it, which can then be sent to some scanners in my testing. But the issue here is that the OpenAI API does not have plugin functionality, so I am unable to get back the YAML I need.

I have been able to make ChatGPT retrieve the CVE IDs I need, but crawling through the API is not possible, hence the failure. I played around with scrapers to get the exploit code, but it doesn’t feel like an efficient process at all.

Can you share a link to an example chat?
It all seems doable with the API, but you’ll need to cobble together several pieces. Generally, URL-reading plugins just fetch the page content and include it in the prompt.
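
To sketch what that cobbling together could look like: fetch the page yourself, put the text into the prompt, and ask the model for the YAML. The URL, model name, and prompt wording below are placeholders, and it assumes the pre-1.0 `openai` Python client:

```python
# Roughly what a link-reader plugin does: fetch the page, put the text
# into the prompt, and let the model do the processing.
import openai  # pre-1.0 openai client (openai.ChatCompletion)
import requests
from bs4 import BeautifulSoup


def fetch_text(url: str) -> str:
    resp = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
    resp.raise_for_status()
    return BeautifulSoup(resp.text, "html.parser").get_text(separator="\n", strip=True)


page_text = fetch_text("https://example.com/cve-advisory")  # placeholder URL

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You turn vulnerability write-ups into YAML scanner templates."},
        {"role": "user", "content": f"Build a YAML template from this article:\n\n{page_text[:12000]}"},
    ],
)
print(response.choices[0].message.content)
```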

I feel like when you crawl docs for, say, programming libraries, embeddings would work very well too, especially since GPT-3/4’s training data is becoming outdated.

News should be added to the prompt, though.
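
For the docs/embeddings idea, here is a very rough sketch with the embeddings endpoint (the documents, model name, and the brute-force similarity search are all just assumptions for illustration):

```python
# Embed a few crawled documents, embed the question, and pick the closest
# document to include in the prompt.
import numpy as np
import openai  # pre-1.0 openai client

docs = [
    "Placeholder: crawled doc about library A ...",
    "Placeholder: crawled doc about library B ...",
]


def embed(texts):
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([item["embedding"] for item in resp["data"]])


doc_vecs = embed(docs)
query_vec = embed(["How do I configure library A?"])[0]

# Cosine similarity, then pick the best-matching doc to add to the prompt.
sims = doc_vecs @ query_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
print(docs[int(np.argmax(sims))])
```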

(https://chat.openai.com/share/be054ad5-6352-4f1f-8ece-1fe5068b8831)

Looks pretty evil to me haha 🙂

So what exactly is missing?

Haha. I want to receive this YAML code from the API so that I can use it in my Python script. But Link Reader does not work with the API, and I receive gibberish instead of the YAML code.

Looking at the requests made to the Link Reader plugin, you can see that it works in three steps.

First, there is a Google search.

Then it seems like ChatGPT decides that the second link in the response best matches the original request, and then the third as well.

Here you can find the plugin manifest and the OpenAPI definition, which tell ChatGPT how to handle the data.
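
For anyone who wants to poke at this themselves: ChatGPT plugins publish their manifest at a well-known path, and the manifest points to the OpenAPI definition. A quick way to inspect one (the domain below is a placeholder, not Link Reader’s real one):

```python
# ChatGPT plugins publish a manifest at /.well-known/ai-plugin.json;
# its "api.url" field points to the OpenAPI definition.
import requests

domain = "https://plugin.example.com"  # placeholder domain
manifest = requests.get(f"{domain}/.well-known/ai-plugin.json", timeout=30).json()

print(manifest["name_for_model"])
print(manifest["description_for_model"])  # what ChatGPT is told about the plugin
print("OpenAPI spec:", manifest["api"]["url"])
```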

But the prompt used by ChatGPT is something you need to figure out for yourself.

By the way, I don’t really see how ChatGPT gets the information about the YAML from the plugin. It could just as well be hallucinated for newer versions.

I don’t exactly get how ChatGPT created the YAML either, because I get a different YAML every single time I generate the response, and it seems to use a different link each time. I’m doubtful that Link Reader is the right plugin for this task; it seems to work in some cases but not all.

Like I said, it can be hallucinated. You could try asking more specifically, e.g. have the crawler find an example YAML and then add the necessary data…

I mean, did you try asking it to create such a YAML without crawling the web?