Plugins - Extracting context from web pages

Howdy, I’m working on a few plugins, including one that helps developers write code against a given set of documentation. I’m trying to find a good way to search web pages for relevant context, but I’m at a loss on how to approach it. I’d like to replicate the approach the wait-listed search model uses. So far I can fetch the correct pages and extract all the important text using Puppeteer, but I don’t know how to proceed from there. The wait-listed search model seems decently fast, and I can’t tell whether it computes embeddings every time it scans a page or whether OpenAI has a vector DB set up. I guess what I’m asking is: what is the best method for extracting context from web pages?

I wrote a howto on this last night.

See Building a ChatGPT Plugin: AI Web Surfer

Interesting, OK, so GPT-3.5 is the approach. Great design! This makes sense to me. I’ll update my code and give it a shot.
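
For anyone landing here later, below is a rough sketch of one way to narrow extracted page text down before handing it to a model such as GPT-3.5. It is not taken from the linked how-to and is not necessarily how the search model works. It assumes the page text has already been extracted (for example with Puppeteer), and it uses plain keyword overlap where embeddings could be swapped in. All function names (split_into_chunks, score_chunk, select_context, build_prompt) are made up for illustration, and the model call itself is left out.

```python
import re
from collections import Counter


def split_into_chunks(text: str, max_words: int = 200) -> list[str]:
    """Split page text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_chunk(query: str, chunk: str) -> int:
    """Count how often the query's distinct terms occur in the chunk.

    Deliberately crude (no stop-word removal, no stemming); an embedding
    similarity could replace this function without changing the rest.
    """
    counts = Counter(tokenize(chunk))
    return sum(counts[term] for term in set(tokenize(query)))


def select_context(query: str, page_text: str, max_words: int = 200, top_k: int = 3) -> list[str]:
    """Return up to top_k chunks with a non-zero score, best match first."""
    chunks = split_into_chunks(page_text, max_words)
    ranked = sorted(chunks, key=lambda c: score_chunk(query, c), reverse=True)
    return [c for c in ranked[:top_k] if score_chunk(query, c) > 0]


def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Assemble a prompt string; sending it to a model is left to the caller."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

Usage would look something like build_prompt(question, select_context(question, page_text)), with the result sent as the user message. Filtering first keeps the prompt small, which helps with both speed and the model's context limit.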
