Can one use an Open Assistant crawler that has indexed my site to create a single-site search with OpenAssistant GPT?
In other words, can I crawl my own site, or multiple sites, and feed the data to an Assistant so that someone visiting my site can search for blog and informational posts by specific keyword(s) in the assistant? I know Google can do this too.
On a grander scale, can I crawl a couple of specific widget websites and direct my sales funnel customers to whichever of those websites has the widget they need?
Personally, I’d recommend just using a Streamlit app in Python with LangChain to do all this. There’s no reason to limit yourself to what the “Assistants” API is capable of when you can build a solution that you will own 100% of AND one that (because of LangChain) means you’re not tethered to just one Cloud Provider like OpenAI. You’ll be able to switch it to Anthropic Claude by editing ONE line of code.
And the answer is “yes”: all this is possible, and not that difficult. There are LangChain-based apps that already provide all this. Sorry, I don’t remember the app names, but google “Langchain RAG Web App”, for example.
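To give a feel for the "search my crawled posts by keyword" part, here is a rough sketch using only the Python standard library. It is not LangChain and not a real RAG pipeline (no embeddings, no LLM); it just counts keyword hits per page. The names `tokenize`, `search`, and `PAGES`, and the example.com URLs, are made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(pages, query, top_k=3):
    """Rank pages by how often the query words occur in them.

    pages: dict mapping URL -> plain text of that page (hypothetical shape).
    Returns up to top_k (url, score) pairs, best match first.
    """
    query_terms = set(tokenize(query))
    results = []
    for url, text in pages.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            results.append((url, score))
    # Highest score first; ties broken alphabetically by URL.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results[:top_k]


# Hypothetical crawled data, standing in for a real index.
PAGES = {
    "https://example.com/blog/blue-widgets": "Blue widgets are our best selling widgets.",
    "https://example.com/blog/gadgets": "Gadgets for every budget.",
    "https://example.com/about": "About our widget company.",
}

if __name__ == "__main__":
    print(search(PAGES, "widgets"))
```

Exact word matching like this misses "widget" when you search for "widgets", which is the kind of gap an embedding-based RAG setup is meant to close.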
no prob. Here’s one from a good youtuber that’s recent:
The only thing you’d need to add to it is a crawler piece to load the RAG database. You can find that of course by googling “Web crawler in Python”.
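For what it's worth, the crawler piece can be quite small. Below is a minimal same-domain crawler sketch using only the standard library, assuming plain server-rendered HTML (no JavaScript rendering). `PageParser`, `fetch_html`, and `crawl` are illustrative names, not from any library, and a real crawler should also respect robots.txt and rate-limit its requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects visible text and link targets from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def fetch_html(url):
    """Download one page as text (assumes UTF-8; bad bytes are replaced)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=50):
    """Breadth-first crawl limited to the start URL's domain.

    Returns a dict mapping URL -> extracted page text, ready to be
    chunked and loaded into whatever RAG store you choose.
    """
    domain = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        parser = PageParser()
        parser.feed(html)
        pages[url] = " ".join(parser.text_parts)
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url  # absolute, no #fragment
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Calling `crawl("https://example.com/")` would return the page texts keyed by URL; the `fetch` parameter is there so you can swap in a different downloader or a stub for testing.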
We developed an AI salesman and crawled the Apify.com website daily, uploading the data to the OpenAI assistant. This process is automated through the Apify - OpenAI integration, which you may find useful. Detailed setup instructions can be found here. Disclaimer: I work at Apify.