I built an LLM-powered tool that can comprehend any website structure and extract the desired data in the preferred format

I got frustrated with the time and effort required to code and maintain custom web scrapers, so I built a more generic ML-based solution for data extraction from unstructured websites (and potentially other sources).

One of the killer use cases of GPT is reformatting information from any format X to any other format Y, so I leveraged that to understand websites and extract any data in the preferred format:

Landing Page and Demo
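
For readers wondering what "reformat from X to Y" could look like in practice, here is a minimal sketch. It is not the tool's actual implementation, which this post does not describe. The names `html_to_text`, `build_prompt`, `extract`, and the `complete` callable (standing in for whatever LLM client you use) are all hypothetical, and the prompt wording is just one plausible choice.

```python
import json
from html.parser import HTMLParser
from typing import Callable


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Reduce a page to plain text so the prompt stays small."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def build_prompt(page_text: str, fields: list) -> str:
    """Hypothetical prompt: ask for one JSON object with exactly these fields."""
    return (
        "Extract the following fields from the page text and reply with a "
        "single JSON object. Use null for any field that is not present.\n"
        f"Fields: {', '.join(fields)}\n\nPage text:\n{page_text}"
    )


def extract(html: str, fields: list, complete: Callable[[str], str]) -> dict:
    """Run the (assumed) LLM call and keep only the requested fields."""
    reply = complete(build_prompt(html_to_text(html), fields))
    data = json.loads(reply)
    if not isinstance(data, dict):
        raise ValueError("model did not return a JSON object")
    return {name: data.get(name) for name in fields}


# Stand-in for a real model call, so the sketch runs without network access.
def fake_complete(prompt: str) -> str:
    return '{"title": "Blue Widget", "price": "$42", "sku": null}'


page = (
    "<html><body><h1>Blue Widget</h1><p>Price: $42</p>"
    "<script>var x = 1;</script></body></html>"
)
result = extract(page, ["title", "price", "sku"], fake_complete)
```

Passing the model in as a plain callable keeps the parsing and validation testable on their own, independent of any particular provider.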

We’re currently fine-tuning the platform and would love to have some early adopters test it out and give us feedback. Let us know what you think!

Great idea! My company has been looking for this type of resource for years. I used your early access form and filled out a request. Looking forward to seeing the results you send! Thanks.

Thanks for the feedback. We’re onboarding users on a rolling basis while still fine-tuning and testing the platform for different use cases. You’ll hear back from us soon.

Interesting! How do you make sure GPT-3 (or any other LLM) reliably returns the same output and doesn’t make things up, e.g. when a field doesn’t exist or isn’t visible on the website?
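
One common safeguard for the "making things up" part, sketched here as an assumption rather than a description of how this tool works: after extraction, check that each returned value actually appears in the page text and null out anything that does not. The function names `normalize` and `drop_ungrounded` are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivial differences don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def drop_ungrounded(extracted: dict, page_text: str) -> dict:
    """Replace any value that cannot be found verbatim in the page with None."""
    haystack = normalize(page_text)
    checked = {}
    for field, value in extracted.items():
        if value is None:
            checked[field] = None
            continue
        needle = normalize(str(value))
        checked[field] = value if needle and needle in haystack else None
    return checked


page_text = "Blue Widget\nPrice:   $42"
model_output = {"title": "Blue Widget", "price": "$42", "sku": "BW-1001"}
grounded = drop_ungrounded(model_output, page_text)
```

A verbatim check like this only catches invented values. It would also reject legitimate values the model has reformatted (a date converted to ISO format, say), so it suits fields that are copied from the page rather than transformed.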

Interesting idea.
Is it open source? If not, will it be open source?

Open-sourcing parts of the tool like the CSS selector generator would be very cool.