I built an LLM-powered tool that can comprehend any website structure and extract the desired data in the preferred format

I got frustrated with the time and effort required to code and maintain custom web scrapers, so I built a more generic ML-based solution for data extraction from unstructured websites (and potentially other sources).

One of the killer use cases of GPT is reformatting information from any format X to any other format Y, so I leveraged that to understand websites and extract any data in the preferred format:

Landing Page and Demo
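To make the "format X to format Y" idea concrete, here is a minimal sketch of how such an extraction step could look. It is not the tool's actual implementation. The names `build_prompt`, `extract`, and `fake_complete` are hypothetical, and the LLM call is passed in as a plain function so that no particular provider API is assumed.

```python
import json
from typing import Callable, Dict, List, Optional


def build_prompt(page_text: str, fields: List[str]) -> str:
    """Ask the model to reformat raw page text into a fixed JSON shape."""
    field_list = ", ".join(fields)
    return (
        "Extract the following fields from the page text and reply with "
        f"a single JSON object with exactly these keys: {field_list}. "
        "Use null for any field that is not present.\n\n"
        f"Page text:\n{page_text}"
    )


def extract(
    page_text: str,
    fields: List[str],
    complete: Callable[[str], str],
) -> Dict[str, Optional[str]]:
    """Run the prompt through `complete` and keep only the requested keys."""
    raw = complete(build_prompt(page_text, fields))
    data = json.loads(raw)  # raises ValueError if the reply is not valid JSON
    return {name: data.get(name) for name in fields}


def fake_complete(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call, returning a canned reply."""
    return '{"title": "Blue Widget", "price": "$19.99", "sku": null}'


result = extract("Blue Widget - $19.99", ["title", "price", "sku"], fake_complete)
print(result)
```

Swapping `fake_complete` for a function that calls a real model would leave the rest unchanged; the fixed key list is what keeps the output in the preferred format.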

We’re currently fine-tuning the platform and would love to have some early adopters test it out and give feedback. Let us know what you think!


Great idea! My company has been looking for this type of resource for years. I used your early access form and filled out a request. Looking forward to seeing the results you send! Thanks.


Thanks for the feedback. We’re onboarding users on a rolling basis while still fine-tuning and testing the platform for different use cases. You’ll hear back from us soon.

Interesting! How do you make sure GPT-3 (or any other LLM) reliably returns the same output data and doesn’t make things up, e.g. if a field doesn’t exist or isn’t visible on the website?
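For illustration, one generic guard against made-up fields (an assumption on my part, not necessarily what this tool does) is to keep an extracted value only if it literally appears in the page text. The function name `drop_ungrounded` is hypothetical.

```python
from typing import Dict, Optional


def drop_ungrounded(
    record: Dict[str, Optional[str]], page_text: str
) -> Dict[str, Optional[str]]:
    """Replace any value that does not occur verbatim in the page with None.

    Comparison ignores case and collapses runs of whitespace.
    """
    haystack = " ".join(page_text.split()).lower()
    checked: Dict[str, Optional[str]] = {}
    for key, value in record.items():
        if value is None:
            checked[key] = None
            continue
        needle = " ".join(str(value).split()).lower()
        # Keep the value only when it is non-empty and found in the source.
        checked[key] = value if needle and needle in haystack else None
    return checked


page = "Blue Widget\n  Price: $19.99"
record = {"title": "Blue Widget", "price": "$19.99", "sku": "BW-001"}
grounded = drop_ungrounded(record, page)
print(grounded)
```

This only catches values that should be copied verbatim; anything the model is asked to normalize or summarize would need a different check.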

Interesting idea.
Is it open source? If not, will it be open source?


Open-sourcing parts of the tool like the CSS selector generator would be very cool.