GPT web browser method: I’m new, is there a reason why people haven’t thought of this?

I’ve noticed there’s not much discussion about web browser apps for GPT and other AIs. I have a theory that I’d like to share, even though I’m aware I might be off base.

My idea involves segmenting web pages by images and matching them against an accessibility grid. This approach could enhance how AIs interact with web content, potentially improving both functionality and user experience.
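
To make the idea a bit more concrete, here is a minimal sketch of one possible reading of it. It assumes the page has already been reduced to a list of accessibility nodes with pixel bounding boxes (for example from a browser's accessibility tree), and that a region picked out of a screenshot is represented by a point inside it. Every name here (`Node`, `build_grid_index`, `nodes_at`, the 100-pixel cell size) is hypothetical and made up for illustration. None of it is a real browser API.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A hypothetical accessibility node with a pixel bounding box."""
    role: str
    name: str
    x: int
    y: int
    width: int
    height: int


def grid_cells_for(node, cell_size):
    """Return the (row, col) grid cells that the node's box overlaps."""
    first_col = node.x // cell_size
    last_col = (node.x + node.width - 1) // cell_size
    first_row = node.y // cell_size
    last_row = (node.y + node.height - 1) // cell_size
    return {
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    }


def build_grid_index(nodes, cell_size=100):
    """Map each grid cell to the nodes whose boxes overlap that cell."""
    index = {}
    for node in nodes:
        for cell in grid_cells_for(node, cell_size):
            index.setdefault(cell, []).append(node)
    return index


def nodes_at(index, x, y, cell_size=100):
    """Look up the nodes whose boxes contain the pixel (x, y)."""
    cell = (y // cell_size, x // cell_size)
    return [
        n for n in index.get(cell, [])
        if n.x <= x < n.x + n.width and n.y <= y < n.y + n.height
    ]


# Made-up example data: two elements near the top of a page.
nodes = [
    Node("button", "Sign in", 820, 20, 120, 40),
    Node("textbox", "Search", 200, 20, 400, 40),
]
index = build_grid_index(nodes)
hits = nodes_at(index, 850, 30)  # a point inside the "Sign in" region
```

The grid only narrows down which nodes to check for a given screen location. The visual segmentation step, which would produce the regions in the first place, is left out and would need its own approach.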

What do you all think? Is this something worth exploring further?

I like this idea, actually. I’d pursue further research on this topic. It’s probably just very complex to set up, and I’m sure someone’s already working on it. Props to you for thinking about this. I wonder if Apple’s integration would have this feature, especially with the new announcement of the eye tracker.


What would be the difference, in your view? What are you doing with the accessibility grid that would be different from the LLM just calling, let’s say, the Chrome API and doing whatever it needs, or using things like Selenium or Puppeteer?
