So you have to teach them everything, like a procedure.

I have been working on a Selenium-based and compliant browser for GPT.

And it's quite simple: you design a module where the browser is always open.
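A rough sketch of what an "always open" module could look like, assuming Selenium's Python bindings and Chrome. `PersistentBrowser` and `driver_factory` are made-up names for illustration, not the actual project code:

```python
from typing import Any, Callable, Optional


class PersistentBrowser:
    """Keeps one browser instance alive across calls (hypothetical helper)."""

    def __init__(self, driver_factory: Optional[Callable[[], Any]] = None):
        # driver_factory is an assumption: it lets you swap in another driver.
        self._factory = driver_factory or self._default_factory
        self._driver: Any = None

    @staticmethod
    def _default_factory() -> Any:
        # Imported lazily so the class can be loaded without Selenium installed.
        from selenium import webdriver
        return webdriver.Chrome()

    @property
    def driver(self) -> Any:
        # Start the browser on first use, then keep reusing the same instance.
        if self._driver is None:
            self._driver = self._factory()
        return self._driver

    def open(self, url: str) -> str:
        """Navigate the already-open browser and return the page title."""
        self.driver.get(url)
        return self.driver.title

    def close(self) -> None:
        """Shut the browser down explicitly; it is never closed implicitly."""
        if self._driver is not None:
            self._driver.quit()
            self._driver = None
```

The point of the design is that every `open()` call goes to the same window, so whatever GPT asks for happens in a browser you can watch.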

You can technically see everything that happens.
But it's hard for you to track what's not…

That's it.
Think about it.

This sounds very cool. And dangerous :laughing:

So are you giving GPT the power to perform browser actions based on a goal? Or how does it work?

Do you mean like ChatGPT’s Browser mode?

With Playwright it is possible to take over an existing browser session, and this could be integrated with an AI agent to perform actions as needed. That is my preferred solution, as I don’t always know exactly in advance how a webpage will behave.
Not sure about Selenium though. I’d expect one writes the scripts as needed and spins up the browser instance.
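For the Playwright route, here is a minimal sketch of attaching to a browser that is already running. It assumes Chromium was started with `--remote-debugging-port=9222` and uses Playwright's `connect_over_cdp`. The action dict format and the helper names (`cdp_endpoint`, `apply_action`, `run_on_existing_browser`) are made up for illustration; a real agent would need its own schema and error handling.

```python
from typing import Any, Dict, List


def cdp_endpoint(host: str = "localhost", port: int = 9222) -> str:
    """Build the CDP URL of a Chromium started with --remote-debugging-port."""
    return f"http://{host}:{port}"


def apply_action(page: Any, action: Dict[str, str]) -> None:
    """Map one agent-chosen action (hypothetical dict schema) to a page call."""
    kind = action["kind"]
    if kind == "goto":
        page.goto(action["url"])
    elif kind == "click":
        page.click(action["selector"])
    elif kind == "fill":
        page.fill(action["selector"], action["text"])
    else:
        raise ValueError(f"unsupported action: {kind}")


def run_on_existing_browser(actions: List[Dict[str, str]], port: int = 9222) -> None:
    """Attach to the running browser and replay the given actions in it."""
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(cdp_endpoint(port=port))
        # Reuse the session's default context and first tab if one exists.
        context = browser.contexts[0]
        page = context.pages[0] if context.pages else context.new_page()
        for action in actions:
            apply_action(page, action)
```

Usage would be something like `run_on_existing_browser([{"kind": "goto", "url": "https://example.com"}])`, with the action list coming from whatever the agent decides at each step.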

@generalbadwolf is it one of these or another solution?

Yeah, this idea is now obsolete.
I've stopped working on it because of that.
In fact… I won't even touch text browsers any more.