Claude Computer-Use - Do you think OpenAI has something similar

bloodlinealphaDev · November 5, 2024, 9:25pm

I tested Anthropic’s (Claude) computer-use Demo last week and it is pretty interesting. Its limited, slow, and buggy but you can do some pretty cools things still.

Has anybody seen any news or videos hinting that openai but have something similar?

It is a really interesting concept and way to automate tasks or research. I know they have mentioned an o1 model that could think for weeks, months, etc… but I do not see how this would be useful without a feedback loop (google searches, data extraction, analysis) adding new data or context.

I was thinking if you could train a model on a series of videos, where you are performing a specific task, using an existing app (photoshop, google sheets, notion) and finetune it to perform the task the way you prefer, that would be extremely useful…

It could get to the point where you have an AI worker, running in a container, performing marketing, research, analysis, and other tasks. It is pretty crazy to think about.

platypus · November 5, 2024, 9:34pm

Regarding your last sentence - I did exactly this with just API calls - essentially different “agents” under the hood - one collecting market dynamics for a target company, another creating a cultural assessment, another collecting M&A news, etc (the context is private investment). There was no need for anything like what Anthropic demonstrated - ChatCompletions API with GPT-4o and feeding it right data sources to query and crawl was more than enough.

I don’t see why it wouldn’t be possible to swap out Claude and just use GPT-4o or o1 family of models. My feeling is that people are still trying to understand a very concrete use case, beyond just a cool demo. RPA has been around for a very long time now.

bloodlinealphaDev · November 6, 2024, 1:14am

That sounds great. I am more thinking of scenarios that are bit more complex such as using a specific app, like Canva, and getting it to create an infographic or slideshow. Or some type of multi-tool workflow where you use different tools that may not have built-in integrations.

For example, “help me verify these engineering calculations, then open AutoCAD and create an engineering diagram for {x} part”…

platypus · November 6, 2024, 8:57am

Yes, I’m sure there are some use cases out there like that. What immediately comes to my mind is accessibility use cases - for people that may have problems with mobility this would be super useful.

bloodlinealphaDev · November 6, 2024, 1:32pm

Ya especially when combined with voice control and/or voice feedback. That is another great use case.

Topic		Replies	Views
Feature Request: First-class Object References for Function Calls Feedback api , feature-request , community-feedback	3	748	December 2, 2023
Seeking Ideas for creating a New AI Tool Based on OpenAI API - Your Suggestions are Welcome! Community chatgpt	11	2815	October 20, 2024
ChatGPT goes Multimodal! Sound and vision is rolling out on ChatGPT Community chatgpt , multimodal	34	12331	December 10, 2023
What are you creating using the OpenAI API? Community project , api , development	11	282	November 18, 2024
GPT-4-Vision Interesting Uses and Examples Thread (2023) Community gpt-4-vision	24	11744	April 22, 2024

Claude Computer-Use - Do you think OpenAI has something similar

Related topics