ido
March 23, 2025, 10:21pm
1
I am using computer-use-preview-2025-03-11 to run a browser agent, both with:
No matter how I try to instruct it to use the browser’s back function, it never sends a back action.
I can see the back action is defined in the official Python example (here), but I have never been able to make it trigger.
Besides the official Python example, I couldn’t find a list of supported actions, and the documentation page seems to be broken:
https://platform.openai.com/docs/assistants/tools-computer-use
Has anyone successfully triggered the back action? Is there a workaround or an updated list of supported actions?
Did you manage to solve it? I think I’m having a similar issue. In my case I’m using the Agents SDK and the AsyncComputer implementation.
ido
May 11, 2025, 4:38pm
3
Sort of. I was able to prompt around it with a hack.
In my system prompt, I instructed the model to use a keyboard shortcut whenever it intends to go back, and in my own code I mapped that shortcut to the actual browser back action.
Here is the relevant part of the prompt and the start of the loop:
If unsure, take a screenshot once before proceeding.
Do not repeat actions that have no visible effect.
Available browser actions:
click, double_click, move, drag, scroll, type, keypress, wait, goto, back, forward, screenshot.
**Keyboard Shortcuts:**
This web browser is running on ${osName}.
Use OS-specific shortcuts. For example:
On macOS, "CMD" should be used where applicable.
On macOS if the user asks you to go back (to the previous page), use a combination of Cmd + [ keys.
`;
let previousResponseId;
let messages = [{ role: "system", content: initialSystemText }];
while (true) {
let userInput;
if (instructions.length > 0) {
userInput = instructions.shift();
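The shortcut mapping described above could be sketched roughly as follows. This is not the code from the post, only a minimal illustration. It assumes a keypress action arrives shaped like `{ type: "keypress", keys: ["CMD", "["] }` and that `page` is a hypothetical browser handle exposing an async `goBack()` method (Playwright's `Page` has one). The helper names `isBackShortcut` and `handleAction` are made up for this example.

```javascript
// Sketch: treat the keyboard shortcut named in the system prompt as a "back" action.
// Assumption: keypress actions look like { type: "keypress", keys: ["CMD", "["] }.
const BACK_SHORTCUTS = [
  ["cmd", "["],         // macOS shortcut from the system prompt
  ["meta", "["],        // same shortcut if the model names the key "Meta" (assumption)
  ["alt", "arrowleft"], // usual back shortcut on Windows/Linux browsers (assumption)
];

// Lower-case and sort the keys so that order and casing do not matter.
function normalizeKeys(keys) {
  return keys.map((k) => String(k).toLowerCase()).sort();
}

// True when the action is a keypress whose keys match one of the back shortcuts.
function isBackShortcut(action) {
  if (!action || action.type !== "keypress" || !Array.isArray(action.keys)) {
    return false;
  }
  const pressed = normalizeKeys(action.keys).join("+");
  return BACK_SHORTCUTS.some(
    (combo) => normalizeKeys(combo).join("+") === pressed
  );
}

// Hypothetical dispatcher: go back in browser history instead of pressing keys.
// `page` is assumed to expose an async goBack() method.
async function handleAction(page, action) {
  if (action.type === "back" || isBackShortcut(action)) {
    await page.goBack();
    return "back";
  }
  return "unhandled"; // let the regular action handling deal with everything else
}
```

With this shape, the loop would call `handleAction(page, action)` before its normal keypress handling and skip the key press whenever it returns `"back"`.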
That’s great, thanks for sharing!
I’ve shared my solution, which uses function tools, in a different post. I don’t know why I can’t attach links, so just look for the thread “Extending AsyncComputer methods in Agents SDK”. I’m attaching a picture of the response as well.
Hope it helps!