I have been trying to run the computer-use-preview model via the OpenAI SDK, and I have read the documentation.
At the moment it says:
from openai import OpenAI
client = OpenAI()
response = client.responses.create(
    model="computer-use-preview",
    tools=[{
        "type": "computer_use_preview",
        "display_width": 1024,
        "display_height": 768,
        "environment": "browser"  # other possible values: "mac", "windows", "ubuntu"
    }],
    input=[
        {
            "role": "user",
            "content": "Check the latest OpenAI news on bing.com."
        }
        # Optional: include a screenshot of the initial state of the environment
        # {
        #     type: "input_image",
        #     image_url: f"data:image/png;base64,{screenshot_base64}"
        # }
    ],
    reasoning={
        "generate_summary": "concise",
    },
    truncation="auto"
)
print(response.output)
For my task I wanted to start with a screenshot as the initial state. So, naturally, I uncommented the code and tried to run it, but got an error about an invalid type. I was confused and went straight to the repo with examples: simple_cua_loop.py
However, that gave me no idea how to start the computer-use-preview model with both a text prompt and my own screenshot.
In the end I figured out the solution, and the correct request looks like this:
# screenshot_base64 is the base64-encoded PNG of the initial screen
response = client.responses.create(
    model="computer-use-preview",
    tools=[{
        "type": "computer_use_preview",
        "display_width": 1024,
        "display_height": 768,
        "environment": "windows"  # other possible values: "mac", "browser", "ubuntu"
    }],
    input=[
        {
            "role": "user",
            "content": "run demo app"
        },
        {
            # The image goes inside a user message's content list,
            # not as a bare item in the input list
            "role": "user",
            "content": [{
                "type": "input_image",
                "image_url": f"data:image/png;base64,{screenshot_base64}"
            }]
        }
    ],
    reasoning={
        "generate_summary": "concise",
    },
    truncation="auto"
)
print(response.output)
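For completeness, here is a minimal sketch of how the input list above could be built from raw screenshot bytes. The helper names encode_screenshot and build_initial_input are my own (hypothetical, not part of the OpenAI SDK), and the sketch assumes the screenshot is a PNG. It only prepares the data and makes no API call.

```python
import base64


def encode_screenshot(png_bytes: bytes) -> str:
    """Turn raw PNG bytes into a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def build_initial_input(prompt: str, png_bytes: bytes) -> list:
    """Build the input list: a text message plus a message holding the screenshot."""
    return [
        # Plain text prompt as the first user message
        {"role": "user", "content": prompt},
        # Screenshot wrapped in a user message's content list
        {
            "role": "user",
            "content": [
                {"type": "input_image", "image_url": encode_screenshot(png_bytes)}
            ],
        },
    ]
```

With this, the request can be given input=build_initial_input("run demo app", png_bytes), where png_bytes is read from wherever you saved your screenshot, for example open("screenshot.png", "rb").read() (the file name is just a placeholder).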
I hope someone finds this comment useful, because when I needed it I found zero posts about using computer-use-preview from OpenAI.