New tools for building agents: Responses API, web search, file search, computer use, and Agents SDK

Today, we released our first set of tools to help you accelerate building agents. These building blocks help you design and scale the complex orchestration logic agents require, and they let agents interact with tools that make them truly useful.

Introducing the Responses API

The Responses API is a new API primitive that combines the best of both the Chat Completions and Assistants APIs. It’s simpler to use and includes built-in tools provided by OpenAI that execute tool calls and add results automatically to the conversation context. As model capabilities continue to evolve, we believe the Responses API will provide a more flexible foundation for developers building agentic applications.
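As a rough sketch of that simpler shape, here is a minimal call (the model name and prompt are illustrative; assumes the `openai` Python package and an `OPENAI_API_KEY` in the environment):

```python
import os

MODEL = "gpt-4o"  # illustrative; any Responses-capable model

def ask(prompt: str) -> str:
    from openai import OpenAI  # deferred so the sketch can be read without the SDK

    client = OpenAI()
    # `input` takes a plain string here instead of a `messages` array; built-in
    # tool calls and their results are appended to the context by the API itself.
    response = client.responses.create(model=MODEL, input=prompt)
    return response.output_text  # convenience accessor for the text output

if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    print(ask("In one sentence, what is an agent?"))
```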

New tools to help you build useful agents

Web search delivers accurate and clearly cited answers from the web. Using the same tool as search in ChatGPT, it’s great at conversation and follow-up questions, and you can integrate it with just a few lines of code. Web search is available in the Responses API as a tool for the gpt-4o and gpt-4o-mini models and can be paired with other tools. In the Chat Completions API, web search is available via separate models, gpt-4o-search-preview and gpt-4o-mini-search-preview. Available to all developers in preview.
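A hedged sketch of what enabling web search in the Responses API looks like (tool type `web_search_preview` per the launch docs; the question is illustrative):

```python
import os

# Built-in web search tool declaration for the Responses API.
WEB_SEARCH_TOOL = {"type": "web_search_preview"}

def search_answer(question: str) -> str:
    from openai import OpenAI  # deferred so the sketch loads without the SDK

    client = OpenAI()
    response = client.responses.create(
        model="gpt-4o",
        tools=[WEB_SEARCH_TOOL],  # the search itself runs server-side
        input=question,
    )
    return response.output_text  # citations arrive as annotations on the output

if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    print(search_answer("What tools did OpenAI release for building agents?"))
```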

File search is an easy-to-use retrieval tool that delivers fast, accurate search results with a few lines of code. It supports multiple file types, reranking, attribute filtering, and query rewriting. File search is available in the Responses API and continues to be available via the Assistants API.
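For illustration, a sketch of wiring file search into a Responses call (the vector store ID is a placeholder; you would create and populate the store beforehand):

```python
import os

# File search runs against a vector store you have already created and
# populated; "vs_REPLACE_ME" is a placeholder, not a real store ID.
FILE_SEARCH_TOOL = {
    "type": "file_search",
    "vector_store_ids": ["vs_REPLACE_ME"],
    "max_num_results": 5,  # optional cap on retrieved chunks
}

def ask_files(question: str) -> str:
    from openai import OpenAI  # deferred so the sketch loads without the SDK

    client = OpenAI()
    response = client.responses.create(
        model="gpt-4o-mini",
        tools=[FILE_SEARCH_TOOL],
        input=question,
    )
    return response.output_text

if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    print(ask_files("What does the onboarding doc say about access requests?"))
```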

Computer use is the fastest way to build computer-using agents with CUA, the same model that powers Operator in ChatGPT. You can use this tool to control computers or virtual machines that you operate: simply pass screenshots to the tool, and it responds with an action to take, such as click, scroll, or type. The model is available to select developers on usage tiers 3–5 as a research preview in the Responses API.
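Sketching the first step of that screenshot/action loop under stated assumptions (tool type and the `truncation="auto"` requirement per the launch docs; display size and environment are illustrative):

```python
import os

# Tool declaration describing the screen the agent controls; the display
# size and environment values here are illustrative.
COMPUTER_TOOL = {
    "type": "computer_use_preview",
    "display_width": 1024,
    "display_height": 768,
    "environment": "browser",
}

def first_actions(task: str):
    from openai import OpenAI  # deferred so the sketch loads without the SDK

    client = OpenAI()
    response = client.responses.create(
        model="computer-use-preview",
        tools=[COMPUTER_TOOL],
        input=task,
        truncation="auto",  # the computer use model requires auto truncation
    )
    # The model answers with computer_call items (click, scroll, type, ...);
    # a real loop executes each action, takes a fresh screenshot, and sends
    # it back as a computer_call_output item.
    return [item for item in response.output if item.type == "computer_call"]

if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    print(first_actions("Open the docs page and find the pricing table."))
```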

The Agents SDK is an orchestration framework that abstracts away the complexity of designing and scaling agents. It includes built-in observability tooling that lets developers log, visualize, and analyze agent performance to identify issues and areas for improvement. Inspired by Swarm, the Agents SDK is open source and supports other model providers as well as external tracing providers.
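A minimal Agents SDK sketch, assuming the `openai-agents` package (the agent name and instructions are just examples):

```python
import os

INSTRUCTIONS = "You are a concise, helpful assistant."

def run_demo(prompt: str) -> str:
    # Deferred import: requires the `openai-agents` package.
    from agents import Agent, Runner

    agent = Agent(name="Assistant", instructions=INSTRUCTIONS)
    result = Runner.run_sync(agent, prompt)  # runs the agent loop, with tracing
    return result.final_output

if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    print(run_demo("Write one sentence about orchestration."))
```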

18 Likes

@edwinarbus please add :rocket: emoji :wink: . Congrats on the new launch.

3 Likes

New tools for building agents

https://openai.com/index/new-tools-for-building-agents/

How to use the OpenAI API for Q&A or to build a chatbot?

https://help.openai.com/en/articles/6643167-how-to-use-the-openai-api-for-q-a-or-to-build-a-chatbot

5 Likes

Do we know when/how the computer use model will be rolled out?

2 Likes

Since they didn’t say later, I expect the roll-out is already underway.

Staff will show up later/soon. They will likely clarify the timeline.

3 Likes

Link to the documentation for the Agents SDK:

5 Likes

The Responses API docs:

https://platform.openai.com/docs/api-reference/responses

2 Likes

Computer use agent example code; it should give a decent intro to usage.

5 Likes

Based on your feedback from the Assistants API beta, we’ve incorporated key improvements into the Responses API. After we achieve full feature parity, we will announce a deprecation plan later this year, with a target sunset date in the first half of 2026.

RIP Assistants.

2 Likes

Still not showing up. I’m hitting computer-use-preview in the sample app and getting this response:

"error": {
        "message": "The model `computer-use-preview-2025-03-11` does not exist or you do not have access to it.",
        "type": "invalid_request_error",
        "param": null,
        "code": "model_not_found"
    }
5 Likes

You can use models from other providers.

Using other LLM providers
Many providers also support the OpenAI API format, which means you can pass a base_url to the existing OpenAI model implementations and use them easily. ModelSettings is used to configure tuning parameters (e.g., temperature, top_p) for the model you select.

Since the OpenAI API format has been adopted by many providers, it should give us a broad range of models to choose from right from the start. And since the SDK is open source, it’s just a matter of time until the rest are available as well.
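To make that concrete, a sketch of pointing the SDK at an OpenAI-compatible endpoint (the URL, key variable, and model name are placeholders, not real values):

```python
import os

# Placeholders: any endpoint that speaks the OpenAI API format will do.
BASE_URL = "https://example-provider.invalid/v1"
MODEL_NAME = "some-provider-model"

def build_agent():
    # Deferred imports: requires the `openai` and `openai-agents` packages.
    from agents import Agent, ModelSettings, OpenAIChatCompletionsModel
    from openai import AsyncOpenAI

    client = AsyncOpenAI(
        base_url=BASE_URL,
        api_key=os.environ.get("PROVIDER_API_KEY", "unset"),
    )
    return Agent(
        name="Assistant",
        model=OpenAIChatCompletionsModel(model=MODEL_NAME, openai_client=client),
        model_settings=ModelSettings(temperature=0.3),  # tuning params go here
    )
```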

See the last paragraph:

5 Likes

Input and output pricing for computer use. Batch API is available, caching is not.

3 Likes

Lots of people in the community have been asking for web search via the API, and now we have it, both in the new Responses API and in the Agents SDK. Great!

7 Likes

Beautiful, thank you. For whatever reason I didn’t see the openai_client parameter for the Model. I should probably read the docs as well as the source code.

1 Like

Yes, thanks for bringing it up.
Here are the prices per 1 million tokens for the search API.
No caching and no batch API available.

2 Likes

Wow, this is perfect. Web search opens up a lot of new use cases! And with computer use, we will finally see an explosion of new agents; most importantly, we can now build our own mini-automations for specific projects. I’m very excited to try all these new toys!

3 Likes

Noteworthy: the server-side “conversation state” feature introduced with the Responses API. If you use this feature by passing a previous_response_id along with only the latest input, there is no cost management for the chat length:

You get only:

  • maximal loading of the context window (paying for ~120k input tokens per turn?), or
  • an error thrown.

And once it starts automatically discarding the oldest turns, it also destroys any context-window caching with every new input.

truncation: The truncation strategy to use for the model response.

* `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation.
* `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error.

Thus, this conversation state is not practical to use in its current implementation unless you want to limit users’ chat sessions to a small number of turns, or to set money on fire.


Additional: "store": false must be set by you when doing your own chat self-management as before, otherwise you get the server-side storage of chat session under a multitude of model response IDs as the default, consuming resources and perhaps response time.
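To make the two modes concrete, a sketch under those assumptions (the model name is illustrative; `client` is an `openai.OpenAI` instance):

```python
def chained_turn(client, prev_id: str, user_text: str):
    # Server-side state: send only the newest input plus the previous
    # response's ID; the API replays the earlier turns for you.
    return client.responses.create(
        model="gpt-4o-mini",
        previous_response_id=prev_id,
        input=user_text,
        truncation="auto",  # drop middle turns instead of failing with a 400
    )

def self_managed_turn(client, messages: list):
    # Self-managed state: send the whole transcript yourself and opt out of
    # server-side storage with store=False.
    return client.responses.create(
        model="gpt-4o-mini",
        input=messages,
        store=False,
    )
```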

Todo: compare the network latency of 10–200 KB chat requests to the latency of backend response_id retrieval once this is more heavily utilized, as with Assistants.

9 Likes

There seems to be an issue with API keys on the Responses endpoint.
In my specific case, I tried using a pre-existing API key that was restricted but allowed most features except fine-tuning. I received a 403 error (not enough permissions). I switched the API key from restricted to Full and tried again; no difference.
I then created a NEW API key, also with Full capabilities, and then it worked.
So there seems to be an issue with previously created keys not being able to use the Responses API.

1 Like

This is a phenomenal release! Probably the biggest developer release in a year. Love the simplicity of the new API also (as a relative simpleton myself!).

1 Like