I’m building an agent using the OpenAI Agents SDK, and it uses multiple tools. These tools fall into two categories:
-
User-facing tools – Their execution impacts the final response the user sees.
Example: Google Calendar tool that books a slot and confirms it to the user. -
Internal tools – These run backend actions that don’t affect the user’s immediate response.
Example: After booking a calendar event, I want to store the booking info in a database using another tool.
The problem is:
Because internal tools are part of the main agent, the agent tries to select and run both types of tools in the same reasoning chain. This delays the final user response, since internal actions still add to the total execution time.
What I want
I want the final response to the user to be sent immediately, and any internal tools should run without blocking or delaying the main reply — almost like background or post-response actions.
My Question
What’s the right architecture/design pattern for handling this?
-
Should I separate internal tools into another agent?
-
Is there a recommended pattern in the Agents SDK for non-blocking or deferred tool execution?
Any advice or examples on how to structure this cleanly would be really helpful!