Hi all,
I’m using the OpenAI Agents SDK to define and orchestrate multi-agent interactions across the tasks and steps of a larger workflow, primarily with the new o3 and o4-mini models.
The recently announced Flex processing would suit this workflow well, since it doesn’t depend on real-time responses, and I’m keen to take advantage of the cost savings.
The documentation shows enabling Flex by setting service_tier="flex" directly in the base client call, like response = client.with_options(timeout=900.0).responses.create(…, service_tier="flex").
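For reference, here is a minimal sketch of that direct call as I understand it from the docs. The helper name create_flex_response and the FLEX_TIMEOUT_SECONDS constant are my own (hypothetical), and I'm assuming client is a standard openai.OpenAI instance; only with_options(timeout=...), responses.create(...), and service_tier="flex" come from the documentation example.

```python
from typing import Any

# Timeout value taken from the documentation example; Flex requests can be slow.
FLEX_TIMEOUT_SECONDS = 900.0


def create_flex_response(client: Any, model: str, prompt: str) -> Any:
    """Send one Responses API request with Flex processing enabled.

    Hypothetical helper: `client` is assumed to be an `openai.OpenAI`
    instance; this function only wraps the documented call.
    """
    # Raise the per-request timeout before issuing the request.
    flex_client = client.with_options(timeout=FLEX_TIMEOUT_SECONDS)
    # service_tier="flex" is what opts this single request into Flex.
    return flex_client.responses.create(
        model=model,
        input=prompt,
        service_tier="flex",
    )
```

With the openai package installed, I'd expect to call it as create_flex_response(OpenAI(), "o3", "some prompt"). This works when I own the client call, which is exactly what the agent layer hides from me.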
However, when using the agent abstraction layer, agents are roughly defined like:
from agents import Agent, Runner

my_agent = Agent(
    name="some_agent",
    instructions=...,
    model="o3",  # or o4-mini
    output_type=...,
)
# And run them via something like:
Runner.run(my_agent, ...)
I haven’t found an obvious parameter in the Agent(…) definition or the Runner.run(…) call to pass the service_tier="flex" setting through to the underlying OpenAI API call.
Is there a currently supported method or best practice for enabling Flex processing when using the Agents SDK this way? Is it controllable via a configuration setting I missed, or would it require explicit support within the abstraction layer itself?
Thanks for any insights or guidance!