Best Model for Agent Triage (GPT-5 mini/nano) and Passing Messages/Context

jim · August 7, 2025, 10:23pm

Wondering if anyone has any experience finding the best, super quick, smartest model for the initial “triage” agent, the one that then hands off to other agents?

In the past I was using gpt-4-mini and felt pretty good about it. Now I'm trying out gpt-5-mini, but with options like reasoning, verbosity, etc., I'm wondering if there are some best practices to follow.

Also, I’m noticing that 5-mini is better at passing forward messages and context in a transfer_to_agent_name function, but it’s sometimes passing the entire user message in either message or context (even after I tried to discourage that in the instructions for that transfer).

As I’m using a homegrown agents “SDK” (not the official one), is it normal for this to happen? In the Agents SDK, do they fill up message/context the same way, and do they also encourage passing previous_response_id? Or is it the case that once you hand off, you drop the previous response?

thanks in advance!

_j · August 7, 2025, 11:20pm

I would use a structured output for capturing a guardrail or disposition.

Calling a function is optional, at the AI model's discretion, but triage is a mandatory job.

The function schema placement communicates more clearly and doesn’t rely on post-training of what the function tool format given to the AI means.
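As a rough sketch of what a structured-output disposition could look like: the agent names, the field names (`target_agent`, `reason`), and the schema name below are all made up for illustration, and the `RESPONSE_FORMAT` dict assumes the `json_schema` response format shape used for strict structured outputs. The parser re-checks the result locally before anything gets routed.

```python
import json

# Hypothetical agent names; replace with the agents in your own system.
AGENTS = ["billing", "tech_support", "sales"]

# JSON Schema for the triage decision. Every field is required and no
# extra keys are allowed, so the model must always pick one agent.
TRIAGE_SCHEMA = {
    "type": "object",
    "properties": {
        "target_agent": {"type": "string", "enum": AGENTS},
        "reason": {"type": "string"},
    },
    "required": ["target_agent", "reason"],
    "additionalProperties": False,
}

# Assumed request shape for a strict structured output (illustrative name).
RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {
        "name": "triage_disposition",
        "strict": True,
        "schema": TRIAGE_SCHEMA,
    },
}


def parse_disposition(raw: str) -> dict:
    """Parse the model's JSON text and re-check it locally before routing."""
    data = json.loads(raw)
    if set(data) != {"target_agent", "reason"}:
        raise ValueError(f"unexpected keys: {sorted(data)}")
    if data["target_agent"] not in AGENTS:
        raise ValueError(f"unknown agent: {data['target_agent']!r}")
    return data
```

Because the disposition is the whole response rather than an optional tool call, there is no path where the model simply answers the user and skips the routing decision.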

I don’t understand the “passing messages” you are attempting. You could make a strict structured output that only accepts a number property in an array, and have the AI give “index numbers of important chat related to the latest question” based on your shown message number (as an example of an application paying more than embeddings.)
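The index-number idea could be sketched like this. The property name `relevant_indices` and the helper names are hypothetical; the point is only that the model returns numbers and the application, not the model, copies the actual text forward.

```python
import json

# Schema for a strict structured output that accepts only an array of integers.
RELEVANT_SCHEMA = {
    "type": "object",
    "properties": {
        "relevant_indices": {"type": "array", "items": {"type": "integer"}},
    },
    "required": ["relevant_indices"],
    "additionalProperties": False,
}


def number_messages(messages):
    """Render the chat with visible index numbers for the model to cite."""
    return "\n".join(
        f"[{i}] {m['role']}: {m['content']}" for i, m in enumerate(messages)
    )


def select_messages(messages, raw_output):
    """Keep only the messages whose indices the model returned, in chat order."""
    indices = json.loads(raw_output)["relevant_indices"]
    # Ignore duplicates and anything outside the chat's index range.
    valid = sorted(
        {i for i in indices if type(i) is int and 0 <= i < len(messages)}
    )
    return [messages[i] for i in valid]
```

The receiving agent then gets the selected messages verbatim, so nothing depends on how well a small model summarizes.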

Right now is a pretty bad time to figure out what model can do a permanent “super quick”, but you can start evaluating “smartest” (where I can give you an expensive answer).

Performance with small input and small output

Model	Trials	Avg Stream Latency (s)	Avg Rate (tokens/s)
gpt-4.1-mini	10	0.890	7.386
gpt-5-nano	10	1.166	2.138
gpt-4o-mini	10	1.052	6.553
gpt-5-mini	10	1.211	5.671

Unique responses for gpt-4.1-mini (by first 60 chars):

10 | The capital of France is Paris.

Unique responses for gpt-5-nano (by first 60 chars):

9 | Paris.
1 | Paris is the capital of France.

Unique responses for gpt-4o-mini (by first 60 chars):

10 | The capital of France is Paris.

Unique responses for gpt-5-mini (by first 60 chars):

10 | The capital of France is Paris.
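For anyone who wants to produce the same kind of "unique responses" tally from their own trials, a minimal sketch follows. It is not the script used for the numbers above, and the function names are made up; it only groups response texts by their first 60 characters and prints `count | prefix` lines.

```python
from collections import Counter


def tally_responses(responses, prefix_len=60):
    """Count responses grouped by their first `prefix_len` characters."""
    counts = Counter(r.strip()[:prefix_len] for r in responses)
    # most_common() sorts by count, highest first.
    return counts.most_common()


def format_tally(responses, prefix_len=60):
    """Render the tally as 'count | prefix' lines, one per unique prefix."""
    return "\n".join(
        f"{n} | {prefix}" for prefix, n in tally_responses(responses, prefix_len)
    )
```

A model that returns the same prefix on every trial, like three of the four above, is a reasonable sign that its short answers are stable enough to route on.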

jim · August 7, 2025, 11:35pm

In my efforts to dupe the Agents SDK, I followed their strategy of creating a function called transfer_to_name_agent that was an object with both message and context in it.

Prior to GPT-5 these two would be kind of summarized, but now they are quite chock full of tokens!
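One possible workaround on the application side, sketched under assumptions: `trim_handoff` and the `max_chars` budget are hypothetical, and the approach assumes the receiving agent already sees the user's message through the conversation history, so a verbatim copy inside message or context is redundant.

```python
def trim_handoff(args, user_message, max_chars=200):
    """Return a copy of the handoff arguments with oversized fields shortened."""
    trimmed = {}
    for key in ("message", "context"):
        value = args.get(key, "")
        # Drop a verbatim copy of the user's message (assumed to reach the
        # next agent through the conversation history anyway).
        value = value.replace(user_message, "").strip()
        # Enforce a hard character budget on whatever is left.
        if len(value) > max_chars:
            value = value[:max_chars].rstrip() + "..."
        trimmed[key] = value
    return trimmed
```

This runs after the model's tool call is parsed and before the next agent is invoked, so it does not rely on the model following the instruction to keep these fields short.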
