Expected Behavior:
When calling client.updateSession({tool_choice: {type: "function", name: "list_emails"}}), the function list_emails should be invoked correctly, similar to the behavior when using client.updateSession({tool_choice: "auto"}). This includes:
- A realtime.item of type function_call being generated with the correct arguments.
- A function_call_output type item returned with the processed data.
- Audio output in response to the function’s execution.
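To make these three expectations checkable against a logged conversation, here is a minimal sketch. The item shape is a reduced version of the JSON shown in this post, not the library's full type, and checkExpectedBehavior is a hypothetical helper name, not part of openai-realtime-api-beta.

```typescript
// Reduced item shape, modeled on the JSON logged in this post (an assumption,
// not the library's complete type definition).
interface ContentPart {
  type: string;
  text?: string;
  transcript?: string;
}

interface RealtimeItem {
  id: string;
  type: string; // "message" | "function_call" | "function_call_output"
  role?: string;
  name?: string;
  call_id?: string;
  arguments?: string;
  output?: string;
  content?: ContentPart[];
}

interface ExpectedBehaviorReport {
  hasFunctionCall: boolean;
  hasFunctionCallOutput: boolean;
  hasAssistantAudio: boolean;
}

// Hypothetical helper: reports which of the three expectations hold for one tool.
function checkExpectedBehavior(
  items: RealtimeItem[],
  toolName: string,
): ExpectedBehaviorReport {
  // 1. A function_call item for the requested tool.
  const calls = items.filter(
    (item) => item.type === "function_call" && item.name === toolName,
  );
  const callIds = new Set(calls.map((call) => call.call_id));

  return {
    hasFunctionCall: calls.length > 0,
    // 2. A function_call_output item whose call_id matches one of those calls.
    hasFunctionCallOutput: items.some(
      (item) =>
        item.type === "function_call_output" &&
        item.call_id !== undefined &&
        callIds.has(item.call_id),
    ),
    // 3. At least one assistant message carrying an audio content part.
    hasAssistantAudio: items.some(
      (item) =>
        item.type === "message" &&
        item.role === "assistant" &&
        (item.content ?? []).some((part) => part.type === "audio"),
    ),
  };
}
```

Assuming the items are collected from the client's conversation log, calling checkExpectedBehavior(items, "list_emails") once per tool_choice setting would show which of the three expectations differ between the two runs.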
Here is an example of the working case, using tool_choice: "auto":
Initial User Message:
{
"id": "item_AY81IkN6zZ0We2KwXkd8B",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "user",
"content": [
{
"type": "input_text",
"text": "Hello! Check my last email and reply to it."
}
],
"formatted": {
"audio": {},
"text": "Hello! Check my last email and reply to it.",
"transcript": ""
}
}
AI Audio Response:
{
"id": "item_AY81IaiSORg9pH7Vr0WEr",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "audio",
"transcript": "Sure! I'll check your latest email and get a reply ready. Please give me a moment."
}
],
"formatted": {
"audio": {},
"text": "Hello! Check my last email and reply to it.",
"transcript": ""
}
}
Function Call:
{
"id": "item_AY81KpMatEHuAxWRNIwLa",
"object": "realtime.item",
"type": "function_call",
"status": "completed",
"name": "list_emails",
"call_id": "call_I14Ukrd5RS2gtfY8",
"arguments": "{\"numEmails\":1}",
"formatted": {
"audio": {},
"text": "",
"transcript": "",
"tool": {
"type": "function",
"name": "list_emails",
"call_id": "call_I14Ukrd5RS2gtfY8",
"arguments": "{\"numEmails\":1}"
}
}
}
Function Call Output:
{
"id": "item_AY81MM9OwvmT4v8KAuQbx",
"object": "realtime.item",
"type": "function_call_output",
"call_id": "call_I14Ukrd5RS2gtfY8",
"output": " [data retrieved] ",
"formatted": {
"audio": {},
"text": "",
"transcript": "",
"output": " [data retrieved] "
},
"status": "completed"
}
Incorrect Behavior, using client.updateSession({tool_choice: {type: "function", name: "list_emails"}}):
Initial User Message:
{
"id": "item_AY8B7XZfNVik3T9v6vJ7C",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "user",
"content": [
{
"type": "input_text",
"text": "Hello! Check my last email and reply to it."
}
],
"formatted": {
"audio": {},
"text": "Hello! Check my last email and reply to it.",
"transcript": ""
}
}
Argument Passed as message instead of function_call (numEmails = 1):
{
"id": "item_AY8B7KuSqtPRwh1Y7HStO",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"numEmails\":1}"
}
],
"formatted": {
"audio": {},
"text": "{\"numEmails\":1}",
"transcript": ""
}
}
Repeated Argument Item (Still Incorrect Type):
{
"id": "item_AY8B8DN1mW7Rgp0r3dEbY",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"numEmails\":1}"
}
],
"formatted": {
"audio": {},
"text": "{\"numEmails\":1}",
"transcript": ""
}
}
Missing Audio Output:
Unlike the auto case, no audio output is generated.
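The symptom above (tool arguments arriving as assistant text) can be detected with a small check. This is only a diagnostic sketch under the assumption that items have the shape logged in this post. findMisroutedArguments and isJsonObject are hypothetical helper names, and the check does not fix the underlying behavior.

```typescript
// Reduced message shape, modeled on the JSON logged in this post (an assumption).
interface LoggedTextPart {
  type: string;
  text?: string;
}

interface LoggedItem {
  id: string;
  type: string;
  role?: string;
  content?: LoggedTextPart[];
}

// True when the string parses as a JSON object, e.g. "{\"numEmails\":1}".
function isJsonObject(text: string): boolean {
  try {
    const parsed: unknown = JSON.parse(text);
    return typeof parsed === "object" && parsed !== null && !Array.isArray(parsed);
  } catch {
    return false;
  }
}

// Hypothetical helper: returns ids of assistant "message" items whose text
// content looks like function arguments rather than natural language.
function findMisroutedArguments(items: LoggedItem[]): string[] {
  return items
    .filter(
      (item) =>
        item.type === "message" &&
        item.role === "assistant" &&
        (item.content ?? []).some(
          (part) =>
            part.type === "text" &&
            part.text !== undefined &&
            isJsonObject(part.text),
        ),
    )
    .map((item) => item.id);
}
```

Run against the two assistant items logged above, this would flag both ids, since each carries {"numEmails":1} as plain text.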
Summary of the Issue:
When forcing a function call with client.updateSession({tool_choice: {type: "function", name: "list_emails"}}):
- The generated realtime.item does not have type function_call.
- The arguments appear as the text of an assistant message item instead of being passed in a function_call item.
- No audio output is generated, unlike with tool_choice: "auto".
I'm using the openai-realtime-api-beta TypeScript library from GitHub (openai/openai-realtime-api-beta, the Node.js + JavaScript reference client for the Realtime API (beta)).
How can I fix this?