Audio input not working when migrating from completions to responses

I am migrating my chat from the old Chat Completions API to the Responses API. I followed the documentation for creating a response carefully, and everything appeared to work fine except when I send audio, which results in the following error:

{
  "error": {
    "message": "Invalid value: 'input_audio'. Supported values are: 'input_text', 'input_image', 'output_text', 'refusal', 'input_file', 'computer_screenshot', and 'summary_text'.",
    "type": "invalid_request_error",
    "param": "input[0].content[1].type",
    "code": "invalid_value"
  }
}

My outgoing message is the following:
POST https://api.openai.com/v1/responses

{
    "model": "gpt-audio-2025-08-28",
    "input": [
        {
            "content": [
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": "...data..here...",
                        "format": "wav"
                    }
                }
            ],
            "role": "user"
        }
    ]
}

This strictly follows the structure that the create response documentation describes.

When I revert to the Chat Completions API and send the following body using the same model, everything works, so the model, my API key, and my IP are valid:
POST https://api.openai.com/v1/chat/completions

{
    "model": "gpt-audio-2025-08-28",
    "messages": [
        {
            "content": [
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": "...data..here...",
                        "format": "wav"
                    }
                }
            ],
            "role": "user"
        }
    ]
}
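
For reference, the working Chat Completions body above can also be built programmatically. This is only a minimal sketch in Python: the helper name `build_chat_audio_body` is hypothetical, and it assumes the audio is available as raw WAV bytes and that the `data` field carries them base64-encoded.

```python
import base64
import json


def build_chat_audio_body(wav_bytes: bytes, model: str = "gpt-audio-2025-08-28") -> dict:
    """Build a Chat Completions request body with a single audio input part.

    Hypothetical helper for illustration; it mirrors the JSON body shown above.
    """
    # Assumption: the "data" field expects the audio bytes as base64 text.
    encoded = base64.b64encode(wav_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "input_audio",
                        "input_audio": {"data": encoded, "format": "wav"},
                    }
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder bytes only; a real request needs the contents of a valid WAV file.
    body = build_chat_audio_body(b"placeholder-audio-bytes")
    print(json.dumps(body, indent=4))
```

The resulting dictionary can be serialized with `json.dumps` and sent to `/v1/chat/completions` with any HTTP client. It is not expected to work against `/v1/responses`, given the error above.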

Any help on this would be appreciated!

The Responses API doesn’t support audio yet.

It is interesting to see that the docs were updated, though. Perhaps it is coming soon?

For over a month, “input_audio” has been documented but rejected. There are also audio events in the OpenAPI spec, but there is no way to pass the “modalities” or “voice” parameters that would also be required.
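
For comparison, this is roughly how audio output is requested in Chat Completions, where `modalities` and the `audio` object (with `voice` and `format`) sit at the top level of the body. This is a sketch only: the helper name is hypothetical, and the voice `alloy` and the prompt are example values.

```python
import json


def build_chat_audio_output_body(prompt: str, voice: str = "alloy") -> dict:
    """Build a Chat Completions body that asks for spoken output.

    Hypothetical helper; the voice and output format are example values.
    """
    return {
        "model": "gpt-audio-2025-08-28",
        # Request both a text transcript and generated audio.
        "modalities": ["text", "audio"],
        # Voice and container format for the generated audio.
        "audio": {"voice": voice, "format": "wav"},
        "messages": [{"role": "user", "content": prompt}],
    }


if __name__ == "__main__":
    print(json.dumps(build_chat_audio_output_body("Say hello."), indent=4))
```

Until the Responses API accepts equivalent fields, there is nowhere to put these values in a `/v1/responses` request.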

Thank you for the response. I will wait for the Audio API to be released!