How can I write a POST request for Azure OpenAI that uses structured output?

I want to write a POST request for Azure OpenAI that uses structured output.

https://azure.microsoft.com/en-us/blog/announcing-a-new-openai-feature-for-developers-on-azure/ says:

Here’s an example API call to illustrate how to use Structured Outputs:

{
  "model": "gpt-4o-2024-08-06",
  "prompt": "Generate a customer support response",
  "structured_output": {
    "schema": {
      "type": "object",
      "properties": {
        "responseText": { "type": "string" },
        "intent": { "type": "string" },
        "confidenceScore": { "type": "number" },
        "timestamp": { "type": "string", "format": "date-time" }
      },
      "required": ["responseText", "intent", "confidenceScore", "timestamp"]
    }
  }
}

Regrettably, they don’t say how to use that API call. How can I write a POST request for Azure OpenAI that uses structured output?


I tried to use the Completion REST API:

import requests
import json
from datetime import datetime

# Set your Azure OpenAI endpoint and key
endpoint = "https://your-azure-openai-endpoint.com"
api_key = "your-azure-openai-key"
deployment = 'engine-name'
endpoint = f'https://{endpoint}/openai/deployments/{deployment}/completions?api-version=2024-06-01'

# Define the request payload
payload = {
    "model": "gpt-4omini-2024-07-18name",
    "prompt": "Generate a customer support response",
    "structured_output": {
        "schema": {
            "type": "object",
            "properties": {
                "responseText": { "type": "string" },
                "intent": { "type": "string" },
                "confidenceScore": { "type": "number" },
                "timestamp": { "type": "string", "format": "date-time" }
            },
            "required": ["responseText", "intent", "confidenceScore", "timestamp"]
        }
    }
}

# Send the request
headers = {
    "Content-Type": "application/json",
    "api-key": f"{api_key}"
}

response = requests.post(endpoint, headers=headers, data=json.dumps(payload))

# Handle the response
if response.status_code == 200:
    response_data = response.json()
    print(json.dumps(response_data, indent=4))
else:
    print(f"Error: {response.status_code}")
    print(response.text)

However, I get the error completion operation does not work with the specified model:

{"error":{"code":"OperationNotSupported","message":"The completion operation does not work with the specified model, gpt-4o-mini. Please choose different model and try again. You can learn more about which models can be used with each operation here: https://go.microsoft.com/fwlink/?linkid=2197993."}}

So I guess I should use another REST API, but which one?