I have an Assistant that I’m testing in the Playground, and I’m sending it a very small request. The goal is to use Structured Outputs to format the output as JSON. When I try this, I get the rate limit error below:
Request too large for gpt-4o in organization org-id on tokens per min (TPM): Limit 10000, Requested 16928. The input or output tokens must be reduced in order to run successfully.
This doesn’t make sense because I can send the same request to a different Assistant and it works fine.
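To make the numbers in the error concrete, here is a minimal Python sketch that pulls the limit and the requested count out of the message and does the arithmetic. The helper name `parse_tpm_error` is hypothetical. The 16,384 figure is the documented maximum output size for gpt-4o-2024-08-06; whether the limiter actually reserves that amount for the reply here is only a guess, and the error text does not confirm it.

```python
import re

# Error text copied from the Playground message above.
ERROR = (
    "Request too large for gpt-4o in organization org-id on tokens per min (TPM): "
    "Limit 10000, Requested 16928. The input or output tokens must be reduced "
    "in order to run successfully."
)

# Assumption: the limiter may reserve the model's maximum output size
# (16,384 tokens for gpt-4o-2024-08-06) for the reply. Unconfirmed.
ASSUMED_OUTPUT_RESERVE = 16_384


def parse_tpm_error(message: str) -> dict:
    """Extract the limit and requested token counts from a TPM error string."""
    match = re.search(r"Limit (\d+), Requested (\d+)", message)
    if match is None:
        raise ValueError("not a recognised TPM error message")
    limit, requested = int(match.group(1)), int(match.group(2))
    return {
        "limit": limit,
        "requested": requested,
        # How far the request exceeds the per-minute limit.
        "over_by": requested - limit,
        # What would remain for the prompt if the reserve guess is right.
        "left_for_input_if_reserved": requested - ASSUMED_OUTPUT_RESERVE,
    }


if __name__ == "__main__":
    print(parse_tpm_error(ERROR))
```

Under that guess, the request is over the limit by 6928 tokens and only 544 of the 16928 would be input, which would at least be consistent with a very small message plus the instructions and schema.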
Here’s my setup for the Assistant that’s giving me an error:
Instructions:
You are a detailed assistant that takes the user input returns it formatted as a JSON object using the json schema
Model: gpt-4o-2024-08-06
File search / code interpreter are off (they have to be to use the json_schema response format)
No functions
Response format json_schema:
{
  "name": "json_response",
  "strict": true,
  "schema": {
    "type": "object",
    "properties": {
      "tag_number": {
        "type": "string",
        "description": "The tag number of the current pump"
      },
      "qty": {
        "type": "string",
        "description": "The qty of the current pump"
      },
      "total_system_flow": {
        "type": "string",
        "description": "The total system flow of the current pump"
      },
      "environment": {
        "type": "string",
        "description": "The environment of the current pump"
      }
    },
    "additionalProperties": false,
    "required": [
      "tag_number",
      "qty",
      "total_system_flow",
      "environment"
    ]
  }
}
Temperature: 0.11
Top P: 1
Message that’s giving the error:
format the following as a JSON object according to json_response: { "tag_number": "CHWP-1~3", "qty": 3, "total_system_flow": "550 USgpm" }
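As a sanity check on the input itself, the sketch below compares the payload in that message with the required keys of the schema, using only the Python standard library and no OpenAI calls. The helper name `schema_mismatches` is hypothetical, and the check is a simplified stand-in for real JSON Schema validation. Whether any mismatch it reports is related to the token count is unclear.

```python
import json

# Required keys from the json_response schema; every property is typed as a string.
REQUIRED_STRING_KEYS = ["tag_number", "qty", "total_system_flow", "environment"]

# The payload from the failing message, written with straight quotes so it parses.
MESSAGE_PAYLOAD = '{"tag_number": "CHWP-1~3", "qty": 3, "total_system_flow": "550 USgpm"}'


def schema_mismatches(payload: dict) -> list[str]:
    """List the ways a payload departs from the schema (hypothetical helper)."""
    problems = []
    for key in REQUIRED_STRING_KEYS:
        if key not in payload:
            problems.append(f"missing required key: {key}")
        elif not isinstance(payload[key], str):
            problems.append(
                f"{key} should be a string, got {type(payload[key]).__name__}"
            )
    # additionalProperties is false, so flag anything the schema does not name.
    for key in payload:
        if key not in REQUIRED_STRING_KEYS:
            problems.append(f"unexpected key: {key}")
    return problems


if __name__ == "__main__":
    for problem in schema_mismatches(json.loads(MESSAGE_PAYLOAD)):
        print(problem)
```

For this message it reports that `qty` is a number rather than a string and that `environment` is missing, so the model has to convert one value and invent another to satisfy the strict schema.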
What could be causing this? I don’t think anything here would need that many tokens.
Any ideas for solving or troubleshooting this?