Getting rate limit error in spite of small message to Assistant

I have an Assistant that I’m testing in the Playground, sending it a very small request. The goal is to use Structured Outputs to format the output as JSON. When I try this, I get the rate limit error below:

Request too large for gpt-4o in organization org-id on tokens per min (TPM): Limit 10000, Requested 16928. The input or output tokens must be reduced in order to run successfully.

This doesn’t make sense because I can send the same request to a different Assistant and it works fine.

Here’s my setup for the Assistant that’s giving me an error:

Instructions:
You are a detailed assistant that takes the user input returns it formatted as a JSON object using the json schema

Model: gpt-4o-2024-08-06

File search / code interpreter are off (they have to be to use the json_schema response format)
No functions

Response format json_schema:

{
  "name": "json_response",
  "strict": true,
  "schema": {
    "type": "object",
    "properties": {
      "tag_number": {
        "type": "string",
        "description": "The tag number of the current pump"
      },
      "qty": {
        "type": "string",
        "description": "The qty of the current pump"
      },
      "total_system_flow": {
        "type": "string",
        "description": "The total system flow of the current pump"
      },
      "environment": {
        "type": "string",
        "description": "The environment of the current pump"
      }
    },
    "additionalProperties": false,
    "required": [
      "tag_number",
      "qty",
      "total_system_flow",
      "environment"
    ]
  }
}

Temperature: 0.11
Top P: 1

Message that’s giving error:
format the following as a JSON object according to json_response: { "tag_number": "CHWP-1~3", "qty": 3, "total_system_flow": "550 USgpm" }
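
As a local sanity check, here is a minimal sketch that compares the payload in that message against the schema, using only the Python standard library. The schema is a trimmed copy of json_response, the payload is written with straight quotes, and `check_payload` is a hypothetical helper of my own (not part of any OpenAI SDK). It only checks required keys and string types; it is not a full JSON Schema validator.

```python
import json

# Trimmed copy of the json_response schema: only the parts this check uses.
SCHEMA = {
    "properties": {
        "tag_number": {"type": "string"},
        "qty": {"type": "string"},
        "total_system_flow": {"type": "string"},
        "environment": {"type": "string"},
    },
    "required": ["tag_number", "qty", "total_system_flow", "environment"],
}

# The payload from the message, written with straight quotes.
PAYLOAD = '{ "tag_number": "CHWP-1~3", "qty": 3, "total_system_flow": "550 USgpm" }'


def check_payload(payload_text, schema):
    """Hypothetical helper: report missing required keys and non-string values."""
    data = json.loads(payload_text)
    problems = []
    for key in schema["required"]:
        if key not in data:
            problems.append(f"missing required key: {key}")
        elif schema["properties"][key]["type"] == "string" and not isinstance(data[key], str):
            problems.append(f"{key} is {type(data[key]).__name__}, schema expects string")
    return problems


if __name__ == "__main__":
    for problem in check_payload(PAYLOAD, SCHEMA):
        print(problem)
```

This flags that qty is a number in my input while the schema expects a string, and that environment is not in my input at all even though it is required. I don’t know whether either of those is related to the error.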

What could be causing this? I don’t think anything here would need anywhere near that many tokens.
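
For a rough sense of scale, here is a sketch that estimates the input size with the common rule of thumb of about 4 characters per token for English text. That ratio is an assumption, not the real tokenizer, and `estimate_tokens` is a hypothetical helper. It only counts the instructions and the message; it does not count the schema or anything else the Assistant run might include.

```python
import math

INSTRUCTIONS = (
    "You are a detailed assistant that takes the user input returns it "
    "formatted as a JSON object using the json schema"
)
MESSAGE = (
    "format the following as a JSON object according to json_response: "
    '{ "tag_number": "CHWP-1~3", "qty": 3, "total_system_flow": "550 USgpm" }'
)

CHARS_PER_TOKEN = 4   # assumption: rough rule of thumb, not an exact tokenizer
TPM_LIMIT = 10000     # limit reported in the error message
REQUESTED = 16928     # requested tokens reported in the error message


def estimate_tokens(text):
    """Hypothetical helper: approximate token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


if __name__ == "__main__":
    total = estimate_tokens(INSTRUCTIONS) + estimate_tokens(MESSAGE)
    print(f"estimated input tokens: {total}")
    print(f"reported request exceeds the limit by {REQUESTED - TPM_LIMIT} tokens")
```

Even allowing for a large error in the estimate, the text I’m sending is orders of magnitude smaller than the 16928 tokens the error reports.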

Any ideas for solving or troubleshooting?