Image-based inference with GPT-4o appears to be severely impaired due to excessively naive token counting

Vision and output quality are not impaired by the rate-limit mechanism.

The rate limiter works on a simple estimate. It acts like a firewall: its only function is to block API requests from reaching the AI models when the limits you set, or your organization's limits, would be exceeded.

Text tokens are estimated; the estimate is close to, but not exactly, the actual count.
Images consume a fixed amount against the rate limit regardless of any settings: 771 tokens per image.

Because it must block excessive requests cheaply, before they are processed, the limiter performs neither deep inspection nor any advanced computation on the API request. That means that yes, some requests count against your accumulated rate by more than their actual AI consumption, while others count by less than their true usage.
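As a sketch only, the following Python mirrors the kind of shallow pre-check described above. The four-characters-per-token divisor and both function names are illustrative assumptions, not OpenAI's actual implementation; only the 771-tokens-per-image figure comes from the behavior described above.

```python
def estimate_request_tokens(text: str, num_images: int) -> int:
    # Cheap text estimate: roughly one token per four characters
    # (an assumed heuristic; no real tokenizer is run).
    text_tokens = len(text) // 4
    # Flat per-image charge, regardless of image size or detail setting.
    image_tokens = 771 * num_images
    return text_tokens + image_tokens

def admit_request(text: str, num_images: int, remaining_tpm: int) -> bool:
    # Firewall-style gate: block the request before it ever reaches
    # the model if the estimate would exceed the remaining budget.
    return estimate_request_tokens(text, num_images) <= remaining_tpm
```

The whole point of a gate like this is that it runs in microseconds, which is exactly why it cannot match the model's true tokenizer-level accounting.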

Despite all the evidence you've posted for some unknown theory, the rate limiter, and your requests being rejected for exceeding it, have nothing to do with the quality of vision or output.

When your requests are formed correctly, you can in many cases send multiple images in a single user message and get a distinct classification or description for each; quality tends to degrade beyond five or ten images, though.
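For example, with the openai Python SDK (v1.x), a multi-image request is just one user message whose content list holds a text part followed by several image_url parts. The URLs and prompt text here are placeholders.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

image_urls = [
    "https://example.com/photo1.jpg",  # placeholder URLs
    "https://example.com/photo2.jpg",
]

# One user message: a text part plus one image part per image.
content = [{"type": "text", "text": "Describe each image separately, numbered in order."}]
for url in image_urls:
    content.append({"type": "image_url", "image_url": {"url": url}})

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": content}],
)
print(response.choices[0].message.content)
```

Numbering the images in the prompt, as above, helps keep each description tied to the right image.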
