Image Interpretation - Are token limits practical?

POC: I’ve successfully created a custom GPT through the web interface using GPT-4, and it is highly effective at interpreting meal images and determining the ingredients, among other things. So the proof is there that OpenAI is capable of what I need to implement.

Now, attempting to implement this in my own code using the OpenAI APIs has hit a bottleneck. When I pass the image_url with my base64 image through the user message into the payload and post it, I receive this response:

Request too large for gpt-4-turbo-preview in organization org-{mine} on tokens per min (TPM): Limit 30000, Requested 37349. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more

This is even after resizing the image down considerably.

Resizing to below, say, 600px wide gets most images under the 30k token limit and around this issue, yet it presents another problem:

The image is then too low in resolution for the analysis to be accurate. The response often includes that very complaint, or simply returns inaccurate information.

I’ve tried a variety of models that support image analysis and even those that don’t without any success.

So my question is:

Am I missing something? I tried editing the limits to higher values in the organisation setup.

I’ve only recently set up my account, so even though I’ve added credits I’m apparently on the ‘Free tier’, and the next tier is ‘Usage tier 1’.

Do I have to wait or use this up first? How can I advance to a practical level of usage if I want to get this app out in one week’s time?

Thanks in advance for your thoughts and what you’ve experienced.

PS: I’m aware of other related questions, e.g. “hitting-rate-limit-on-gpt-4-vision-preview-with-first-query/479464”, but my question is how to speed up this process. I want to use the platform commercially, but this seems to be a barrier to real use cases.

PS2: I just took out the image and the prompt_tokens have dropped considerably.

usage: { prompt_tokens: 176, completion_tokens: 46, total_tokens: 222 },

There is something wrong with how the tokens are being calculated for my image.
Here’s the JavaScript that brings in the image:

      const response = await fetch(
        'https://www.domain.com/pesto-chicken-omelette-3.jpg'
        , {mode: 'no-cors'});
 
      const imgBuffer = await response.buffer();
      const resized = await sharp(imgBuffer).resize({width: 600}).toBuffer();  
      const base64image = `data:image/jpg;base64,`+resized.toString('base64');
     
      console.log('Response Blob ', base64image );
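
For context, response.buffer() here is the node-fetch v2 API, and mode: 'no-cors' is a browser-only option that Node ignores. Assuming Node 18+ with the built-in fetch, a rough sketch of the same step would be:

    import sharp from 'sharp';

    // Sketch only: same URL as above; the built-in fetch exposes arrayBuffer() rather than buffer().
    const response = await fetch('https://www.domain.com/pesto-chicken-omelette-3.jpg');
    const imgBuffer = Buffer.from(await response.arrayBuffer());

    const resized = await sharp(imgBuffer).resize({ width: 600 }).toBuffer();

    // 'image/jpeg' is the registered MIME type ('image/jpg' usually still works).
    const base64image = `data:image/jpeg;base64,${resized.toString('base64')}`;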

Now the chat completions:
Note: I’m using the official npm package openai


    const completion = await openai.chat.completions.create({
      model: 'gpt-4-turbo',
      temperature: 0.6,
      messages: [
     <my under 10 lines of system, content rules>
       { role: 'system', content: base64image }, // the user provided image
        ]
      });

According to the token calculators, this should not be exceeding 30K tokens. It should be under 700.
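
For reference, the image token accounting OpenAI documents for the vision-capable models is consistent with that estimate. A sketch of the calculation (the 600×450 example size is an assumption):

    // Sketch of the documented image token accounting for the vision models:
    // detail:"low"  -> a flat 85 tokens regardless of image size;
    // detail:"high" -> the image is fitted within 2048x2048, scaled so its
    //                  shortest side is at most 768, then billed 170 tokens
    //                  per 512x512 tile plus a base 85 tokens.
    function imageTokens(width: number, height: number, detail: 'low' | 'high'): number {
      if (detail === 'low') return 85;

      // Fit within 2048 x 2048, never upscaling
      let scale = Math.min(1, 2048 / Math.max(width, height));
      let w = width * scale;
      let h = height * scale;

      // Scale so the shortest side is at most 768
      scale = Math.min(1, 768 / Math.min(w, h));
      w *= scale;
      h *= scale;

      const tiles = Math.ceil(w / 512) * Math.ceil(h / 512);
      return 85 + 170 * tiles;
    }

    // e.g. a 600x450 image at detail:"high": 2 x 1 tiles -> 85 + 2 * 170 = 425 tokens,
    // which stays well under 700 once a short text prompt is added.
    console.log(imageTokens(600, 450, 'high')); // 425

So a request of roughly 37,000 tokens only makes sense if the base64 data URL is being counted as ordinary text.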

What happens if you format your message this way?

const completion = await openai.chat.completions.create({
    model: "gpt-4-turbo",
    messages: [
      { role: "system", content: "your under 10 lines of system, content rules" },
      {
        role: "user",
        content: [
          {
            type: "image_url",
            image_url: {
              url: base64image
            }
          },
        ],
      },
    ],
  });

I can confirm that an image request takes a huge “bite” out of your rate limit.

3 tokens of system message, 1 token of user message text, two base64-encoded PNGs (472 and 476 bytes) at detail:low for 170 tokens, max_tokens=10:

{'prompt_tokens': 185, 'completion_tokens': 10, 'total_tokens': 195}
(The PNG files themselves were 354 and 356 bytes before encoding.)

gpt-4-turbo response:
The images display the words "Apple" and "
Headers:
'x-ratelimit-limit-tokens': 2000000
'x-ratelimit-remaining-tokens': 1998451

The remaining-tokens header dropped by 1549 for a request whose reported usage was only 195 tokens.


def make_an_example():
    # ...stuff omitted
    # Prepare the body of the HTTP POST request that will be sent by the requests module
    user_content = [
            {
                "type": "text",  
                "text": user_message
            },
    ]

    for image in images:  # you would need to change the mime type for other than png
        image_entry = {
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{image}",
                          "detail": "low"}
        }
        user_content.append(image_entry)

    http_body_dictionary = {
        "model": model,  # This is the AI model compatible with the Chat Completions endpoint
        "top_p": top_p,  # Nucleus samping: Range 0.0-1.0, where lower number reduce bad tokens
        "max_tokens": max_tokens,  # This sets the maximum response the API will return to you
        "messages": [  # This is the list of messages format that Chat Completions requires
            {
                "role": "system",  # A system message gives the AI its identity, purpose
                "content": [
                    {
                        "type": "text",
                        "text": system_message
                    },
                ]
            },
            {
                "role": "user",
                "content": user_content
            }
        ],
    }

The maximum useful image you can send with this message format has its biggest dimension <= 2048 and its smallest dimension <= 768. Resize to the best size the AI will actually see.
(Arguably you should take a rate-limit hit if you send 20MB images when the AI will only ever see 2MB.)
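
In the OP’s Node/sharp context, that resize might look roughly like this (a sketch; the helper name is mine):

    import sharp from 'sharp';

    // Downscale so the longest side is <= 2048 and the shortest side is <= 768
    // (the sizes the model actually sees), without ever upscaling.
    async function resizeForVision(input: Buffer): Promise<Buffer> {
      const meta = await sharp(input).metadata();
      const width = meta.width ?? 0;
      const height = meta.height ?? 0;
      if (!width || !height) throw new Error('could not read image dimensions');

      const scale = Math.min(
        1,
        2048 / Math.max(width, height),
        768 / Math.min(width, height),
      );

      return sharp(input)
        .resize({
          width: Math.round(width * scale),
          height: Math.round(height * scale),
          fit: 'inside',
        })
        .jpeg()
        .toBuffer();
    }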


Thanks @supershaneski & @_j
I’ll get back to this later today and try both of your suggestions. I’m working in Node/TypeScript, so I’ll see if I can just bypass the SDK and post directly to the API. My current theory is that OpenAI is seeing my image data as just a really long text message instead of an image, and I suspect that is why it’s costing so many tokens. If I can post directly to the API using the proper image structure, the issue should be resolved. I’ll update you after my attempts.
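
One quick way to check that theory is to tokenize the data-URL string itself, e.g. with the js-tiktoken package (the package choice is just one option; the numbers are illustrative):

    import { getEncoding } from 'js-tiktoken';

    // base64image is the data-URL string built earlier in this thread.
    // If it is sent as plain message text, it is tokenized like any other string,
    // and base64 tokenizes badly -- roughly 2-3 characters per token.
    const enc = getEncoding('cl100k_base'); // the encoding used by the gpt-4 family
    console.log('data URL counted as text:', enc.encode(base64image).length, 'tokens');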


The workaround is not to send the exact same request using a different library or your own code that would send the same JSON body.

It would be to host the image on a server the API can reach by URL, thus possibly keeping the image data out of the rate-limit analysis. (That is something I can try in a bit.)
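
In the OP’s Node code that would just mean passing the hosted image’s URL instead of the data URL (a sketch; the URL is a placeholder):

    const completion = await openai.chat.completions.create({
      model: 'gpt-4-turbo',
      messages: [
        { role: 'system', content: 'your under 10 lines of system, content rules' },
        {
          role: 'user',
          content: [
            {
              type: 'image_url',
              image_url: {
                url: 'https://www.domain.com/pesto-chicken-omelette-3.jpg', // hosted image
                detail: 'low',
              },
            },
          ],
        },
      ],
    });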


Edit: uploading my images to a web server and referring to them by URL has an identical impact on the rate limit.

  • Base64 uses no more of the rate limit than a URL does.
  • The image size at detail:low doesn’t matter to the rate limit, whether 1k or 150k.