I am testing the capability of GPT-4o (Azure OpenAI) on image captioning by using the Python API. According to the OpenAI guide here, the token needed to process image with "detail": "low"
should take 85 of tokens. However, my experiment shows that only 73 tokens are being used. Am I missing something here?
This is part of the output from the GPT-4o response. The prompt_tokens
shows that only 73 of tokens are consumed (I only sent the image with no other prompt).
ChatCompletion(..., model='gpt-4o-2024-05-13', ..., usage=CompletionUsage(completion_tokens=97, prompt_tokens=73, total_tokens=170),...)