How do I calculate image tokens in GPT-4 Vision?

Does anyone know how the blank space is handled in tiles that only partially cover an image? For example, the diagrams in this blog post (OpenAI Visual Tokenizer Explained | by Tee Kai Feng | Medium) show that there will be “blank” space inside tiles that don’t fully cover the image.
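
To make the question concrete, here is a rough sketch of the tile counting as I understand it from OpenAI's published high-detail rules: fit the image within 2048 x 2048, scale it so the shortest side is 768px, then count 512px tiles at 170 tokens each plus a base of 85. The function name `estimate_image_tokens`, the downscale-only behaviour for small images, and the `blank_fraction` figure are my own assumptions for illustration. It only measures how much of the tile grid the image leaves uncovered; it says nothing about how the model actually treats that area.

```python
import math

def estimate_image_tokens(width, height, tile=512, max_side=2048, short_side=768):
    """Estimate high-detail image tokens and the uncovered share of the tile grid.

    Assumption: images are only ever scaled down, never up.
    """
    # Step 1: scale down to fit inside a max_side x max_side square.
    scale = min(1.0, max_side / max(width, height))
    w, h = width * scale, height * scale

    # Step 2: scale down so the shortest side is at most short_side.
    scale = min(1.0, short_side / min(w, h))
    w, h = w * scale, h * scale

    # Step 3: partially covered tiles are counted as whole tiles (ceiling).
    cols = math.ceil(w / tile)
    rows = math.ceil(h / tile)
    tiles = cols * rows

    # Share of the tile grid area that the image does not cover.
    blank_fraction = 1.0 - (w * h) / (tiles * tile * tile)

    return {
        "scaled_size": (w, h),
        "tiles": tiles,
        "tokens": 170 * tiles + 85,
        "blank_fraction": blank_fraction,
    }

if __name__ == "__main__":
    print(estimate_image_tokens(1024, 1024))
    print(estimate_image_tokens(2048, 4096))
```

Under these assumptions, a 1024 x 1024 image is scaled to 768 x 768 and billed as a 2 x 2 grid of tiles, so a large part of that grid is "blank" even though every tile is charged in full.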
