I’m writing a function to display the cost of each API call using tiktoken, as a learning exercise. I learned about tokens per message: a fixed overhead added for each message that doesn’t depend on the length of the message content. For gpt-4o, this is 3 tokens.
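For context, here is roughly how I’m counting tokens (a minimal sketch based on the commonly cited num_tokens_from_messages recipe; the 3-token overhead values and the assumption that message fields are plain strings are mine, so this may not match OpenAI’s billing exactly):

```python
import tiktoken

def num_tokens_from_messages(messages, model="gpt-4o"):
    """Rough token count for a chat request, including per-message overhead."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        # Older tiktoken releases may not know gpt-4o; o200k_base is its encoding.
        encoding = tiktoken.get_encoding("o200k_base")

    tokens_per_message = 3  # fixed overhead per message (my assumption for gpt-4o)
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for value in message.values():
            # Assumes every field (role, content, name, ...) is a plain string.
            num_tokens += len(encoding.encode(value))
    num_tokens += 3  # the reply is primed with <|start|>assistant<|message|>
    return num_tokens

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How many tokens is this?"},
]
print(num_tokens_from_messages(messages))
```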
What confuses me is how the totals are calculated: OpenAI adds 3 tokens of overhead per message, while other APIs such as Anthropic’s Claude 3.5 Sonnet count the actual content.
Given this, wouldn’t GPT-4o be significantly cheaper in most cases? Am I missing something in this comparison?
- GPT-4o: $5/million input tokens, $15/million output tokens
- Claude 3.5 Sonnet: $3/million tokens (flat rate)
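To make the comparison concrete, this is the arithmetic I have in mind, using hypothetical token counts and the rates I listed above (which may themselves be wrong, so treat this as a sketch of my reasoning, not a verified cost formula):

```python
# Hypothetical example: 1,000 input tokens and 500 output tokens per call,
# priced with the per-million-token rates listed above.
input_tokens, output_tokens = 1_000, 500

gpt4o_cost = input_tokens / 1e6 * 5 + output_tokens / 1e6 * 15
claude_cost = (input_tokens + output_tokens) / 1e6 * 3  # flat rate as I understand it

print(f"GPT-4o:            ${gpt4o_cost:.6f}")
print(f"Claude 3.5 Sonnet: ${claude_cost:.6f}")
```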