Based on the documentation, for a proper conversation we have to resend the full list of messages with every request. I would like to know how to count the tokens used in this case for cost estimation.
- Let’s say my system message/prompt is 1k tokens.
- The 1st user message is 1k tokens.
- The bot's response is 1k tokens as well.
Then this means we’ve used 3k tokens with the first message-response pair.
For the 2nd user message, we need to send the entire previous 3k tokens and also the new message (which is 1k tokens again) and the response is 1k tokens as well.
So we spend 4k (the 3k history plus the new 1k message) + 1k (the response) = 5k tokens for the 2nd message pair.
- The whole session would cost us 3k + 5k = 8k tokens, am I right?
- In real life, since the maximum input is 4096 tokens, we would get an error for the 2nd message, am I right?
Can you give me the formula for the calculation?
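
To make my arithmetic concrete, here is a minimal sketch of how I am counting. It assumes the full history is resent on every request, that each response is appended to the history, and that there is no per-message formatting overhead (which may not hold in practice). The function name `session_tokens` and the check against 4096 are my own illustration, not anything from the documentation; in particular, whether the limit applies to the prompt alone or to prompt plus response is an assumption here.

```python
CONTEXT_LIMIT = 4096  # figure from my question; assumed to cover prompt + response


def session_tokens(system_tokens, turns):
    """Return (prompt, response, total) token counts for each request.

    turns is a list of (user_tokens, response_tokens), one per exchange.
    Assumes the whole history is resent with every request.
    """
    history = system_tokens  # tokens carried into the next request
    per_request = []
    for user_tokens, response_tokens in turns:
        prompt = history + user_tokens      # everything sent this time
        total = prompt + response_tokens    # tokens counted for this request
        per_request.append((prompt, response_tokens, total))
        history = total                     # the response joins the history
    return per_request


requests = session_tokens(1000, [(1000, 1000), (1000, 1000)])
session_total = sum(total for _, _, total in requests)

for i, (prompt, response, total) in enumerate(requests, start=1):
    over = total > CONTEXT_LIMIT
    print(f"request {i}: prompt={prompt} response={response} "
          f"total={total} over_limit={over}")
print(f"session total: {session_total}")
```

With 1k taken as 1000 tokens, this gives 3000 tokens for the first request and 5000 for the second, 8000 for the session, which matches my numbers above. Is this the right way to calculate it?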