Query regarding GPT4 max tokens and pricing

Can you please help with the following scenario-based questions:

  • Model: GPT4
  • Input Tokens: 100
  • Max. Tokens: 500

Case A:

Output Tokens: 300 (within max tokens)

Questions:

  1. How many tokens will be charged: 100 + 300 or 100 + 500?

Case B:

Output Tokens: 800 (beyond max tokens)

Questions:

  2. How many tokens will be charged: 100 + 800?
  3. Please confirm whether we get the entire 800 tokens in the response.

In both cases above, assume we’re well within the model’s context window. My understanding is that the response will be truncated when it exceeds the context window.


You’re only charged for what you use, so in Case A it would be 100 + 300.

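If you set max_tokens to 500, it’ll never output 800 tokens. Generation stops at the 500-token limit, so Case B would be charged as 100 + 500 and the response would be cut off.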

Again, though, you’re just charged for tokens in and tokens out. Some models have two prices (one for input and one for output…)
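To make the arithmetic concrete, here is a minimal sketch in Python. It is not an official billing formula, just the rule described above: output is capped at max_tokens, and you pay for tokens in plus tokens actually generated. The function names and the per-1K prices are made up for illustration; check the current pricing page for real rates.

```python
def billed_tokens(input_tokens, wanted_output_tokens, max_tokens):
    """Tokens billed for one request, assuming output is capped at max_tokens."""
    # The model cannot generate more than max_tokens, whatever it "wanted" to say.
    output_tokens = min(wanted_output_tokens, max_tokens)
    return {
        "input": input_tokens,
        "output": output_tokens,
        "total": input_tokens + output_tokens,
        "truncated": wanted_output_tokens > max_tokens,
    }


def cost(usage, input_price_per_1k, output_price_per_1k):
    """Cost of one request when input and output tokens are priced separately."""
    return (usage["input"] / 1000 * input_price_per_1k
            + usage["output"] / 1000 * output_price_per_1k)


# Hypothetical placeholder prices, NOT real rates.
INPUT_PRICE_PER_1K = 1.0
OUTPUT_PRICE_PER_1K = 2.0

case_a = billed_tokens(input_tokens=100, wanted_output_tokens=300, max_tokens=500)
case_b = billed_tokens(input_tokens=100, wanted_output_tokens=800, max_tokens=500)

print(case_a["total"], case_a["truncated"])  # 400 False
print(case_b["total"], case_b["truncated"])  # 600 True
print(cost(case_a, INPUT_PRICE_PER_1K, OUTPUT_PRICE_PER_1K))
print(cost(case_b, INPUT_PRICE_PER_1K, OUTPUT_PRICE_PER_1K))
```

So Case A is billed as 100 + 300, and Case B as 100 + 500 with a truncated response, never 100 + 800.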
