When the API responds that you’ve exceeded the maximum token length for a specific model, do we get charged for the tokens in the request prompt and the response tokens that were never returned? Or are we charged just for the request prompt? Or are we not charged at all when the error occurs?
The request never makes it to an AI model; it is stopped by the tokenizer and the endpoint. So it makes no sense to be billed any more than for any other input error you created – otherwise I’d be on the hook for 12345678 max_tokens.
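For what it’s worth, you can see this from the client side: the oversized request comes back as a 400-class error with no completion attached. A minimal sketch, assuming the openai Python SDK v1.x (the model name, prompt, and max_tokens value are just for illustration):

```python
# Minimal sketch: an oversized request is rejected at the endpoint
# before any generation happens, so the error carries no completion.
from openai import OpenAI, BadRequestError

client = OpenAI()  # reads OPENAI_API_KEY from the environment

try:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model choice
        messages=[{"role": "user", "content": "some very long prompt ..."}],
        max_tokens=12345678,  # deliberately absurd, as above
    )
except BadRequestError as err:
    # Rejected before generation: per the point above, nothing ran,
    # so there is nothing to bill for the unreturned completion.
    print("Request rejected:", err)
```

You can also avoid hitting the limit in the first place by counting the prompt’s tokens client-side (e.g. with tiktoken) and keeping prompt tokens plus max_tokens under the model’s context window.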
PS: I’ve got a new billing game to play: minimum input, maximum output. Only costs $0.07 if you win.