indeed, the number of tokens counted in usage seems to correspond to what the tokeniser indicates. it seems that the number of tokens explodes because of the French language, as there are a lot of accents. For example, the text tested here is 159 words long for 277 tokens, which makes a ratio of +74%.
This is enormous and the increase in the number of French users on my app means that my costs are skyrocketing, even though I had calibrated my rates on the basis of 25%.
Too bad, I’ll deal with it.
Thanks for your answers
