How are tokens counted when I use the retrieval feature?
I have a really long document.
The token usage is completely impenetrable.
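The model's context might be loaded with as much of the conversation thread as will fit, which you are not given a report of, and then also with as many embeddings-retrieved document chunks as will fit, which you are not given a report of either.
Then, if you’re lucky, it doesn’t call functions or code interpreter over and over, because every extra call is another pass through the model.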
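To put a rough ceiling on what a single run could cost under the behavior described above, here is a minimal sketch. It assumes the worst case, where every model call in the run fills the whole context window. The function names and all the numbers (window size, output reserve, number of calls, price) are hypothetical placeholders, not values reported by OpenAI.

```python
def worst_case_input_tokens(context_window: int,
                            reserved_for_output: int,
                            model_calls: int) -> int:
    """Upper bound on input tokens for one run.

    Assumes (worst case) that each model call in the run is sent a
    completely full context: thread history plus retrieved chunks.
    """
    per_call = max(context_window - reserved_for_output, 0)
    return per_call * model_calls


def estimated_cost(tokens: int, price_per_1k_tokens: float) -> float:
    """Convert a token count to a cost at a given per-1k-token price."""
    return tokens / 1000 * price_per_1k_tokens


# Hypothetical figures, for illustration only.
tokens = worst_case_input_tokens(
    context_window=128_000,     # assumed model context size
    reserved_for_output=4_096,  # assumed room left for the reply
    model_calls=3,              # e.g. initial call plus two tool-call rounds
)
print(tokens)  # 371712
print(estimated_cost(tokens, price_per_1k_tokens=0.01))
```

The actual bill may well be lower; the point is only that the ceiling scales with the number of model calls in a run, not with the length of your question.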
The only way you’d discover the usage is to make only one “run” per UTC day, and then see what you were billed for that day once the new usage is added a day later.
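If you want to try the one-run-per-day approach, a small helper can tell you which UTC days had exactly one run, so that day's billed usage can be attributed to that run alone. This is only a sketch: the function names are made up, and it assumes you log a timezone-aware timestamp yourself each time you start a run.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def utc_day(ts: datetime) -> str:
    """Return the UTC calendar day (YYYY-MM-DD) of a timezone-aware timestamp."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).date().isoformat()


def attributable_days(run_times: list[datetime]) -> list[str]:
    """UTC days with exactly one run, i.e. days whose bill maps to one run."""
    counts = Counter(utc_day(ts) for ts in run_times)
    return sorted(day for day, n in counts.items() if n == 1)


# Hypothetical log of run start times.
runs = [
    # 23:30 at UTC-5 is already 04:30 on Jan 2 in UTC.
    datetime(2024, 1, 1, 23, 30, tzinfo=timezone(timedelta(hours=-5))),
    datetime(2024, 1, 2, 10, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 3, 9, 0, tzinfo=timezone.utc),
]
print(attributable_days(runs))  # ['2024-01-03']
```

Note that a late-evening run in your local time can land on the next UTC day, which is why the first two runs above collide.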
The size of the document doesn’t matter, unless it is so small that the AI context isn’t filled with document chunks for any question.
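The point about document size can be sketched as a simple cap. This assumes retrieval fills whatever context budget is available and no more; the budget figure and the function name are hypothetical, since the real limit is not reported.

```python
def retrieved_tokens(document_tokens: int, retrieval_budget: int) -> int:
    """Tokens of document text placed in context for one question.

    Assumes retrieval is capped by the context budget, so anything
    beyond the budget adds nothing to per-question input tokens.
    """
    return min(document_tokens, retrieval_budget)


budget = 100_000  # assumed space available for document chunks
for size in (2_000, 500_000, 5_000_000):
    print(size, retrieved_tokens(size, budget))
# 2000 2000
# 500000 100000
# 5000000 100000
```

Under this assumption, a 500k-token document and a 5M-token document cost the same per question; only the very small one is cheaper.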
All of this is further clouded by the per-assistant data storage bill and the per-session code interpreter bill.
That’s very unclear for users, and OpenAI is being irresponsible about it!