Did I do something wrong? :blankface:
My usage page shows an unbelievably large number of tokens consumed across just 4 requests.
Local time: Jun 8, 2023 at 3:40 PM
text-embedding-ada-002-v2, 4 requests
5,543,058 prompt + 0 completion = 5,543,058 tokens
Please let me know if this is correct.
What did you convert to vectors?
Or what kind of “service” were you using?
It says “text-embedding-ada-002-v2”, which is an embedding model used to encode text into vectors.
I hope this helps you.
I was encoding the LangChain documentation for later similarity search. I had another report indicating over 11M tokens in 8 queries.
There's no way that many tokens were consumed. Chunk sizes ranged from 200 to 2,000, and at a chunk size of 200 only about 3,400 documents were encoded.
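For scale, here is a quick back-of-envelope check (a rough sketch, assuming the chunk size counts characters and roughly 4 characters per token, both just rules of thumb):

```python
chunks = 3_400
chunk_size = 200           # characters per chunk (assumed unit)
chars_per_token = 4        # rough average for English prose
tokens_per_run = chunks * chunk_size // chars_per_token
print(tokens_per_run)      # 170000 — far below the 5.5M reported, for a single pass
```

Under those assumptions, a single pass over the 200-character chunks would account for only a small fraction of the reported usage.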
Well, it’s saying roughly 5 million tokens were used. At a (very rough) average of 4 characters per token, that’s about 20 MB (±5 MB) of text, so if you check the size of the documentation, it should roughly match up.
If the text contains a lot of code, the average can drop well below 4 characters per token, down to 1.5–2, so the documentation could potentially be in the 10 MB range.
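That estimate is easy to sanity-check in a couple of lines (again, the 4-characters-per-token figure is only a rule of thumb, not an exact tokenizer count):

```python
tokens = 5_543_058               # reported prompt tokens
chars_per_token = 4              # rough average for English prose
approx_megabytes = tokens * chars_per_token / 1_000_000
print(round(approx_megabytes))   # 22 — i.e. roughly 20 MB of raw text
```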
Without seeing your actual code we cannot be certain, but I have read accounts from others who saw high usage because their code was looping endlessly.
The docs are roughly 25MB so not far from your estimate.
I ran the embeddings several times while trying different vector stores and different chunking, and I guess I lost track of how many times it requested the embeddings. I have several usage entries just over 5M tokens and one over 11M tokens. That certainly seems in the right ballpark for what I was doing.
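The multiplication here is easy to see with a rough sketch (assuming ~4 characters per token and that each re-chunking run re-embeds the full corpus):

```python
def estimate_tokens(num_chars, chars_per_token=4):
    # Rule-of-thumb estimate; a real tokenizer (e.g. tiktoken) gives exact counts
    return num_chars // chars_per_token

doc_chars = 25_000_000   # ~25 MB of documentation
runs = 4                 # each run with new chunking re-sends everything
total = estimate_tokens(doc_chars) * runs
print(total)             # 25000000 — repeated runs quickly add up
```

So a handful of full re-embedding passes over a 25 MB corpus can plausibly reach the usage numbers reported above.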
Thanks to all for looking.
The good news is I can now do Q&A over the LangChain documentation. :^)