I’m using LlamaIndex, a library that uses the in-context learning strategy. I found that a pretty short question with a one-paragraph answer may cost 3-4 thousand tokens. Is that normal, or is it too many?
Thanks!
Welcome to our growing dev community.
I mean, I guess it depends on your definition of “normal usage”? I’m not familiar with LlamaIndex, but “in-context learning strategy” hints that they might be adding more content around your “short” question, which is how it ends up at 3k-4k tokens.
Hi!
Thanks for your response.
> I’m not familiar with LlamaIndex, but “in-context learning strategy” hints that they might be adding more content around the “short” question which is where it ends up 3k or 4k.
I guess they do, but why does ChatGPT count them? What is actually behind ChatGPT’s “usage” token number? At first I thought it was just the number of tokens in the answer, but now I see it includes something else, and I wonder what that is.
Are you talking about the API and using the gpt-3.5-turbo model? Or are you using a ChatGPT plugin? Where are you seeing the token count?
Token count includes the prompt AND the response. You can play around with this to get a sense of how many tokens a given piece of text uses: OpenAI API
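You can also check this locally with OpenAI’s tiktoken library, which uses the same encodings the API bills with. A minimal sketch (the prompt and response strings are just placeholders):

```python
import tiktoken

# encoding_for_model picks the tokenizer that the given model actually uses.
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

prompt = "What is in-context learning?"
response = "In-context learning means the model is conditioned on extra text..."

# Usage is billed for BOTH sides of the exchange.
prompt_tokens = len(enc.encode(prompt))
response_tokens = len(enc.encode(response))
print("prompt tokens:  ", prompt_tokens)
print("response tokens:", response_tokens)
print("total (billed): ", prompt_tokens + response_tokens)
```

If you call the chat API directly, the response also carries its own breakdown in the `usage` field (`prompt_tokens`, `completion_tokens`, `total_tokens`), which is exactly what’s behind that “usage” number.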
I’m using the ChatGPT API via LlamaIndex. LlamaIndex counts the tokens spent, and I can grab that information. It doesn’t provide a breakdown, though.
Yeah, I know that tool, but to use it I’d need to know exactly what gets sent to the model. I think the extra usage probably comes from the prompt that is passed along as “in-context” learning data. Thanks.
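Update: it looks like newer LlamaIndex releases can give a prompt/completion breakdown via a token-counting callback. The import paths have moved between versions, so treat this as a sketch rather than a definitive recipe:

```python
import tiktoken
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, TokenCountingHandler

# Count with the same encoding gpt-3.5-turbo uses, so the numbers
# line up with what OpenAI bills.
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
service_context = ServiceContext.from_defaults(
    callback_manager=CallbackManager([token_counter])
)

# ... build the index / query engine with service_context, run a query ...

print(token_counter.prompt_llm_token_count)      # question + injected context
print(token_counter.completion_llm_token_count)  # the model's answer
print(token_counter.total_llm_token_count)       # what shows up as "usage"
```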
I think the large token usage comes from the in-context feature. With any language model, ChatGPT or others, the framework takes your query and joins some paragraphs to it for context. Assuming the same is happening with the in-context feature here, the input sent to the model is the joined string, and that should explain the large token usage.
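To make that concrete, here’s a rough sketch of what the framework effectively sends. The template and paragraphs are made up for illustration; the point is that the bare question is tiny, but the joined string is not:

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

question = "When was the company founded?"

# Stand-ins for the paragraphs a retrieval framework pulls from your documents.
retrieved_paragraphs = [
    "Paragraph one of background material pulled from your index...",
    "Paragraph two with more context...",
    "Paragraph three, and so on...",
]

# The model never sees the bare question; it sees the joined string.
stuffed_prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n\n".join(retrieved_paragraphs) + "\n\n"
    "Question: " + question
)

print("bare question tokens:", len(enc.encode(question)))        # tiny
print("actual prompt tokens:", len(enc.encode(stuffed_prompt)))  # much larger
```

With a few real retrieved paragraphs plus the instruction template, a one-line question can easily turn into thousands of prompt tokens, which matches the 3-4k you’re seeing.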