Help on pricing a chatbot?

Conversation data from 8/26 to 8/28 (one conversation, ten messages). Word counts are converted to tokens at roughly 1.33 tokens per word, and the "compounded" columns are the cumulative word totals converted to tokens:

| Conversation | Message | System Prompt Tokens | User Words | Total User Words | Compounded User Tokens | Bot Words | Total Bot Words | Compounded Bot Tokens |
|---|---|---|---|---|---|---|---|---|
| 3 | 1 | 1,478 | 9 | 9 | 12.0 | 78 | 78 | 103.7 |
| 3 | 2 | 1,478 | 4 | 13 | 17.3 | 65 | 143 | 190.2 |
| 3 | 3 | 1,478 | 11 | 24 | 32.0 | 91 | 234 | 311.2 |
| 3 | 4 | 1,478 | 11 | 35 | 46.7 | 82 | 316 | 420.3 |
| 3 | 5 | 1,478 | 9 | 44 | 58.7 | 81 | 397 | 528.0 |
| 3 | 6 | 1,478 | 7 | 51 | 68.0 | 77 | 474 | 630.4 |
| 3 | 7 | 1,478 | 11 | 62 | 82.7 | 94 | 568 | 755.4 |
| 3 | 8 | 1,478 | 15 | 77 | 102.7 | 60 | 628 | 835.2 |
| 3 | 9 | 1,478 | 30 | 107 | 142.7 | 127 | 755 | 1,004.2 |
| 3 | 10 | 1,478 | 15 | 122 | 162.7 | 79 | 834 | 1,109.2 |
| Total | | 14,780 | | | 725 | | | 5,888 |

At GPT-4 rates ($0.03 per 1K prompt tokens for the system prompt and user messages, $0.06 per 1K completion tokens for the bot):

| System Prompt | User | Bot | Total |
|---|---|---|---|
| $0.44 | $0.02 | $0.35 | $0.82 |
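
Here is a rough sketch of the arithmetic the spreadsheet seems to be doing, as far as I can tell (the GPT-4 rates and the token totals come straight from the table above; nothing here is verified against real usage):

```python
# Sketch of the spreadsheet's math, not verified against real usage.
system_prompt_tokens = 14_780   # 1,478-token system prompt sent on each of 10 messages
user_prompt_tokens = 725        # compounded (cumulative) user tokens
bot_tokens = 5_888              # compounded (cumulative) bot tokens

PROMPT_RATE = 0.03 / 1000       # GPT-4: $0.03 per 1K prompt tokens
COMPLETION_RATE = 0.06 / 1000   # GPT-4: $0.06 per 1K completion tokens

cost = (
    system_prompt_tokens * PROMPT_RATE   # ~$0.44
    + user_prompt_tokens * PROMPT_RATE   # ~$0.02
    + bot_tokens * COMPLETION_RATE       # ~$0.35, billed here at the completion rate
)
print(f"~${cost:.2f} per 10-message conversation")  # ~$0.82
```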

Does this seem accurate? I didn't run these numbers, but it seems like the bot tokens are a little high. Typically it's my prompt + user tokens that are really high, not the completion tokens.

Is the bot token count skewed too high in this?

I have a chatbot that uses GPT-4, and I'm trying to figure out a monthly subscription price. I was thinking $7, but now I don't know; if this is correct I'll have to rethink it.

But for instance, this is actual data:

- Sep 1, 2023, 9:00 AM local: gpt-4-0613, 2 requests, 3,149 prompt + 283 completion = 3,432 tokens
- Sep 1, 2023, 9:05 AM local: gpt-4-0613, 4 requests, 8,638 prompt + 470 completion = 9,108 tokens
- Sep 1, 2023, 9:10 AM local: gpt-4-0613, 5 requests, 12,945 prompt + 666 completion = 13,611 tokens
- Sep 1, 2023, 9:25 AM local: gpt-4-0613, 2 requests, 3,794 prompt + 301 completion = 4,095 tokens
- Sep 1, 2023, 9:30 AM local: gpt-4-0613, 1 request, 2,202 prompt + 79 completion = 2,281 tokens
- Sep 1, 2023, 9:35 AM local: gpt-4-0613, 1 request, 2,290 prompt + 89 completion = 2,379 tokens

So is my accountant calculating the bot tokens too high?

Feel free to ignore this if you want, because I understand it's maybe a frustrating question since my accountant used hypothetical numbers.

Actually, wait.

So is it that the bot sends a completion, and then in the next message the user sends everything back, including the bot's history, but all of that counts as prompt tokens, not completion?

So the completion tokens don't compound on their own - they only compound because they're sent back as prompt tokens.

Am I right?

  1. Stop even counting words. It’s completely meaningless and only confuses things. There’s no such thing as a “word.”
  2. You have all of the prompts and responses; you should be able to just see exactly what is happening.

I know man, I'm sorry, I'm sorry. Can you answer my last question there?

Yes.

If you have a chatbot, you’re likely sending the entire conversation every request in order to get a new response.

So,

Message 1

System: You are a helpful assistant.
User: Hi

Response 1

Assistant: Hello, how can I help you today?

Message 2

System: You are a helpful assistant.
User: Hi
Assistant: Hello, how can I help you today?
User: Can you explain how conversation history works for an LLM based chatbot?

Response 2

Assistant: Certainly! Conversation history…

Message 3

System: You are a helpful assistant.
User: Hi
Assistant: Hello, how can I help you today?
User: Can you explain how conversation history works for an LLM based chatbot?
Assistant: Certainly! Conversation history…
User: Does that mean that token use grows exponentially as the conversation goes on?

Response 3

Assistant: Yes.
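
For concreteness, here is a minimal sketch of what those three requests look like in code. It uses the openai Python SDK style that was current at the time (`openai.ChatCompletion.create`); the model name and the wiring are illustrative, not your exact setup:

```python
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

SYSTEM = {"role": "system", "content": "You are a helpful assistant."}
history = []  # user and assistant turns accumulate here across the conversation

def send(user_text):
    history.append({"role": "user", "content": user_text})
    # Every request re-sends the system message plus the full history so far,
    # so prompt tokens grow with each turn; only the new reply counts as
    # completion tokens.
    response = openai.ChatCompletion.create(
        model="gpt-4-0613",  # illustrative model name
        messages=[SYSTEM] + history,
    )
    reply = response["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

send("Hi")
send("Can you explain how conversation history works for an LLM based chatbot?")
send("Does that mean that token use grows exponentially as the conversation goes on?")
```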


So does it send the system message every time with the user message? Or does it send that one time per conversation?

This is a fantastic response. Could you edit it to explain the system prompt as well? Then I would mark it as the solution, because no one else would ever have to ask again on this forum, haha.

Think in terms of requests you are making to the API, not conversations. With each request you are likely sending: the system message, the past message history, and the latest user message. You're charged for the tokens of that system message, message history, and user message, plus the response, on each request.

Right, that's what I thought.

So if I'm on message ten, it has sent my system prompt ten times, correct?

In a traditional chatbot application, yes very likely you would send the system message with every request. So after 10 messages you’d have sent the system message 10 times. And the 10th message may have all of the previous user and assistant messages as well (depending on token limits).
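
Using the 1,478-token system prompt from your spreadsheet (that figure is an assumption taken from your table, not something I measured), the system prompt's share alone works out like this:

```python
# Sketch: the system prompt's contribution to prompt tokens over one conversation.
SYSTEM_PROMPT_TOKENS = 1_478  # assumed from the spreadsheet earlier in the thread

def system_prompt_tokens_sent(num_messages: int) -> int:
    # The system message is re-sent with every request, so its token cost is
    # linear in the number of messages, not a one-time charge per conversation.
    return num_messages * SYSTEM_PROMPT_TOKENS

print(system_prompt_tokens_sent(10))  # 14780 -- matches the spreadsheet's total
```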

Okay, so how can I prevent my system message from compounding so much? How could I limit this? That's basically what is killing our usage.

Done. Yes, the system message goes out with every API call.


There’s no free lunch. If you need the system message to get the AI to behave how you want, then you need to send it on each request. This is where the experimentation/art comes in to play. Experiment with changing your system message. Is the full thing needed? Can you reduce it and still get the behavior you want?

  1. Use a smaller system message.
  2. Create a fine-tuned model that accomplishes the same effect as using the system prompt.
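
To put rough numbers on option 1 (the $0.03 per 1K prompt-token rate and the 1,478-token system prompt come from earlier in the thread; the 700-token trimmed prompt is just a hypothetical target, not a recommendation):

```python
# Sketch: per-conversation cost of the system prompt alone, before and after trimming.
PROMPT_RATE_PER_TOKEN = 0.03 / 1000  # GPT-4 prompt pricing: $0.03 per 1K tokens

def system_prompt_cost(system_tokens: int, messages: int) -> float:
    # The system prompt is charged on every request in the conversation.
    return system_tokens * messages * PROMPT_RATE_PER_TOKEN

print(f"${system_prompt_cost(1_478, 10):.2f}")  # ~$0.44 for a 10-message conversation
print(f"${system_prompt_cost(700, 10):.2f}")    # ~$0.21 with a hypothetical 700-token prompt
```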

I have marked your previous response as the solution!


We are planning on doing that, just have to get the data first.

Okay great, thank you