Help Needed! Unrealistically huge prompt numbers costing me all my money

OK. I've created a simple chatbot. When I input a question, say “What are your hours on Tuesday?”, the intent-recognition function recognizes it as a ‘store_hours’ intent and runs that function. Inside that function, OpenAI is used to create a response with the info in the function. At first, token usage was:
gpt-3.5-turbo-0613, 1 request
7 prompt + 1 completion = 8 tokens

Then, for some unknown reason, it skyrocketed:

gpt-4-0613, 5 requests
28,561 prompt + 1,712 completion = 30,273 tokens

gpt-4-0613, 8 requests
48,023 prompt + 2,451 completion = 50,474 tokens

gpt-4-0613, 10 requests
58,301 prompt + 3,804 completion = 62,105 tokens

gpt-4-0613, 10 requests
46,408 prompt + 2,728 completion = 49,136 tokens

gpt-4-0613, 6 requests
39,553 prompt + 1,734 completion = 41,287 tokens

gpt-4-0613, 9 requests
41,756 prompt + 2,772 completion = 44,528 tokens

gpt-4-0613, 10 requests
59,351 prompt + 3,239 completion = 62,590 tokens

gpt-3.5-turbo-0301, 1 request
11 prompt + 2 completion = 13 tokens

gpt-4-0613, 8 requests
36,943 prompt + 1,908 completion = 38,851 tokens

gpt-4-0613, 9 requests
42,677 prompt + 2,379 completion = 45,056 tokens

gpt-3.5-turbo-16k-0613, 1 request
7,483 prompt + 292 completion = 7,775 tokens

gpt-4-0613, 7 requests
41,194 prompt + 2,713 completion = 43,907 tokens

gpt-3.5-turbo-16k-0613, 2 requests
13,588 prompt + 495 completion = 14,083 tokens

gpt-4-0613, 10 requests
36,839 prompt + 2,736 completion = 39,575 tokens

gpt-4-0613, 9 requests
46,904 prompt + 3,126 completion = 50,030 tokens

gpt-3.5-turbo-16k-0613, 1 request
7,122 prompt + 235 completion = 7,357 tokens

gpt-4-0613, 10 requests
43,811 prompt + 3,097 completion = 46,908 tokens

gpt-3.5-turbo-16k-0613, 1 request
7,494 prompt + 342 completion = 7,836 tokens

If anyone could give me a clue as to what is happening here, that would be great. I've been putting $20 lots in, and when I start the app in the morning I get the “ERROR:root:Error: You exceeded your current quota, please check your plan and billing details.” error in the terminal. I also stopped the app from running overnight to make sure money wasn't getting thrown down the drain.

Welcome to the community @s.hutcheson06

In my experience, this is an issue with how you're consuming the API in your code. It will be much easier to diagnose if you share the code that's dealing with messages, function definition and calling, and the API calls themselves.

1 Like

GPT-4 is being used.

You likely used your API key in some site or extension designed to steal your key and abuse it for generating text (like making Chinese clone AIs).

Make a new API key and delete all the others. Only use your API key in your own software located on servers under your control.

Option 2: you used software like langchain with no idea how to use it.

2 Likes

This could also be a possibility @s.hutcheson06. You're keeping the API key on your secure system and not in the client-side code, right?

Or you have built a feedback loop for context that you are not clearing out.

I have done this myself; it's easy to do if you want context.

Check your usage against the limits: the token limit is 8,192 for GPT-4, and a few pages of text will hit it.

What else are you feeding your calls? GPT-4 is expensive; do you really need those calls?
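
To illustrate the kind of feedback loop mentioned above, here is a minimal sketch (all names hypothetical, not the poster's actual code): every turn is appended to a shared `history` list and the whole list is resent on each call, so the prompt grows with every exchange.

```python
# Sketch of an uncleared context feedback loop: the full history is
# resent on every turn, so prompt size grows steadily over the session.
# `buggy_turn` and `fake_model_reply` are illustrative names only.
history = []

def buggy_turn(user_text, fake_model_reply):
    """Append the user turn, 'send' the whole history, then append the reply."""
    history.append({"role": "user", "content": user_text})
    prompt = list(history)            # entire history resent every turn
    history.append({"role": "assistant", "content": fake_model_reply})
    return len(prompt)                # number of messages sent this turn
```

Three turns in a row send 1, 3, then 5 messages; over a long session the prompt-token bill grows roughly quadratically, which matches the 40k–60k prompt-token batches in the usage log above.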

Hi

It seems you're seeing a sudden surge in token usage with your chatbot. One likely explanation is that you're passing the entire conversation history to the OpenAI API with each request; if so, token consumption grows proportionally as the conversation expands. Additionally, if you're using multiple functions, each one adds to the overall token count, since all function definitions are included in the prompt so the model can decide which one to use. It's worth monitoring and capping the length of the conversation history, and also reducing the number of functions. Try streamlining your functions and testing to see if that impacts token consumption.
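
One common way to cap history is to keep the system message and only as many recent turns as fit a token budget. A minimal sketch, assuming OpenAI-style message dicts and using a crude ~4-characters-per-token estimate instead of a real tokenizer (`trim_history` and the budget are illustrative assumptions, not part of any API):

```python
# Sketch: cap conversation history before each API call.
def trim_history(messages, max_prompt_tokens=2000):
    """Keep the system message plus the most recent turns that fit the budget."""
    def rough_tokens(msg):
        return len(msg.get("content", "")) // 4 + 4  # crude per-message estimate

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    kept = []
    budget = max_prompt_tokens - sum(rough_tokens(m) for m in system)
    for msg in reversed(rest):        # walk newest-first
        cost = rough_tokens(msg)
        if cost > budget:
            break
        kept.append(msg)
        budget -= cost
    return system + list(reversed(kept))
```

For accurate counts you'd swap the heuristic for a tokenizer such as `tiktoken`, but even this rough cap stops the prompt from growing without bound.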

Maybe…

  1. Your API key has been compromised (do not store the key on the front-end side).
  2. Your app is stuck in a loop and keeps sending requests continuously (print logs, run some tests before release, and consider logging just before each request to OpenAI).

This is now the “random guesses for someone who hasn’t been seen since posting” topic.

You don't get GPT-4 maxed out at 8,000 tokens a dozen times per 5-minute interval until an account is emptied by an “oops, too much gpt-3.5 conversation…”