Hi!
For an app that generates stories, I'm using the OpenAI Playground.
I created an assistant and programmed its system prompt: it works perfectly.
But I noticed a lot of tokens used on the "usage" page. So I'm wondering: when we create a new thread for an assistant, do we pay for the system prompt programmed into the assistant as input tokens, or is it really free, so we only pay for the user's input and the response?
I'm also wondering whether an assistant is the right choice here, since there's no back-and-forth: just one response (the generated story), and then the thread ends.
Damn it, I should remember this…
Someone will correct me, but it's basically:
where "U" is a user message:

completion 1: system prompt + U1 → response 1
completion 2: system prompt + U1 + response 1 + U2 → response 2
completion 3: system prompt + U1 + response 1 + U2 + response 2 + U3 → response 3
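The growth above can be sketched with some arithmetic. All token counts here are made-up example numbers, not real measurements:

```python
# Rough illustration of how billed input tokens accumulate per completion.
# The per-message sizes below are hypothetical placeholders.

SYSTEM = 500   # system prompt tokens (hypothetical)
USER = 100     # tokens per user message (hypothetical)
REPLY = 300    # tokens per model response (hypothetical)

def input_tokens(turn: int) -> int:
    """Input tokens billed for completion number `turn` (1-based):
    the system prompt + every user message so far + every earlier reply."""
    return SYSTEM + turn * USER + (turn - 1) * REPLY

for turn in range(1, 4):
    print(f"completion {turn}: {input_tokens(turn)} input tokens")
```

Each turn re-sends everything before it, so the input bill grows linearly with the conversation even though each new user message is small.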
So the bot's reply is billed once at the higher output-token rate, at the moment it's generated. On the next message, that same reply is sent back as part of the conversation, but then it counts as input tokens.
Basically, the system prompt and the entire conversation are resent every time. Everything sent back to the model, including its own earlier replies, is billed as input tokens; output tokens are only charged once, when the reply is generated.
If memory and context don't matter, then you don't need an assistant; you just need Completions if what you're doing is one-shot. None of it is free: you're paying for all tokens, both input and output.
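For the one-shot case, a minimal sketch of a Chat Completions request body looks like this (the model name and prompt text are placeholders, not recommendations):

```python
# One-shot request body for the Chat Completions endpoint:
# no thread, no stored history; the system prompt is sent once per call.
payload = {
    "model": "gpt-4o-mini",  # placeholder model name
    "messages": [
        {"role": "system", "content": "You are a story generator. ..."},
        {"role": "user", "content": "Write a short story about a lighthouse."},
    ],
}

# Every message in the list is billed as input tokens;
# the generated story comes back as output tokens.
roles = [m["role"] for m in payload["messages"]]
print(roles)  # prints ['system', 'user']
```

Since each call is independent, you pay for the (large) system prompt on every story, but never for an accumulating conversation history.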
Hi, thanks for your response!
So, money-wise, using an assistant or a plain system prompt costs the same?
If I generate several stories in the same thread, I'd pay for the size of the system prompt only once, right? (Our system prompt is quite large.)
Would that be the cheapest way to generate stories with my system prompt, or is it the same everywhere?