Need help on how to approach an API usage metric for users of my app

Hi guys,

Any clear instructions for a noob on how to approach this problem?

If I build my app's API on top of OpenAI, how would I count the tokens used by each app user?

Ideally I would log the volume of tokens used for each user request/response and store it in my backend, so that I can generate a monthly bill based on the volume used.

Setup would be something like so:

  • My backend connects to the OpenAI API using my key only.
  • When a user interacts with my public API to get results, my backend logs the tokens they used for each request/response.
  • Along with each API response I send the token volume of the response, so the user sees the “request cost”.
  • At the end of the month I generate a bill for each user based on their volume.
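A minimal sketch of that logging-and-billing loop might look like the following. Everything here is a hypothetical stand-in (an in-memory dict instead of a real database, a mocked API reply), assuming each completion response carries a `usage` object with prompt/completion/total token counts:

```python
from collections import defaultdict

# Hypothetical in-memory store; a real backend would use a database
# table keyed by API user ID.
usage_log = defaultdict(list)

def log_usage(user_id, response):
    """Record one request's token counts against the calling user.

    `response` is the parsed JSON from the completions endpoint,
    whose `usage` object reports prompt, completion and total tokens.
    """
    usage = response["usage"]
    usage_log[user_id].append({
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
    })
    return usage["total_tokens"]

def monthly_total(user_id):
    """Sum a user's total tokens for the billing period."""
    return sum(r["total_tokens"] for r in usage_log[user_id])

# Mocked response, standing in for a real API reply:
fake_response = {"usage": {"prompt_tokens": 120,
                           "completion_tokens": 80,
                           "total_tokens": 200}}
log_usage("user_42", fake_response)
```

The monthly bill is then just `monthly_total(user_id)` times whatever per-token rate you choose.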

Hope the explanation was not too messy :wink:

Thanks


Hi there, your setup looks great! You could simply pass the user input through a tokenizer and log the token counts with your corresponding rates.


So if I get you right, I need to pass the user input through the tokenizer before sending the request to OpenAI. What about the OpenAI reply?

Use case example:

Generate reply to a comment:

  1. receive the API user's request at my .comments/reply endpoint
  2. extract the user comment and metadata from the request
  3. use the search endpoint to get the most relevant comments and metadata from the provided file
  4. construct the prompt: instructions + the most relevant examples of comments and replies (stored in metadata) + the user comment + other user comment metadata
  5. send the prompt to the completions endpoint for the engine to reply to the comment
  6. get the response from OpenAI
  7. log the total tokens used in this API request to the backend, associated with the API user's ID
  8. respond to the API request with the AI's reply and the token volume
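One thing that may help with the tokenizer-placement question: if the completion response includes a `usage` object (current OpenAI completion responses report one), steps 6 and 7 can read the exact counts straight from it, with no separate tokenizer pass. A sketch with the API call mocked out and all names hypothetical:

```python
# Sketch of steps 5-8, with the OpenAI call mocked. In a real handler,
# `call_completion` would POST the prompt to the completions endpoint.
def call_completion(prompt):
    # Mocked response shaped like a completions reply with a `usage` object.
    return {
        "choices": [{"text": "Thanks for your comment!"}],
        "usage": {"prompt_tokens": 150, "completion_tokens": 12,
                  "total_tokens": 162},
    }

def reply_to_comment(prompt, user_id, backend_log):
    response = call_completion(prompt)      # steps 5-6
    total = response["usage"]["total_tokens"]
    backend_log.append((user_id, total))    # step 7: log against user ID
    return {                                # step 8: reply + token volume
        "reply": response["choices"][0]["text"],
        "tokens_used": total,
    }

log = []
result = reply_to_comment("instructions + examples + comment", "user_7", log)
```

If your responses don't carry usage data, you can fall back to tokenizing the prompt between steps 4 and 5 and the reply between steps 6 and 7.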

At what point should I use the tokenizer to get the token volume? Between steps 4 & 5, or between 6 & 7?


Do you mean the difference in rates between the engines, or my own rates applied to the tokens used?

At which step(s) you call the tokenizer is totally up to you. If you want to count the tokens of a user's input, you can count just those, or you could count the completion tokens, or both.

In terms of rates, that's again up to you. Maybe your pricing is based on the actual pricing of the engines plus a margin, or on something separate.

Yes, both input and output, so that I end up with the total “cost” of the transaction before I log it to my backend and send it to the user. How do I count both?
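For what it's worth, counting both sides with a tokenizer might look like the sketch below. The whitespace split is only a placeholder (it undercounts compared to the BPE tokenizer the engines actually use), and the function names are made up:

```python
def count_tokens(text):
    # Placeholder: a real implementation would use the engines' BPE
    # tokenizer, since whitespace splitting undercounts tokens.
    return len(text.split())

def transaction_tokens(prompt, completion):
    """Total token volume of one request/response pair: input + output."""
    return count_tokens(prompt) + count_tokens(completion)

total = transaction_tokens("Write a polite reply to this comment",
                           "Thank you for the kind words!")
```

Run the prompt through it before the API call and the reply after, then log the sum as one transaction.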

I kinda guessed. A tokenizer takes time and compute, even if it runs in the background. My credits system is still a work in progress, but credits are basically around $0.01+ each. Each generation costs a set number of credits; I believe 25 credits is the highest at the moment. As I said, it's a work in progress, though.

The tokenizer will give you exact amounts, though, so maybe it’s something you want to use. Your outline looks usable.

Monetization strategies are something I’ve been watching closely since last October…


I see, thanks for the feedback.

Given my setup, it's not really in the background. Maybe I need to think it over again. Something like: log the prompt texts and reply text during the request, then run a separate tokenizer pass in the background. Is this how you're seeing it?

Yes, I’d like to have exact amounts, for basically two reasons (as of today):

  • To produce an exact bill for each user based on their volume
  • To evaluate the “efficiency” of my app by looking at how much it costs to produce validated output (users will be able to “retry” the generation if they don't like the result)

Yeah, I don’t think the user will want to wait too much longer.

Please keep us updated on your progress.

For sure, but if the app's pitch explains how much time they save by using the service, I think they will be more “flexible” :wink:

Sure, I’ll post updates here.

Meanwhile, if anyone is willing to discuss a possible API developer position at our company, TechSpokes Inc, I'm open to looking through candidates.


That was meant somewhat ironically. The goal of the app (ideally; somewhat tested in the Playground plus previous experience) is to save about 70% of a manager's time dealing with the text. At their average rate of 100 USD/hour (managers working themselves and/or their employees), and with an approximate app usage cost of around 20 USD/hour, the potential saving for the business is about 180 USD per result: 1 hour of labor cost plus 1 hour of app usage cost (120 USD) produces a result close to 3 hours of human labor (300 USD) without the app. I think they will get the idea of “value” within a matter of seconds. As for my “reseller” net margin, I bet it will be no lower than 12K%.

Personally, I think it's worth investing some 15K USD to build a machine that produces a 12K% margin.


For sure. Among our founders there is a very skillful copywriter who has been producing big money-making texts for decades… we could never find a way to “replicate” him. Now we can.


Yeah, that’s what I’ve found lacking in a lot of the copywriting AI tools… You really need great examples to feed GPT-3 to get good output. I spent a lot of time writing 100,000+ words for my LitRPG Adventures, but the output is great.

Another thing I’ve been working on in the background is a blurb writer. I want to give GPT-3 great examples of blurbs, then have it turn a couple of sentences into a blurb. I think writers would easily pay $50/blurb if it worked well and saved them time. And the cost (once I figure out how best to do it) would not be nearly that much in AI cost… It’s a bright future indeed.


I don’t think we’ve reached Post-Capitalism yet, but we’re heading in that direction with AGI getting closer all the time. To be fair, I’m not sharing everything. There’s quite a few people sharing lots of good stuff too. Gwern comes to mind. Max. Janelle. And others. We’ve got a good community here too, it seems.


I’m not sure how it would be applicable to the blurb writer, but these are general ideas on the workflow we came up with. Let me know what you think of this approach.

Writing anything is a multistep process that can be broken down into a “framework” like so:

  1. Define the goal
  2. Gather base info
  3. Do some research on the subject and produce a “first draft”
  4. Get some feedback on the “first draft” and refine it
  5. Compare the “first draft” with refined “output”
  6. Try to improve the refined “output” as if it was the “first draft”
  7. Get some feedback on the “pre-final copy” to get the “final copy”
  8. Learn from the above

Here it may translate to something like this:

  1. define the goal - craft a prompt for text generation
  2. gather base info - get the user input
  3. do some research - search your “final copy examples” file (where the text is previous user inputs and the final copies are in metadata), gather those examples, add them to the prompt, then add the user input and request the “first draft”
  4. get some feedback on the “first draft” - let the user edit the first draft produced by the machine
  5. compare the “first draft” with the refined “output” - add the pair as an example to another file (“refined examples”, where the “first draft” is the text and the “output” is in metadata)
  6. improve the refined “output” - using a different “refine” prompt, search “refined examples” with the “output” as the query, add the found examples to the prompt, add the “output” itself, and generate the “pre-final copy”
  7. get some feedback on the “pre-final copy” - let the user edit the “pre-final copy” (second edit) and save it back to “refined examples”, where the “pre-final copy” is the text and the “final copy” is in metadata
  8. learn from the above - add the user input to “final copy examples” as text, with the “final copy” as metadata
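The draft-then-refine loop above might be sketched like this, with the search and completion calls stubbed out. Every name here is hypothetical; `search_examples` stands in for the search endpoint over the named file and `complete` for the completions endpoint:

```python
# Stub for the search endpoint: return prior examples relevant to the query.
def search_examples(filename, query):
    return [f"example from {filename} matching '{query}'"]

# Stub for the completions endpoint.
def complete(prompt):
    return f"generated text for: {prompt[:40]}..."

def first_draft(user_input):
    # step 3: retrieve prior examples, build the prompt, generate a draft
    examples = search_examples("final_copy_examples", user_input)
    prompt = "\n".join(["<instructions>", *examples, user_input])
    return complete(prompt)

def refine(edited_draft):
    # step 6: retrieve refined examples and generate the pre-final copy
    examples = search_examples("refined_examples", edited_draft)
    prompt = "\n".join(["<refine instructions>", *examples, edited_draft])
    return complete(prompt)

draft = first_draft("user comment text")
pre_final = refine(draft)  # in between, the user would edit the draft
```

The user's edits at steps 4 and 7 happen between these two calls and after the second one, and each edited pair gets written back to its examples file.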

Hope it makes sense. The files will definitely grow too large at some point, so I may need to think about a “votes” mechanism to keep only the best examples in each file. Also, each file should cover a single domain of application, so as not to mix apples and oranges.

And as you can see, OpenAI tokens will be used heavily in this approach: you would use around 10K tokens to get from user input to final result. That's the price of quality.
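To put that 10K-token figure in billing terms (the per-1K rate below is purely illustrative, not a real engine price):

```python
# Rough per-result cost of one full draft -> refine cycle.
TOKENS_PER_RESULT = 10_000
RATE_PER_1K = 0.06  # hypothetical USD per 1K tokens; check real engine pricing

cost_usd = TOKENS_PER_RESULT / 1000 * RATE_PER_1K
```

At that assumed rate, a single final result costs well under a dollar in raw tokens, which leaves plenty of room for the margins discussed above.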

What do you think?
