How are you all handling pricing of B2B or B2C products/services?

Very curious as to what others’ thoughts are on this, specifically regarding one aspect of API pricing/cost - price elasticity of token usage.

I spent my afternoon defining potential price points that I thought were “reasonable” to users and would offer “reasonable” profit, considering all other costs. Arrived at a nice medium. Then I realized - even a small change in the cost of API usage, particularly gpt-4 which is currently at $0.03 input and $0.06 output per 1000 tokens via the 8K context model, has very large consequences.
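To make that sensitivity concrete, here is a rough Python sketch. The per-1,000-token prices are the ones quoted above; the request size (1,500 input and 1,000 output tokens), the $0.15 flat price, and the 20% increase are made-up numbers used only for illustration.

```python
# GPT-4 8K prices quoted in the post, in USD per 1,000 tokens.
INPUT_PRICE_PER_1K = 0.03
OUTPUT_PRICE_PER_1K = 0.06


def request_cost(input_tokens, output_tokens,
                 input_price=INPUT_PRICE_PER_1K,
                 output_price=OUTPUT_PRICE_PER_1K):
    """API cost in USD for one request."""
    return (input_tokens / 1000) * input_price + (output_tokens / 1000) * output_price


def margin_after_increase(flat_price, input_tokens, output_tokens, increase):
    """Profit per request if both API prices rise by `increase` (0.2 means 20%)."""
    cost = request_cost(input_tokens, output_tokens,
                        INPUT_PRICE_PER_1K * (1 + increase),
                        OUTPUT_PRICE_PER_1K * (1 + increase))
    return flat_price - cost


# Hypothetical request size and flat price, chosen only for illustration.
today = margin_after_increase(0.15, 1500, 1000, 0.0)
later = margin_after_increase(0.15, 1500, 1000, 0.2)
print(f"margin today: ${today:.3f}, after a 20% API increase: ${later:.3f}")
```

With these assumed numbers, a 20% API increase takes the per-request margin from about $0.045 to about $0.024, so nearly half of the profit disappears.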

Now, what makes potential price elasticity particularly important for me is that my non-overhead costs are basically 100% API related, and I can’t foresee users being sold on a “per token usage” pricing model. That just seems to create huge headaches regarding billing and sales…

I’m not saying it can’t be done, but the best-case options are certainly far from ideal. How do you abstract a “per 1000 tokens” pricing package to people who don’t really understand API pricing in the first place? I think it’s a lot to swallow as a user, and a flat fee pricing model that is based on something more tangible (e.g., in my case, “word count”) is much easier for users to digest, plan for, and be sold on. At least that is the context of my particular product.

So, given a flat fee pricing model that basically tries to track and estimate token usage by assigning a base token value per word, I’m at an impasse. The possible answers are kind of obvious. 1) plan for it at the start by creating a very comfortable profit cushion and just eat the costs later if OpenAI raises pricing, or 2) raise prices along with API costs, or 3) switch to a per token (or per 1000 tokens) pricing model to assign exact margins to token usage, then bill by token usage. Options #2 and #3 seem particularly user unfriendly, but I guess the true reality of the situation is that NONE of the options are particularly user friendly.
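A minimal sketch of option 1, assuming a base token value per word. The 4/3 tokens-per-word ratio is only a rough rule of thumb for English text, and the prompt size, word count, and 50% cushion are hypothetical values, not figures from my product.

```python
# Assumption: roughly 4/3 tokens per English word (rule of thumb, not exact).
TOKENS_PER_WORD = 4 / 3


def estimated_cost(output_words, prompt_tokens, input_price=0.03, output_price=0.06):
    """Estimated API cost in USD for generating `output_words` words."""
    output_tokens = output_words * TOKENS_PER_WORD
    return prompt_tokens / 1000 * input_price + output_tokens / 1000 * output_price


def flat_price(output_words, prompt_tokens, cushion):
    """Flat price = estimated cost plus a cushion (0.5 means 50% on top of cost)."""
    return estimated_cost(output_words, prompt_tokens) * (1 + cushion)


# Hypothetical package: 750 words of output from a 500-token prompt.
cost = estimated_cost(750, 500)
price = flat_price(750, 500, cushion=0.5)
print(f"estimated cost ${cost:.4f}, flat price ${price:.4f}")
```

Under this simple model, a cushion of 50% on top of cost absorbs a uniform API price increase of up to 50% before the margin reaches zero, which is one way to decide how "comfortable" the cushion needs to be.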

Wondering how everyone else is handling this?

Charging monthly based on a worst-case scenario using rate-limits seems to be the best of both worlds. Users don’t feel limited (even though they are), and they can’t cost you any more than you expect.
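As a rough sketch of what “worst case” means here: if every user hit the rate limit every day at the maximum request size, the monthly API bill per user would be bounded as below. The limits (10 requests a day, 2,000 input and 1,000 output tokens) are hypothetical; the prices are the GPT-4 8K figures from the first post.

```python
def worst_case_monthly_cost(requests_per_day, max_input_tokens, max_output_tokens,
                            days=31, input_price=0.03, output_price=0.06):
    """Upper bound in USD on one user's monthly API cost under a daily rate limit."""
    per_request = (max_input_tokens / 1000 * input_price
                   + max_output_tokens / 1000 * output_price)
    return requests_per_day * days * per_request


# Hypothetical limits: 10 requests/day, each capped at 2,000 input + 1,000 output tokens.
ceiling = worst_case_monthly_cost(10, 2000, 1000)
print(f"worst-case API cost per user per month: ${ceiling:.2f}")
```

Any monthly price above that ceiling cannot lose money on API usage alone, although a price set this way will probably look expensive to light users.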

You could also offer a per-token tier for extreme users that expect to go past your rate-limits.

Although I would be very surprised to see a price increase, and not a decrease, I’m sure users would understand if you had to raise prices as a result.

Although don’t be surprised if they expect a price decrease whenever the API price goes down (even though that never happens). As they say: “Prices take the elevator up, and the stairs down” :sob:


Yeah, that is probably the way to go. I suppose I am just asking for the impossible: “how to create perfect pricing.” It just doesn’t exist in any business. Your per-token idea is definitely in the mix for extreme/enterprise pricing.

Appreciate the response!

At a very basic level what kind of model would this be? Who knows, maybe there is a better model that can be built on.

Personally, I tend to forget about subscriptions and they (for some strange reason) are usually easy to start, and hard to cancel. For that reason I dislike them.

Thanks! Yes, there are always people who just want things to be as stupid simple, consistent, and straightforward as possible, and then those who prefer to micromanage everything. Both have their perks. Thinking about it now, this is basically OpenAI’s ChatGPT & API model :sweat_smile:

Well for me, regarding actual pricing to the user, “perfect pricing” would definitely match the exact token usage of the user. I guess that is probably true in most cases. And that’s why I think your idea regarding enterprise users is a front runner - enterprise users will be much more keen on cost, since they are dealing with much larger numbers.

But for the majority of users, I’m not so sure. Some examples… what if a user needs outputs at different lengths? What if there is some spillage, and users don’t like certain outputs and need to rerun? What if GPT produces an output that is slightly longer than the requested length? There are lots of things to account for in pricing, which makes it very hard for both me and the users to reliably estimate what the exact needs (or in my case, costs) will be.

These are things likely hashed out at length with potential enterprise users, but for a single user or maybe small business user, those conversations will typically not happen.

Flat pricing essentially packages all potential complexity into digestible bundles for users (and for me!).

I would say roughly half of my competitors have such utterly bewildering pricing schemes that even I struggle to understand exactly what is being offered. So… yeah, another part of “perfect” pricing is actually competitive positioning, as that is clearly a tough issue to solve, at least for my particular area.

The cost of inference is only going to go down with time. Quite a few larger players are pricing at cost right now in the hope of holding market share when costs drop, which is fine if you have the bankroll to do it. Otherwise, set a price that will keep you profitable for now, knowing that it’s going to get cheaper as time goes on.

But using 8K context as a base, wasn’t the cost of gpt-4 prompt/completions set at $0.03 a few months ago, in May-ish? And now it’s $0.03 / $0.06? Or am I misunderstanding your point?

GPT-4 8K has always been $0.03 for prompt tokens and $0.06 for completion tokens (in/out) per 1,000 tokens, so there has been no change there. My point was that for any technology (cell phone call time, internet bandwidth, etc.), the cost per unit tends to zero as time progresses. Delaying product launches or reducing R&D on projects due to cost today will almost always be a mistake, as those costs will go down over time and what matters is building a customer base.

I see, I must be thinking of old gpt-3. My bad!

Interesting information. In my head, I’ve been thinking about it in the opposite way.

Much much appreciated.

Services that require a pen and paper to understand their pricing :-1:

Very cool thoughts.

I think based off of what you are saying: the pricing scheme should reflect what your target audience wants. If I wanted to be a heavy user I would really appreciate a very granular pricing scheme. If I am just a casual user, I want it to be simple.

Interactive Brokers employs this really well with their pricing strategies. If you want it to be granular, you can, and you will save money if you play your cards correctly.

Otherwise, they can manage it for you with a flat usage fee.
So if I just want to wake up, check my stocks, buy a couple, it’s easy.

If I want to day trade, I can take their granular approach, do the math, and adjust my personal strategy accordingly.

Yes, this has been an issue for us. I have spent the last two weeks figuring out a user-friendly way to charge for API use…

We were trying to avoid the “per 1000 tokens” packages as we want to make it as accessible and cheap as possible.

For the app (blog co-writer app), I decided to charge a $10 per month subscription. The subscription includes $5 worth of API cost (input and output tokens) per month. The unused credit does not roll over month to month. So $5.00 for app access and $5.00 worth of API calls. If a user exceeds this limit, they have the option to buy extra credits (min $1.00 - max $10.00) to finish off the month. Again, these do not roll over to the next month.

We are not up-charging on the API costs; we calculate using the exact same prices as OpenAI. All payment handling is done through Stripe. So we might lose a little on the API costs once fees are included, but it makes things simpler.

Now this model might not work for everyone. Just from our own testing the average blog costs between $0.01 and $0.10, so $5.00 would allow you to create at least 1 blog a day for a given month.
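Here is the arithmetic behind that as a small sketch, working in whole cents. The subscription, included credit, and per-blog cost range come from the numbers above; the processing fee of 2.9% + $0.30 per charge is an assumption for illustration and may not match our actual Stripe rate.

```python
SUBSCRIPTION_CENTS = 1000     # $10.00 per month, from the post
INCLUDED_CREDIT_CENTS = 500   # $5.00 of API usage billed at cost, from the post


def blogs_covered(credit_cents, cost_per_blog_cents):
    """Whole blogs the monthly credit pays for at a given per-blog cost."""
    return credit_cents // cost_per_blog_cents


def processor_fee_cents(amount_cents, percent=0.029, fixed_cents=30):
    """Assumed card-processing fee: 2.9% + $0.30 (hypothetical rate)."""
    return round(amount_cents * percent) + fixed_cents


def net_if_credit_fully_used(subscription_cents, credit_cents):
    """What is left after the fee and a fully spent API credit."""
    return subscription_cents - processor_fee_cents(subscription_cents) - credit_cents


fewest = blogs_covered(INCLUDED_CREDIT_CENTS, 10)  # blogs at $0.10 each
most = blogs_covered(INCLUDED_CREDIT_CENTS, 1)     # blogs at $0.01 each
net = net_if_credit_fully_used(SUBSCRIPTION_CENTS, INCLUDED_CREDIT_CENTS)
print(f"{fewest} to {most} blogs per month, net {net} cents per subscriber")
```

Even at the expensive end of the range, the included credit covers more than one blog a day.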

But ya, it has been really tricky trying to figure out a simple and clear way to charge users.

I like the business model of having one single monthly application subscription amount, of which $1 is granted for “AI usage” (i.e. token expense). Then whatever token credits they have roll over to the next month, and the user can top off their account by purchasing an additional $1 or $5 or $10 of API credit any time they want, and they never lose it. It just continues to accrue if not used.

Then I show the actual “AI Credit Remaining” on the screen in fine print at all times, so they know if and when they need to purchase more in the middle of a billing cycle. With this approach you could offer even a $2/mo price, where extremely heavy users could spend $50/mo if they need that much “usage”, and you still get your $1 service fee. You never have to worry about a heavy user spending too much, nor ever have to tell users they have “a limit”.
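A minimal sketch of such a prepaid balance, assuming usage is debited at the GPT-4 8K prices mentioned earlier in the thread. The `CreditAccount` class and its method names are hypothetical, not taken from any real billing library.

```python
from decimal import Decimal


class CreditAccount:
    """Prepaid AI credit that never expires (hypothetical helper)."""

    def __init__(self):
        self.balance = Decimal("0")

    def add_credit(self, amount):
        """Monthly grant or user top-up, in USD (pass a string like "1.00")."""
        self.balance += Decimal(amount)

    def charge_usage(self, input_tokens, output_tokens,
                     input_price=Decimal("0.03"), output_price=Decimal("0.06")):
        """Debit the cost of one request; refuse it if the balance is too low."""
        cost = (input_tokens * input_price + output_tokens * output_price) / 1000
        if cost > self.balance:
            raise ValueError("insufficient AI credit, ask the user to top up")
        self.balance -= cost
        return cost


account = CreditAccount()
account.add_credit("1.00")         # the $1 granted with the subscription
account.charge_usage(1000, 500)    # one request: 1,000 tokens in, 500 out
print(f"AI Credit Remaining: ${account.balance:.2f}")
```

Using `Decimal` rather than floats keeps the displayed balance exact, which matters when single requests cost fractions of a cent.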


I’d just add that if you are going to introduce rollovers to accounts, you should cap the rollover amount at some percentage of a typical monthly spend. I’ve seen projects fail because massive rollover use arrived at unexpected times. Cashflow is king for small enterprises, and a rollover can hit you in the bank at just the wrong time.
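Something like the following sketch is what I mean by a cap. The 50% fraction is a made-up example; pick whatever percentage your cashflow can tolerate.

```python
def rolled_over_credit(unused_credit, typical_monthly_spend, cap_fraction=0.5):
    """Credit carried into next month, capped at a fraction of typical spend."""
    cap = typical_monthly_spend * cap_fraction
    return min(unused_credit, cap)


# Hypothetical user: $8.00 unused, typical spend of $5.00 per month.
carried = rolled_over_credit(8.0, 5.0)
print(f"carried over: ${carried:.2f}")
```

Anything above the cap simply expires, so the largest liability you can carry per user is known in advance.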

I meant rollover in the sense of a “prepayment”. A massive rollover in that context would just mean they’ve put a lot in MY company bank account. :slight_smile:

Companies tend not to keep money just sitting in their account waiting for users to spend it later. If you have a cash excess and wish to be randomly tested as to your future liquidity… sure. It is really not a good idea to roll over anything more than a % of last month’s balance as a maximum. Even cell phone and internet plans rarely do this, and those that do get phased out after they get stung.

This comes from personal experience :smile: an unpleasant experience.


You’d just make it $2 for the first month and then $1 per month after that. This way you charge $1 every month (as your actual subscription cost), and other than that it’s up to the user to top off their account balance any time they want more AI credit. The extra dollar for their first month is so they start out with AI credits to use. It’s just like what OpenAI is doing. I can pay in advance a certain amount and it’s up to me to buy more if I run low.

I asked similar questions recently, and got this response, which I really like: End-User Pay-As-You-Go Model - #12 by pieter

Don’t even mention tokens, except in the fine print when they ask how you arrive at the pricing. $10 a month plus usage “credits”. You gotta buy quarters for the machine, sort of thing.

You think anyone will complain about losing X dollars in credit each month? I was thinking no rollovers myself, but then, I probably wouldn’t like it much if my credit amount didn’t roll over. Still a quandary for me.


Ya, I don’t think it is the best either, but I think it will be more of an edge case for our app (BlogeaAi), as a $5.00 credit per month should be enough to get by for most people. The top-ups can be as low as $1.00, so I don’t think it is going to be that big a deal. But ya, there are trade-offs with all these solutions.

Yes. Oh yes. Some people will complain about anything if they are given the opportunity.

I don’t understand why anyone would want to complicate things by saying “We will give you $10 in credit that doesn’t rollover” when you can instead say “this costs $10/month” unless your service EXACTLY reflects the “units/credits” that you are charging.

The most similar pricing scheme that follows this ideology is Google Colab, which makes sense because you are specifically paying for computations.

One can also refer to app providers from other AI tools, like Midjourney for example:
Offer menu pricing and include model types and the number of requests per month into a package plus extra services to make the higher priced tiers more attractive. The users appear to accept and understand the possibilities and limitations of their subscription.
Admittedly, unless one calculates the number of requests based on the max number of tokens, this LLM-specific problem isn’t solved completely, but then again it surely puts a ceiling on the max possible cost a user can generate.
Following this approach it’s also possible to determine the max expenses per month and keep this amount available in a segregated account.
I’d claim that the beauty of this approach is the transparency and simplicity.
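As a rough sketch of that calculation: given a menu of tiers, each with a request allowance and a worst-case cost per request, the amount to keep in the segregated account follows directly. The tier names, allowances, subscriber counts, and the $0.12 per-request ceiling are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    requests_per_month: int
    max_cost_per_request: float  # USD, assuming every request uses the max tokens
    subscribers: int

    def max_monthly_cost(self):
        """Most one subscriber on this tier can cost in a month."""
        return self.requests_per_month * self.max_cost_per_request


def reserve_needed(tiers):
    """Worst-case monthly API spend across all subscribers."""
    return sum(t.max_monthly_cost() * t.subscribers for t in tiers)


# Hypothetical menu, for illustration only.
menu = [
    Tier("basic", requests_per_month=100, max_cost_per_request=0.12, subscribers=40),
    Tier("pro", requests_per_month=500, max_cost_per_request=0.12, subscribers=10),
]
reserve = reserve_needed(menu)
print(f"reserve to hold this month: ${reserve:.2f}")
```

Actual spend should come in well under this figure, since few users exhaust their allowance at the maximum token count.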
