I’ve been working on a pricing model where the normal subscription is supplemented with a smart throttling mechanism. The system applies incremental throttling to as few users as possible, and only at peak usage times, to keep total spend within the spending limit. A throttled user has the option to remove the throttling for 24 hours by paying a fee.
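To make the idea concrete, here is a minimal sketch of one way the "fewest users" selection could work. It is not the SDK's actual code: the function name `throttle_caps`, the per-user projected spend input, and the choice of a single shared cap for the heaviest users are all assumptions made for illustration. It would be called only during a peak window.

```python
def throttle_caps(usage: dict[str, float], budget: float) -> dict[str, float]:
    """Return a spend cap for each user that must be throttled.

    Hypothetical helper: `usage` maps user id -> projected spend for the
    peak window, `budget` is the spending limit for that window.
    Users missing from the result are left untouched.
    """
    total = sum(usage.values())
    if total <= budget:
        return {}  # under the limit, nobody is throttled

    # Heaviest users first, so the fewest accounts absorb the reduction.
    ranked = sorted(usage.items(), key=lambda kv: kv[1], reverse=True)

    for k in range(1, len(ranked) + 1):
        top_sum = sum(spend for _, spend in ranked[:k])
        rest = total - top_sum              # spend of untouched users
        cap = (budget - rest) / k           # shared cap for the top k users
        next_spend = ranked[k][1] if k < len(ranked) else 0.0
        # Stop at the smallest k where every untouched user is already
        # at or below the cap.
        if cap >= next_spend:
            return {user: cap for user, _ in ranked[:k]}

    return {}  # not reached when budget >= 0


if __name__ == "__main__":
    print(throttle_caps({"a": 100, "b": 40, "c": 10}, budget=120))
```

In this sketch only the heaviest user is capped when that alone is enough to fit the budget, and the cap widens to more users only if needed. A user who has paid for the 24-hour bypass could be handled by leaving them out of the ranked candidates while still counting their spend, but that part is not shown here.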
I spent a lot of time thinking about different ways to make the pricing work, but this was the only way I could think of to guard against a few power users making the costs very unpredictable. I also wrote a whole LangChain-based SDK around it to make integration simpler.
Curious what everyone thinks about it.