Dynamically adjusting calls based on traffic?

I’m wondering if there’s a simple way to dynamically adjust the API calls an app makes to the OpenAI API based on the traffic we’re getting, so that costs stay under control if we go viral. The specific case I have in mind is DALL-E image generation: in a slow month the app could generate high-quality images with DALL-E 3, and if we get a spike in traffic one week because we’re featured on The Verge or wherever, it would lower the resolution we request, switch down to DALL-E 2, and so on.

I can roughly imagine how I’d build this with some Express code and a database tracking our API calls, but I’m wondering if you all know of any tools, from OpenAI, open source, or other third parties, that simplify this kind of dynamic option-setting?

Appreciate any help or suggestions! Thanks!

Considering that this form of cost-saving is mostly subjective, I doubt OpenAI would introduce it.

What you want is to cut the quality down to match some arbitrary thresholds that you have set. This, to me, seems to be as simple (as you mentioned) as saving some properties in a database and then conditionally setting the model and parameters used.
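A minimal sketch of that threshold idea in Python, assuming an in-memory rolling counter stands in for the database. The names `TrafficTracker`, `pick_image_settings`, and `TIERS`, along with the threshold numbers, are hypothetical and only there for illustration; the model, size, and quality values are ones the image generation endpoint accepts for DALL-E 3 and DALL-E 2.

```python
import time
from collections import deque
from dataclasses import dataclass, field

# Hypothetical tiers: (max requests in the window, image settings).
# The request limits are made-up numbers; tune them against your own budget.
TIERS = [
    (100, {"model": "dall-e-3", "size": "1024x1024", "quality": "hd"}),
    (1000, {"model": "dall-e-3", "size": "1024x1024", "quality": "standard"}),
    (float("inf"), {"model": "dall-e-2", "size": "512x512"}),
]


@dataclass
class TrafficTracker:
    """Counts requests in a rolling time window (in memory, single process)."""

    window_seconds: float = 3600.0
    _stamps: deque = field(default_factory=deque)

    def record(self, now=None):
        # Store the timestamp of one incoming request.
        self._stamps.append(time.time() if now is None else now)

    def count(self, now=None):
        # Drop timestamps that fell out of the window, then count the rest.
        now = time.time() if now is None else now
        cutoff = now - self.window_seconds
        while self._stamps and self._stamps[0] <= cutoff:
            self._stamps.popleft()
        return len(self._stamps)


def pick_image_settings(recent_requests, tiers=TIERS):
    """Return the settings of the first tier whose limit covers the traffic."""
    for limit, settings in tiers:
        if recent_requests <= limit:
            return dict(settings)
    # Fall back to the cheapest tier if no limit matched.
    return dict(tiers[-1][1])
```

In a request handler you would call `tracker.record()`, then pass `pick_image_settings(tracker.count())` as keyword arguments, together with the prompt, to whatever image generation call you use (for example `client.images.generate` in the OpenAI Python SDK). With more than one server process, the counter would need to live in shared storage such as the database you mentioned.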

Maybe instead you could consider trying to match the quality of the output to the specifications of the input. There are systems being actively developed for this exact scenario.

One of them is RouteLLM. Quoting the project’s own description, its core features include:

  • Drop-in replacement for OpenAI’s client (or launch an OpenAI-compatible server) to route simpler queries to cheaper models.
  • Trained routers are provided out of the box, which we have shown to reduce costs by up to 85% while maintaining 95% GPT-4 performance on widely-used benchmarks like MT Bench.
  • Benchmarks also demonstrate that these routers achieve the same performance as commercial offerings while being >40% cheaper.
  • Easily extend the framework to include new routers and compare the performance of routers across multiple benchmarks.

Yeah, that kind of cost-scaling makes sense for a paid service. What I have in mind is free demo projects that are designed to operate at a loss, but within reason: for example, one from an early-stage startup, or a tool created by a non-profit or government organization.

Thanks for sharing this RouteLLM project by the way, it does seem very interesting and useful!

Ah I understand. I don’t know of any projects that offer this kind of switching based on thresholds.

Might be a good project to start and share 🤷

No problem. It’s a pretty cool project!

Hi, to me this sounds more like a glitch in your business logic. If you get featured somewhere, that’s partly because of how cool you are, right? So why lower the quality when you have the chance to shine across the globe? Show your best to the largest audience whenever you can.

Then, if it’s a question of money, give users the chance to choose the quality they want and can afford.

The reasoning here comes down to the LTV/CAC ratio (lifetime value divided by customer acquisition cost); see details here: https://youtu.be/jzKpAtzKQ54?si=ba5q9ih9233Sgsf2

So running this at a “loss” will grow your CAC and worsen that ratio.
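To make the arithmetic concrete, here is a small Python sketch. It assumes the free API usage you give away is counted as part of acquisition cost, which is one way to read the point above; the function names and every number are hypothetical, not real pricing or benchmarks.

```python
def effective_cac(marketing_cost, free_api_cost_per_signup, signups_per_paying_customer):
    """Acquisition cost per paying customer, including free usage given away.

    Assumption: every free signup burns some API spend, and only one in
    `signups_per_paying_customer` signups ever converts to a paying customer.
    """
    return marketing_cost + free_api_cost_per_signup * signups_per_paying_customer


def ltv_cac_ratio(ltv, cac):
    """Lifetime value divided by customer acquisition cost."""
    if cac <= 0:
        raise ValueError("cac must be positive")
    return ltv / cac


# Hypothetical numbers, purely for illustration.
ltv = 42.0
baseline = ltv_cac_ratio(ltv, effective_cac(10.0, 0.0, 16))
with_free_tier = ltv_cac_ratio(ltv, effective_cac(10.0, 0.25, 16))
```

With these made-up inputs the ratio drops from 4.2 to 3.0 once the free usage is included, and it keeps falling as a traffic spike brings in more non-paying signups per conversion.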

Personally, I would make them pay at least the API costs (this can be balanced with something else you offer that costs you nothing), for two reasons:

  1. I don’t want goody gazers as customers, so asking for some money, or even just a card number, up front will scare them away.
  2. It solves the problem of giving away too much without any control over it.