I have a question about the OpenAI API's pricing structure. I understand the token-based pricing, but I'm wondering about the relationship between the number of API calls and total cost:
Is there a difference in cost between these two scenarios?
1. Making a single API call that processes 1,000 tokens
2. Making multiple API calls that total 1,000 tokens (e.g., 10 calls with 100 tokens each)
I noticed that GPTForWork's pricing calculator includes a separate 'Price per API call' metric for OpenAI, which made me wonder whether there's an additional cost per API call beyond token usage.
In terms of cost, there's no real difference for the most part. The one caveat is prompt caching (https://platform.openai.com/docs/guides/prompt-caching): cached prompt prefixes expire after a few minutes of inactivity, so if you wait too long between calls, the repeated context gets billed at full price again. It's also likely that requests sent at the same time miss each other's caches. So if you wanted to optimize for cost, you'd send requests that start with the same context sequentially rather than in parallel.
The last thing that comes to mind (but you probably know this) is that each request pays for its context again. Apart from caching (as above), if you have a massive context and can perform all your generation tasks in one shot, that saves you from paying the context cost 10 times over (or roughly 1 + 9 × 0.5 times with the cached-input discount, whatever the case may be).
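To make that concrete, here's a rough back-of-the-envelope sketch comparing the three billing scenarios. The 50% cached-input discount, the specific token counts, and the assumption that sequential calls hit the cache while concurrent ones miss it are all illustrative assumptions, not guaranteed behavior; check the current pricing page before relying on the numbers.

```python
CONTEXT_TOKENS = 2000   # shared context re-sent with every call (assumed size)
TASK_TOKENS = 100       # per-task input on top of the context (assumed size)
CACHED_DISCOUNT = 0.5   # assumed: cached input tokens billed at ~50%
N_CALLS = 10

# Scenario A: one call carrying the context plus all ten tasks.
one_shot = CONTEXT_TOKENS + N_CALLS * TASK_TOKENS

# Scenario B: ten sequential calls; assume calls 2..10 hit the prompt
# cache, so the shared context is billed at the discounted rate.
sequential = (CONTEXT_TOKENS + TASK_TOKENS) + \
             (N_CALLS - 1) * (CACHED_DISCOUNT * CONTEXT_TOKENS + TASK_TOKENS)

# Scenario C: ten concurrent calls that all miss each other's caches,
# so every call pays full price for the context.
parallel = N_CALLS * (CONTEXT_TOKENS + TASK_TOKENS)

print(one_shot, sequential, parallel)  # → 3000 12000.0 21000
```

Same total work, but the billed input tokens differ by a factor of several depending on how you batch, which is why "cost per call" is hard to pin down as a single number.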
About that product you mentioned: it looks like it's just an API wrapper, and whatever these people do for their own cost tracking doesn't necessarily have anything to do with OpenAI's pricing.
But as you can tell, the TL;DR is: it's complicated. But not that complicated. Just sometimes difficult to predict.