Hello, I have a startup which is basically a ChatGPT wrapper plus a very clever system prompt.
The problem is that for every user request I send the same system prompt, which is about 1,500 input tokens, plus another 500 to 1,000 input tokens from my users that change on every request (and therefore cannot be cached).
Is there any way I can cache my system prompt so my startup can be more profitable? The profit margins are very bad right now, since OpenAI API costs are 50% of my revenue, and I really need to use gpt-4o-latest since it's the only model that gives me great results.
Are you inserting any dynamic content at the beginning or in the middle of your system prompt? For example, dates/times, usernames, etc.? For caching to take effect, at least the first 1,024 tokens of the prompt must be constant across requests.
Also, what is the frequency of your API calls? If they are very infrequent, then depending on overall API load, your prefix may be evicted from the cache before the next request arrives.
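To make that concrete, here is a minimal sketch of how you'd structure the request so the static system prompt forms a constant, cacheable prefix. The function name and the placeholder prompt are my own; the key idea is just that the system prompt is byte-for-byte identical on every call and all dynamic content comes after it:

```python
# Keep the long, static system prompt as the very first message, identical
# on every request. OpenAI's automatic prompt caching only matches a
# constant prefix, and only once that prefix reaches ~1,024 tokens.
STATIC_SYSTEM_PROMPT = (
    "...your full ~1,500-token system prompt, unchanged between requests..."
)

def build_messages(user_input: str) -> list[dict]:
    """Build the message list so the cacheable prefix stays constant.

    Any dynamic content (dates, usernames, the user's request) goes AFTER
    the system prompt, so it never breaks the shared prefix.
    """
    return [
        # constant prefix -- eligible for caching
        {"role": "system", "content": STATIC_SYSTEM_PROMPT},
        # changing part goes last
        {"role": "user", "content": user_input},
    ]
```

You would then pass `build_messages(...)` to the chat completions endpoint as usual. If caching is working, the API response's usage details should report a nonzero count of cached prompt tokens on repeat calls, which you can monitor to confirm you're actually getting the discounted rate.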