Will the newly announced prompt caching work with my Assistants API calls? Currently I’m spending about 3k tokens per run, and most of that comes from the same extensive instructions being re-sent every time to keep the responses usable.
I’d also appreciate clarity on this. I similarly send the same long prompt at the start of every thread, so caching it would make a real difference in cost.
Hello, welcome both of you.
It’s not entirely clear to me either, but from the wording of the docs it sounds like Assistants interactions are covered:
https://platform.openai.com/docs/guides/prompt-caching/what-can-be-cached
And even if Assistants aren’t supported yet, they presumably will be eventually.
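One thing worth noting: for Chat Completions, prompt caching kicks in automatically once a prompt reaches 1024 tokens, and it matches on exact prefixes. So the main thing you can control is ordering: put the long static instructions first and the variable content last, identical across runs. A minimal sketch of that idea (the helper name here is just illustrative, not part of the SDK):

```python
def build_messages(static_instructions: str, user_input: str) -> list[dict]:
    """Place the long, unchanging instructions first so automatic
    prompt caching (prompts >= 1024 tokens, exact-prefix matching)
    can reuse the shared prefix across runs."""
    return [
        # Static prefix: identical every run, so it is cacheable.
        {"role": "system", "content": static_instructions},
        # Variable suffix: changes per run, comes after the prefix.
        {"role": "user", "content": user_input},
    ]

# Two consecutive runs share byte-identical instructions up front,
# so the second request's prompt prefix is eligible for a cache hit.
INSTRUCTIONS = "You are a meticulous assistant. ..."  # imagine ~3k tokens here
run_a = build_messages(INSTRUCTIONS, "Summarize document A.")
run_b = build_messages(INSTRUCTIONS, "Summarize document B.")
assert run_a[0] == run_b[0]  # identical cacheable prefix
```

You can then check how much of a request was actually served from cache via `usage.prompt_tokens_details.cached_tokens` in the API response.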