Is it possible to enable prompt caching manually for prompts with less than 1024 tokens?

We could make great use of this feature, but our prompts are only ~700 tokens. Is there any way to enable it manually?

Hi @leeflix and welcome to the community!

There is no way to enable caching manually for prompts under 1024 tokens. For reference, Gemini (Google) requires a minimum of 32k tokens before caching activates, and with Anthropic you need a 1k minimum for Opus and a 2k minimum for Haiku.

So these minimums appear to be deliberately tuned thresholds: below them, the latency and cost payoff from caching is negligible.
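One way to see this in practice: the response's `usage` field reports how many prompt tokens were served from cache. Below is a minimal sketch, assuming the `prompt_tokens_details.cached_tokens` shape of the Chat Completions usage payload; the sample dict is illustrative, not a real API response.

```python
# Hypothetical helper: compute what fraction of the prompt was served
# from cache, given a usage payload shaped like the API's usage object.
def cached_fraction(usage: dict) -> float:
    """Return the fraction of prompt tokens that were cache hits."""
    prompt = usage.get("prompt_tokens", 0)
    cached = usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)
    return cached / prompt if prompt else 0.0

# A ~700-token prompt never reaches the 1024-token minimum, so
# cached_tokens stays 0 no matter how often the prompt repeats.
short_prompt_usage = {
    "prompt_tokens": 700,
    "prompt_tokens_details": {"cached_tokens": 0},
}
print(cached_fraction(short_prompt_usage))  # 0.0
```

If you pad a shared prefix past the minimum (e.g. a longer system prompt), you should start seeing `cached_tokens` > 0 on repeated requests.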
