How does the Prompt Caching Prefix Match work?

Hi yashwantk,

Welcome to the forum :slight_smile:

Scenario 1: No. You must have at least 1024 identical consecutive tokens at the start of the prompt.

UNLESS it is the 980 + 400 case and the first 44 tokens of the user_messages are also the same (980 + 44 = 1024).

Scenario 2: No, for the same reason. You must have at least 1024 identical consecutive tokens.
(The cache fails at token 601 of the System prompt.)

Also, it is the prefix that has to match, so if the first character is different and the rest is the same, you have a miss.

The prefix is System + User combined (the first 1024 tokens of the two together).

“Cache hits are only possible for exact prefix matches within a prompt.”
https://platform.openai.com/docs/guides/prompt-caching
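
Here is a rough sketch of the prefix rule described above, in case it helps. It assumes tokens are just integers (a real prompt would go through the model's tokenizer, and the actual hit/miss decision is made server-side), and the helper names `shared_prefix_len` and `could_hit_cache` are made up for illustration, not part of any API.

```python
from itertools import takewhile

MIN_PREFIX_TOKENS = 1024  # threshold discussed in this thread


def shared_prefix_len(a, b):
    """Count how many leading tokens two sequences have in common."""
    return sum(1 for _ in takewhile(lambda pair: pair[0] == pair[1], zip(a, b)))


def could_hit_cache(prev_prompt, new_prompt, min_tokens=MIN_PREFIX_TOKENS):
    """True if the two prompts share at least min_tokens leading tokens."""
    return shared_prefix_len(prev_prompt, new_prompt) >= min_tokens


# Stand-in token IDs (assumption: one integer per token).
system = list(range(980))                        # 980-token system prompt
user_a = list(range(5000, 5400))                 # 400-token user message
user_b = user_a[:44] + list(range(9000, 9356))   # same first 44 tokens, then different
user_c = list(range(7000, 7400))                 # different from its first token

prompt_a = system + user_a   # System + User are treated as one sequence
prompt_b = system + user_b
prompt_c = system + user_c
prompt_d = [-1] + prompt_a[1:]  # only the very first token differs

print(shared_prefix_len(prompt_a, prompt_b), could_hit_cache(prompt_a, prompt_b))
print(shared_prefix_len(prompt_a, prompt_c), could_hit_cache(prompt_a, prompt_c))
print(shared_prefix_len(prompt_a, prompt_d), could_hit_cache(prompt_a, prompt_d))
```

In this toy setup the 980 + 44 case reaches the 1024-token threshold, the 980-only case falls short, and changing just the first token leaves nothing in common even though everything after it is identical.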