Is this a problem with cached tokens?

Thanks for trying that. I'm wondering if this is actually a bug. If OpenAI keyed their cache on the user prompt alone, instead of on the system prompt plus the user prompt, we would expect to see exactly this behavior. I doubt they do that explicitly, but some path through their caching code may end up ignoring the system prompt. Either way, it's anomalous behavior and it does create problems for us.
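
One way to narrow this down would be a test like the sketch below: send the same long user prompt under two different system prompts and compare the reported cache hits. If caching keys on the full prefix (system plus user), changing the system prompt should yield zero cached tokens on the second call. This assumes the Chat Completions API and the `usage.prompt_tokens_details.cached_tokens` field; the model name and prompt contents are placeholders, and the prompt needs to exceed the ~1024-token caching threshold.

```python
# Sketch: does the cache ignore the system prompt?
# Send identical user prompts with two different system prompts and
# compare reported cached tokens. Model and prompts are placeholders.
from openai import OpenAI

client = OpenAI()

def cached_tokens(system_prompt: str, user_prompt: str) -> int:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    details = resp.usage.prompt_tokens_details
    return details.cached_tokens if details else 0

# Long enough to cross the caching threshold (placeholder content).
long_user = "Review this document: " + ("lorem ipsum " * 600)

# Prime the cache under system prompt A, then query under system prompt B.
cached_tokens("You are a terse assistant.", long_user)
hits = cached_tokens("You are a verbose assistant.", long_user)

# If the cache correctly includes the system prompt, the changed prefix
# should mean zero cached tokens here; a nonzero count would suggest
# the system prompt is being ignored somewhere in the cache lookup.
print(f"cached tokens with different system prompt: {hits}")
```

If the second call reports nonzero cached tokens despite the different system prompt, that would support the theory above.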
