Using fixed prompt templates with 4000+ tokens did not trigger the half-price caching discount

I used two fixed templates, A (about 700 tokens) and B (about 3300 tokens), for 1000 interactions in total, sent in batches of 100 requests. None of the requests triggered the half-price caching discount. Could someone clarify whether the caching discount requires multi-turn interactions to be triggered? My interactions were single-turn: about 1000 independent requests, each with one response.
knowledge_system: about 3300 tokens
mapping_rules: about 700 tokens
problem_data: about 1000 tokens


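import json
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)

# Method of the prompt-building class (class definition omitted)
def format_prompt_for_model(self, prompt: Dict[str, Any]) -> str: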
    try:
        problem_uuid = prompt['problem_data']['problem_uuid']

        formatted_prompt = (
            "Please complete the knowledge mapping task according to the following requirements. Your response must:\n"
            "1. Strictly output in JSON format\n"
            "2. Include all required fields\n"
            "3. Do not add any extra text\n"
            "4. Include the problem ID in the returned JSON\n\n"  # Clearly specify requirements
            f"Problem ID: {problem_uuid}  // Use this ID in the output JSON\n\n"  # Explicitly instruct to use this ID
            f"Knowledge System:\n{prompt['knowledge_system']}\n\n"
            f"Mapping Rules:\n{prompt['mapping_rules']}\n\n"
            f"Problem Data:\n{json.dumps(prompt['problem_data'], ensure_ascii=False, indent=2)}"
        )

        return formatted_prompt
    except Exception as e:
        logger.error(f"Error formatting prompt: {str(e)}")
        raise

Hi there!

Is the problem ID unique for every request?

Yes, there are 1000 problem UUIDs, a different one for each request.

OK, so if these values are different in every request, that currently interferes with the caching. Caching applies to the identical leading part of the prompt, and that static part must be at least 1024 tokens long. In your template the Problem ID comes right after the short instruction block, so the shared prefix ends there, well short of 1024 tokens. Place all inputs that remain entirely unchanged first, and put the variable parts, such as the Problem ID, towards the end.
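
To make the reordering concrete, here is a minimal sketch based on the function posted above. The names `build_static_prefix` and `format_prompt_cache_friendly` are made up for illustration, and the sketch assumes `knowledge_system` and `mapping_rules` are exactly the same strings in every request. It only shows the ordering; whether a given request actually gets the discount still depends on the provider.

```python
import json
import os
from typing import Any, Dict


def build_static_prefix(knowledge_system: str, mapping_rules: str) -> str:
    """Hypothetical helper: the part of the prompt that never changes."""
    return (
        "Please complete the knowledge mapping task according to the following requirements. Your response must:\n"
        "1. Strictly output in JSON format\n"
        "2. Include all required fields\n"
        "3. Do not add any extra text\n"
        "4. Include the problem ID in the returned JSON\n\n"
        f"Knowledge System:\n{knowledge_system}\n\n"
        f"Mapping Rules:\n{mapping_rules}\n\n"
    )


def format_prompt_cache_friendly(prompt: Dict[str, Any]) -> str:
    """Static content first, per-request content (ID and data) last."""
    problem_data = prompt["problem_data"]
    static_prefix = build_static_prefix(
        prompt["knowledge_system"], prompt["mapping_rules"]
    )
    variable_suffix = (
        f"Problem ID: {problem_data['problem_uuid']}  // Use this ID in the output JSON\n\n"
        f"Problem Data:\n{json.dumps(problem_data, ensure_ascii=False, indent=2)}"
    )
    return static_prefix + variable_suffix


if __name__ == "__main__":
    # Placeholder inputs, only to show that two requests share a long prefix.
    shared = {"knowledge_system": "K" * 200, "mapping_rules": "M" * 50}
    first = format_prompt_cache_friendly(
        {**shared, "problem_data": {"problem_uuid": "id-1", "question": "a"}}
    )
    second = format_prompt_cache_friendly(
        {**shared, "problem_data": {"problem_uuid": "id-2", "question": "b"}}
    )
    # Characters the two prompts have in common from the start.
    print(len(os.path.commonprefix([first, second])))
```

With this ordering, everything up to "Problem ID:" is identical across requests, so the roughly 4000 tokens of knowledge system and mapping rules sit in the shared prefix instead of after the first variable value.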


Thank you very much.
I will try it.


I put the problem UUID below the fixed prompt, and it works. Thank you.
