How to reduce token usage from repeating the system prompt for each item in the Batch API

Howdy! I was just reading through this documentation about the Batch API (https://platform.openai.com/docs/guides/batch).

Is there a way for me to send the system instructions to OpenAI just once (assuming they are the same for each item in the batch) so that I can avoid using tokens that will simply be duplicates?

This may be what you are looking for:

https://platform.openai.com/docs/guides/prompt-caching
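To take advantage of prompt caching, the key is that caching matches on identical prompt *prefixes*, so the shared system message should be byte-for-byte the same in every request, with the variable content after it. Here is a minimal sketch of building a Batch API input file that way; the model name, prompt text, and ticket data are placeholders, not from the docs:

```python
import json

# Shared system prompt (placeholder text). Keeping it identical across
# requests, and first in the messages list, gives every request the same
# prefix -- the pattern prompt caching can match on.
SYSTEM_PROMPT = "You are a helpful assistant that classifies support tickets."

def build_batch_line(custom_id: str, user_text: str) -> str:
    """Build one JSONL line in the Batch API request format."""
    request = {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",  # placeholder model name
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},  # identical prefix
                {"role": "user", "content": user_text},        # variable suffix
            ],
        },
    }
    return json.dumps(request)

# Hypothetical batch items.
tickets = ["My order never arrived.", "How do I reset my password?"]
lines = [build_batch_line(f"ticket-{i}", t) for i, t in enumerate(tickets)]

with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines) + "\n")
```

Note that caching only kicks in for prompts above a minimum length (the guide says 1024 tokens), so a short system prompt like the placeholder above would not actually be cached.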


Sweet, this is what I was looking for, thanks!


Note that the Batch API:

  • runs on OpenAI’s schedule
  • likely processes requests in parallel across many instances
  • is already discounted

So essentially, even if OpenAI internally optimizes for system prompts and input it has seen before, you are not offered any additional “cache hit” discount on top of the batch discount.

The full prompt text is always required for the model to understand each request, so it must be included every time.