Batch API - System Prompt Caching - Is it possible to cache the system prompt from a single batch job and reuse it across multiple batches?

Let’s say I have the two batch jobs below, which use the same “Repetitive long system instructions”. Is it possible to cache those instructions in job 1 and reuse the cached prefix in job 2?

file_1.jsonl
  {
    "custom_id": "request-1",
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
      "model": "gpt-4o-mini",
      "messages": [
        { "role": "system", "content": "Repetative long system instructions" },
        { "role": "user", "content": "Unique content 1" }
      ]
    }
  }

file_2.jsonl
  {
    "custom_id": "request-2",
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
      "model": "gpt-4o-mini",
      "messages": [
        { "role": "system", "content": "Repetative long system instructions" },
        { "role": "user", "content": "Unique content 2" }
      ]
    }
  }
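
For reference, this is roughly how the two files would be submitted as separate batch jobs with the official Python SDK (a minimal sketch; the file names come from the snippets above):

  from openai import OpenAI

  client = OpenAI()

  # Upload each JSONL file and start a separate batch job for it.
  for path in ["file_1.jsonl", "file_2.jsonl"]:
      batch_input = client.files.create(file=open(path, "rb"), purpose="batch")
      batch = client.batches.create(
          input_file_id=batch_input.id,
          endpoint="/v1/chat/completions",
          completion_window="24h",
      )
      print(path, batch.id, batch.status)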

You’d never know if any cache was used.

A batch job is already discounted 50% across the board. There is no need to try to activate a cache, and prompt caching isn’t documented as applying to batch requests at all. At most, sorting the individual requests in the input file so that identical prefixes sit next to each other could make the file cache-friendly, but that mainly benefits OpenAI’s infrastructure rather than your bill.
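
If you did want to arrange a batch file that way anyway, here is a minimal sketch of the sorting idea (the helper name and file names are illustrative, and whether any cache is actually hit remains invisible to you):

  import json

  def sort_batch_file(in_path: str, out_path: str) -> None:
      """Rewrite a batch JSONL file so requests with identical system prompts are adjacent."""
      with open(in_path) as f:
          requests = [json.loads(line) for line in f if line.strip()]

      def prefix_key(req):
          messages = req["body"]["messages"]
          system = next((m["content"] for m in messages if m["role"] == "system"), "")
          # Group by the shared (cacheable) system prefix first, then keep a stable order.
          return (system, req.get("custom_id", ""))

      requests.sort(key=prefix_key)

      with open(out_path, "w") as f:
          for req in requests:
              f.write(json.dumps(req) + "\n")

  # e.g. sort_batch_file("batch_input.jsonl", "batch_input_sorted.jsonl")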

There is also no guarantee of ordering, so one batch job wouldn’t necessarily run before the other. The requests can all run with high parallelization, with no attempt to route them to the same server, which is currently a requirement for OpenAI’s caching mechanism to get a hit. You’d have to wait for a product that doesn’t exist, a “serialized batch with cache”.
