Responses API returns zero usage when combining `previous_response_id` + `context_management` + `tools`

Hi there 👋

We’re having difficulty tracking token usage for about half of our API requests. The Responses API returns `{"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}` in the `usage` field when a request includes all three of:

  1. previous_response_id (continuing a stored conversation that contains tool calls)
  2. context_management (e.g. [{"type": "compaction", "compact_threshold": 200000}])
  3. tools (any tool definitions)

The response itself is correct — the model reasons, makes tool calls, and produces output — but the reported usage is zero. Removing any one of the three parameters causes usage to report correctly.

Reproduction Steps

Tested with gpt-5.2 via the OpenAI Ruby gem. The bug is deterministic.

require "openai"
client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])

tool = {
  type: "function",
  name: "get_weather",
  description: "Get weather for a location",
  parameters: {
    type: "object",
    properties: { location: { type: "string" } },
    required: ["location"],
    additionalProperties: false
  },
  strict: true
}

# Step 1: Create a stored conversation with a tool call
r1 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: "What is the weather in Auckland?",
  store: true,
  tools: [tool],
  tool_choice: "auto"
})
# => usage: {"input_tokens"=>49, "output_tokens"=>34, "total_tokens"=>83}

tool_call = r1["output"].find { |o| o["type"] == "function_call" }

# Step 2: Return the tool result
r2 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: [{ type: "function_call_output", call_id: tool_call["call_id"], output: "Sunny 22C" }],
  store: true,
  previous_response_id: r1["id"],
  tools: [tool]
})
# => usage: {"input_tokens"=>99, "output_tokens"=>23, "total_tokens"=>122}

# Step 3: Continue with previous_response_id + context_management + tools
r3 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: "Thanks! What about Wellington?",
  store: true,
  previous_response_id: r2["id"],
  tools: [tool],
  context_management: [{ type: "compaction", compact_threshold: 200_000 }]
})
# => usage: {"input_tokens"=>0, "output_tokens"=>0, "total_tokens"=>0}
#    ^^^^^^^^ BUG: response contains real output but usage is zero
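Until this is fixed, a small guard at the call site can make the zero-usage signature fail loudly instead of silently under-counting tokens. A minimal sketch (`usage_or_raise` is a hypothetical helper of ours, not part of the gem):

```ruby
# Hypothetical guard, not part of the openai gem: raise when a response
# reports zero total_tokens so billing pipelines notice affected calls
# instead of recording them as free.
def usage_or_raise(response)
  usage = response["usage"] || {}
  total = usage["total_tokens"].to_i
  raise "zero usage reported for response #{response["id"].inspect}" if total.zero?
  usage
end
```

We wrap each `client.responses.create` result with this so affected request IDs surface in our logs.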

Isolation Matrix

Starting from Step 2 above, Step 3 was repeated with different parameter combinations:

| previous_response_id | context_management | tools | total_tokens |
| --- | --- | --- | --- |
| yes | yes | yes | 0 |
| yes | yes | no | 7,617 |
| yes | no | yes | 130 |
| no | yes | yes | non-zero |

The bug requires all three parameters together on a conversation whose history contains tool calls.
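The matrix was produced by toggling the three parameters on the Step 3 request. A sketch of the parameter construction (`build_step3_params` is a hypothetical helper; `prev_id` and `tool` are the values from the snippet above, and each resulting hash is passed to `client.responses.create` as before):

```ruby
# Hypothetical helper: builds the Step 3 parameter hash for one row of
# the matrix, including each of the three parameters only when its flag
# is on.
def build_step3_params(prev_id, tool, with_prev:, with_cm:, with_tools:)
  params = {
    model: "gpt-5.2",
    input: "Thanks! What about Wellington?",
    store: true
  }
  params[:previous_response_id] = prev_id if with_prev
  params[:tools] = [tool] if with_tools
  if with_cm
    params[:context_management] = [{ type: "compaction", compact_threshold: 200_000 }]
  end
  params
end
```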

Expected Behaviour

The usage field should report actual token counts regardless of whether context_management is present. The model is clearly processing tokens (it produces output), so the usage should reflect that.

Environment

  • Model: gpt-5.2
  • API: Responses API (/v1/responses)
  • Client: openai Ruby gem
  • Date observed: 2026-03-03
  • Reproducible: 100% deterministic

Hi and welcome to the community!

Thank you for raising this with very helpful steps to reproduce the issue!
I was immediately able to reproduce the core problem: previous_response_id + context_management + tools returned usage.total_tokens = 0.

During my testing I also found another variant that returns zero token usage: context_management + tools without previous_response_id.

Will flag this to the team!


Hello there,

We tried reproducing the usage-reporting issue you described, but we were not able to reproduce it on our side with the current setup.

If you have any request IDs for affected calls, please send them over and we can inspect those directly. If you also have a minimal script that still reproduces the issue consistently, that would be helpful as well and we can try to reproduce it from that exact example.


Hi!
The error stopped occurring a few days after this was escalated.


Thanks! I can confirm this is no longer an issue.
