Hi there!
We’re having difficulty tracking token usage for about half of our API requests. The Responses API returns `{"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}` in the `usage` field when a request includes all three of:

- `previous_response_id` (continuing a stored conversation that contains tool calls)
- `context_management` (e.g. `[{"type": "compaction", "compact_threshold": 200000}]`)
- `tools` (any tool definitions)
The response itself is correct — the model reasons, makes tool calls, and produces output — but the reported usage is zero. Removing any one of the three parameters causes usage to report correctly.
Reproduction Steps
Tested with gpt-5.2 via the OpenAI Ruby gem. The bug is deterministic.
```ruby
require "openai"

client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])

tool = {
  type: "function",
  name: "get_weather",
  description: "Get weather for a location",
  parameters: {
    type: "object",
    properties: { location: { type: "string" } },
    required: ["location"],
    additionalProperties: false
  },
  strict: true
}

# Step 1: Create a stored conversation with a tool call
r1 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: "What is the weather in Auckland?",
  store: true,
  tools: [tool],
  tool_choice: "auto"
})
# => usage: {"input_tokens"=>49, "output_tokens"=>34, "total_tokens"=>83}

tool_call = r1["output"].find { |o| o["type"] == "function_call" }

# Step 2: Return the tool result
r2 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: [{ type: "function_call_output", call_id: tool_call["call_id"], output: "Sunny 22C" }],
  store: true,
  previous_response_id: r1["id"],
  tools: [tool]
})
# => usage: {"input_tokens"=>99, "output_tokens"=>23, "total_tokens"=>122}

# Step 3: Continue with previous_response_id + context_management + tools
r3 = client.responses.create(parameters: {
  model: "gpt-5.2",
  input: "Thanks! What about Wellington?",
  store: true,
  previous_response_id: r2["id"],
  tools: [tool],
  context_management: [{ type: "compaction", compact_threshold: 200_000 }]
})
# => usage: {"input_tokens"=>0, "output_tokens"=>0, "total_tokens"=>0}
# ^^^^^^^^ BUG: response contains real output but usage is zero
```
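Until this is fixed, one mitigation we’re considering is to attach `context_management` only once the running conversation actually approaches the compaction threshold, so most turns keep accurate usage. A minimal sketch — the `maybe_with_compaction` helper and the 80% cut-off are our own convention, not gem API:

```ruby
# Mitigation sketch: only attach context_management once the running token
# total nears the compaction threshold, so earlier turns keep real usage.
# maybe_with_compaction and the 0.8 cut-off are hypothetical, not gem API.
COMPACT_THRESHOLD = 200_000

def maybe_with_compaction(params, running_total_tokens)
  return params if running_total_tokens < COMPACT_THRESHOLD * 0.8

  params.merge(
    context_management: [{ type: "compaction", compact_threshold: COMPACT_THRESHOLD }]
  )
end
```

This accepts zero-usage reports only on the (rare) turns where compaction is genuinely needed.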
Isolation Matrix
Starting from Step 2 above, Step 3 was repeated with different parameter combinations:
| previous_response_id | context_management | tools | total_tokens |
|---|---|---|---|
| yes | yes | yes | 0 |
| yes | yes | no | 7,617 |
| yes | no | yes | 130 |
| no | yes | yes | non-zero |
The bug requires all three parameters together on a conversation that contains tool-call history.
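The matrix reduces to a simple predicate, which we now use in our own telemetry to flag requests whose reported usage can’t be trusted. The helper is ours, not part of the gem:

```ruby
# Predicate encoding the matrix above: usage zeroes out only when all three
# parameters are present on the request.
# likely_zero_usage? is our own helper, not gem or API behaviour.
def likely_zero_usage?(params)
  !params[:previous_response_id].nil? &&
    !params[:context_management].nil? &&
    !Array(params[:tools]).empty?
end
```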
Expected Behaviour
The usage field should report actual token counts regardless of whether context_management is present. The model is clearly processing tokens (it produces output), so the usage should reflect that.
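In the meantime, our accounting code treats an all-zero `usage` on a response that clearly produced output as missing data rather than recording a literal zero. A defensive sketch — the hash shapes follow the response bodies above, but `usage_or_nil` itself is hypothetical:

```ruby
# Treat all-zero usage on a response with real output as missing data rather
# than a genuine zero, so downstream cost tracking can handle it explicitly.
# usage_or_nil is our own helper; hash keys match the Responses API payloads.
def usage_or_nil(response)
  usage = response["usage"] || {}
  totals = usage.values_at("input_tokens", "output_tokens", "total_tokens")
  has_output = !Array(response["output"]).empty?
  return nil if has_output && totals.all? { |t| t.to_i.zero? }

  usage
end
```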
Environment
- Model: `gpt-5.2`
- API: Responses API (`/v1/responses`)
- Client: `openai` Ruby gem
- Date observed: 2026-03-03
- Reproducible: 100% deterministic