Does gpt-4o-mini-search-preview have a completion token limit of around 1530?

Hi everyone,

I’ve been testing gpt-4o-mini-search-preview and gpt-4o-search-preview, and whenever I try to generate a longer completion, the completion tokens always seem to be capped at around 1530-1532, which causes the content to be cut off. I’ve tried this with both a normal text response and a json_schema response.

Using the exact same options with gpt-4o-mini does not have this issue.

Any ideas if this is the expected behaviour?

Hi @herman.schutte
I am still facing the same issue. Did you ever figure it out?

Did you try changing the search context size? It’s set via web_search_options (there’s a minimal sketch after the list below).

Available values:

  • high: Most comprehensive context, highest cost, slower response.
  • medium (default): Balanced context, cost, and latency.
  • low: Least context, lowest cost, fastest response, but potentially lower answer quality.
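
For reference, a minimal sketch of where that option goes on a Chat Completions request (the prompt string here is just a placeholder):

from openai import OpenAI

client = OpenAI()

# Swap "low" / "medium" / "high" here to compare output length, cost, and latency.
response = client.chat.completions.create(
  model="gpt-4o-mini-search-preview",
  messages=[{"role": "user", "content": "placeholder prompt"}],
  web_search_options={"search_context_size": "high"},
)
print(response.usage.completion_tokens)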

After some experimenting, I can confirm that it seems to be truncating the results even with search_context_size = high.

It is difficult to get it to produce more than ~1500 completion tokens; it keeps summarizing the results aggressively to fit within that limit.

Request example

from openai import OpenAI

client = OpenAI()

# Placeholder for the actual prompt (a question that should produce a long answer).
input_text = "..."

response = client.chat.completions.create(
  model="gpt-4o-mini-search-preview",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": input_text,
        }
      ]
    },
  ],
  response_format={
    "type": "text"
  },
  web_search_options={
    "search_context_size": "high",
    "user_location": {
      "type": "approximate",
      "approximate": {
        "country": "US"
      }
    }
  },
  # Deliberately generous budget; the completion is still cut off well below it.
  max_completion_tokens=15000,
)

print(response.choices[0].message.content)
print(response.usage)

Completion tokens: 1935. Response length (text): 6765 characters.

Completion response details
{
    "id": "chatcmpl-61afc24c-6f71-42ae-a192-904065637863",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "(redacted)......**[Alignment faking in large language models](https://www.anthropic.com/news/alignment-f ",  <---- truncated
                "refusal": null,
                "role": "assistant",
                "annotations": []
            }
        }
    ],
    "created": 1749161600,
    "model": "gpt-4o-mini-search-preview-2025-03-11",
    "object": "chat.completion",
    "system_fingerprint": "",
    "usage": {
        "completion_tokens": 1935,
        "prompt_tokens": 105,
        "total_tokens": 2040,
        "completion_tokens_details": {
            "accepted_prediction_tokens": 0,
            "audio_tokens": 0,
            "reasoning_tokens": 0,
            "rejected_prediction_tokens": 0
        },
        "prompt_tokens_details": {
            "audio_tokens": 0,
            "cached_tokens": 0
        }
    }
}
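
One thing worth noting in the response above: finish_reason still comes back as "stop" even though the text ends mid-URL, so the truncation can’t be detected from finish_reason alone. A rough heuristic for flagging suspect responses might look like the sketch below (the 50% threshold and the ending markers are just assumptions based on the output shown in this thread, not anything guaranteed by the API):

# Flag responses that stop far below the requested budget while ending on an
# obviously unfinished fragment (e.g. the open markdown link in the response above).
def looks_truncated(response, requested_max: int) -> bool:
    text = (response.choices[0].message.content or "").rstrip()
    used = response.usage.completion_tokens
    # Ends on something other than normal terminal punctuation.
    unfinished_ending = not text.endswith((".", "!", "?", ")", '"'))
    return used < requested_max * 0.5 and unfinished_ending

print(looks_truncated(response, requested_max=15000))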

I’m also facing this issue. The output seems to be truncated at a certain limit, and increasing the output token limit is not respected. Any potential fix here?