gpt-3.5-turbo-instruct response being truncated, finish_reason given: "length"

I’m switching from gpt-3.5-turbo-0613 to gpt-3.5-turbo-instruct and I’m running into an issue.

Here’s an example prompt:

{
  "model": "gpt-3.5-turbo-instruct",
  "prompt": "Using the following statements as your corpus, write a clear, actionable, and factual sentence -- and one sentence only -- using proper grammar, punctuation, and capitalization. Use no more than twenty words and should be as concise as practical. Avoid referencing the task or the text in your response. For example, if the statements are about the benefits of regular exercise, a desirable summary could be: 'Engaging in regular exercise improves physical and mental health, enhancing overall well-being.':\n\nYou get challenging projects to work on and you have a chance to make a real impact.\nDynamic nature of work and challenging projects."
}

and here’s the response:

{
  "id": "cmpl-83XGzqj3iDEkZLgZDzTXbcRuPcSXt",
  "object": "text_completion",
  "created": 1695853577,
  "model": "gpt-3.5-turbo-instruct",
  "choices": [
    {
      "text": "\n\nEngage in dynamic, impactful work on challenging projects to hone skills and make",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 122,
    "completion_tokens": 16,
    "total_tokens": 138
  }
}

What’s happening is that the text field is consistently truncated, with finish_reason reported as "length". This was not a problem with gpt-3.5-turbo-0613 using the same prompt. I’ve gone over the documentation but am not seeing anything I should be doing differently. Am I missing something?

With the ChatCompletions endpoint, the default max_tokens is effectively unlimited (bounded only by the model’s context length).
With the Completions endpoint, the default max_tokens is 16.
You need to set max_tokens to the maximum length of the desired output (which is reserved from the context length) if you expect more than a few words.
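
The minimal fix is just passing max_tokens explicitly (a sketch using the pre-1.0 openai Python library, as in the fuller example below; the 200-token budget is an arbitrary placeholder you would size for your use case):

    response = openai.Completion.create(
        model      = "gpt-3.5-turbo-instruct",
        prompt     = prompt,  # your prompt string
        max_tokens = 200,     # without this, the Completions default of 16 truncates output
    )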

An example with options spelled out:

    import openai

    prompt = "Summarize the following statements in one sentence: ..."  # your prompt text

    response = openai.Completion.create(
        prompt      = prompt,
        model       = "gpt-3.5-turbo-instruct",
        temperature = 0.6,                 # start at 0.6
        max_tokens  = 512,                 # maximum response length (the default is 16)
        stop        = ["\n\n"],            # optional stop sequences; often useful for completions
        top_p       = 0.95,                # reducing from 1 to 0.95 can be useful
        presence_penalty  = 0.0,           # penalties range from -2.0 to 2.0
        frequency_penalty = 0.0,           # frequency penalty accumulates per repeated token
        n           = 1,                   # number of completions to generate
        stream      = False,               # set True to stream tokens as they are generated
        logit_bias  = {"100066": -1},      # example: discourage the '~\n\n' token
        user        = "site_user-id",      # optional end-user tracking
    )
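
You can also detect truncation programmatically: finish_reason is "length" when the output hit the max_tokens budget and "stop" when the model finished on its own. A minimal sketch, assuming the non-streaming call above:

    choice = response["choices"][0]
    if choice["finish_reason"] == "length":
        # the completion hit the max_tokens budget; raise it or shorten the prompt
        print("warning: completion truncated")
    print(choice["text"].strip())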

I see. Thank you very much. I must have missed that.