Gpt-4-1106-preview 16385 max context tokens? (not output, total)

kyle.boddy · December 12, 2023, 7:10am

gpt-4-1106-preview shows a context window of 128k tokens on the API docs, but I am getting the following error when hitting the API:

This model’s maximum context length is 16385 tokens. However, your messages resulted in 18572 tokens (18487 in the messages, 85 in the functions). Please reduce the length of the messages or functions.

I am on usage tier 4 and am not sure why this is happening, since my requested output tokens are well under 4096. I could not find an answer when searching here.
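For reference, the arithmetic in that error can be checked before sending a request. This is only a minimal sketch: `fits_context` is a hypothetical helper, not part of the openai package, and it assumes the token counts are already known and that reserved output tokens count against the same window.

```python
def fits_context(message_tokens: int, function_tokens: int,
                 context_limit: int, max_output_tokens: int = 0) -> bool:
    """Return True when the prompt plus reserved output fits in the window.

    Hypothetical helper for illustration; the token counts must come from
    elsewhere (for example, from the API's own error message).
    """
    total = message_tokens + function_tokens + max_output_tokens
    return total <= context_limit


# Numbers from the error above: 18487 + 85 = 18572 tokens.
print(fits_context(18487, 85, context_limit=16385))   # False
print(fits_context(18487, 85, context_limit=128000))  # True
```

With the numbers reported above, the request would fit comfortably in a 128k window, so the 16385 limit in the error is the real puzzle, not the prompt size.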

_j · December 12, 2023, 8:54am

I just tried this out:

trial 1: expected error, expected message

Error code: 400 - {'error': {'message': "This model's maximum context length is 128000 tokens. However, your messages resulted in 131079 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
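If you want to log which limit the API is actually applying, the two numbers can be pulled out of that message text. A rough sketch only: `parse_context_error` is a hypothetical helper, and it assumes the wording matches the messages quoted in this thread, which is not a guaranteed format.

```python
import re


def parse_context_error(message: str):
    """Extract (limit, used) token counts from a context-length error message.

    Returns None when the message does not match the wording assumed here.
    """
    match = re.search(
        r"maximum context length is (\d+) tokens.*?resulted in (\d+) tokens",
        message,
        re.DOTALL,
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


example = (
    "This model's maximum context length is 128000 tokens. However, "
    "your messages resulted in 131079 tokens. Please reduce the length "
    "of the messages."
)
limit, used = parse_context_error(example)
print(limit, used, used - limit)  # 128000 131079 3079
```

Comparing the parsed limit with the documented window for the model you think you are calling is a quick way to spot a mismatch like the one in the first post.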

Here is code set to send 18490 tokens in the messages and 27 in the functions. It is set to gpt-3.5-turbo, which will give the expected error. Switch the commented lines from gpt3 to gpt4 if you want to pay $0.17 a test (nobody’s paying my API bill…)

from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

tools = [
  {
    "type": "function",
    "function": {
      "name": "disable",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
          },
        },
      },
    }
  }
]

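# "!@" encodes to two tokens, so the repeat count is half the token count.
msg = "!@" * (2**16)  # 131072 tokens: over the advertised 128k window
msg = "!@" * 9242  # 18484 tokens: over the 16385 limit, well under 128k
print(len(msg))  # length in characters, not tokens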

completion = None
try:
    completion = client.chat.completions.create(
      # Swap the two model lines below to run the test against gpt-4-1106-preview.
      # model="gpt-4-1106-preview", max_tokens=1,
      model="gpt-3.5-turbo", max_tokens=1,
      messages=[
        {"role": "system", "content": msg},
      ],
      tools=tools,
      tool_choice="auto"
    )
except Exception as e:
    print(e)
if completion:
    print(completion.choices[0].message)

You can also comment out the msg = "!@" * 9242 line, which will send more than the full advertised context length and produce the top message I show.

kyle.boddy · December 12, 2023, 9:44am

Interesting - tried it on another box with version 1.x of the OpenAI Python package and it works. Our production machine is on the 0.x version, and upgrading the package introduces breaking changes. Guess we’ll have to refactor, get that going shortly, and try again there.
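For anyone comparing machines the same way, here is a small sketch for confirming which major version of the package is installed. The helper names are hypothetical; `importlib.metadata.version` is from the standard library. It only reports the version and does not explain why the two boxes saw different limits.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string: str) -> int:
    """Return the leading number of a dotted version string, e.g. "1.3.8" -> 1."""
    return int(version_string.split(".")[0])


def installed_openai_major():
    """Return the installed openai major version, or None if it is not installed."""
    try:
        return major_version(version("openai"))
    except PackageNotFoundError:
        return None


major = installed_openai_major()
if major is None:
    print("openai is not installed")
elif major >= 1:
    print("1.x client: OpenAI().chat.completions.create(...)")
else:
    print("0.x client: older openai.ChatCompletion.create(...) interface")
```

Running this on both the working box and the production machine confirms which call style each environment expects before starting the refactor.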
