API | Max Token Error | Tier 4 | Fluctuating between 128000 and 4096

My account is on Tier 4.

I tried using max_tokens=128000 and got the following error:

This model's maximum context length is 128000 tokens. However, you requested 128295 tokens (295 in the messages, 128000 in the completion). Please reduce the length of the messages or completion.

So I used GPT2Tokenizer.from_pretrained("gpt2") to get the token count of my message and subtract it from my max tokens.

I reran it and got the following error:

max_tokens is too large: 127688. This model supports at most 4096 completion tokens, whereas you provided 127688.

It seems the goalposts for my maximum allowable tokens have been moved?

import openai
from transformers import GPT2Tokenizer

sk = "YOUR_API_KEY"   # placeholder; set your real key
temperature = 0.7     # was undefined in my original snippet
model_id = 'gpt-4-1106-preview'

def count_tokens(message):
    # Count tokens with GPT-2's tokenizer (the wrong one, as it turns out)
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    tokens = tokenizer.encode(message)
    return len(tokens)

def chat_gpt_conversation(conversation_log, api_key=sk):
    try:
        # Concatenate message contents to estimate the prompt's token count
        text = ""
        for i in conversation_log:
            text += i['content']
        context_count = count_tokens(text)

        openai.api_key = api_key
        response = openai.ChatCompletion.create(
            model=model_id,
            messages=conversation_log,
            temperature=temperature,
            # Requests all remaining context as output, which fails
            max_tokens=128000 - context_count
        )
        return response

    except Exception as e:
        print(e)
        return None

You’ve exceeded the model’s limitations in two different ways and got two different API errors:

  • First you went over the total context length.
  • Then you exceeded the maximum output that is allowed.

You are counting the tokens wrong.

GPT-4 doesn’t use GPT-2’s tokenizer; it uses a token encoder three generations beyond that, called cl100k_base. The library to use is tiktoken.
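For instance, a drop-in replacement for the count_tokens above (a minimal sketch; note that each chat message also adds a few formatting tokens, so this still slightly undercounts the true prompt size):

import tiktoken

# cl100k_base is the encoding used by gpt-4 and gpt-3.5-turbo models
encoding = tiktoken.get_encoding("cl100k_base")

def count_tokens(message):
    return len(encoding.encode(message))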

Solution: set max_tokens to 4096 or less, as that is all the output that OpenAI allows, and you will need to use jailbreak prompt engineering to get this gimped model to output anywhere near that.

Yes, you read that right: the maximum that gpt-4-turbo models (with 128k context) will output is 4k tokens. For OpenAI, input is near-free to process yet billable; output is what costs compute.
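Concretely, the fix in the code above is one line (a sketch reusing the same variables as the original snippet):

response = openai.ChatCompletion.create(
    model=model_id,
    messages=conversation_log,
    temperature=temperature,
    # Output is capped at 4096 tokens; also leave room for the prompt
    max_tokens=min(4096, 128000 - context_count)
)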


How about this response reported on Reddit? “As an AI developed by OpenAI I aim to follow guidelines and policies that prioritize ethical considerations, user safety, and the responsible use of AI. One of these guidelines restricts me from generating full, complete solutions for complex tasks, especially when they involve multiple advanced technologies like image processing, machine learning, and database management.”


Thank you for taking the time to explain. Greatly appreciated.

Happy programming to you too!


If you want more fun inspiration: imagine sending the current question and a bit of context to gpt-3.5-turbo, along with the size of the input context, and asking it to classify the complexity of the answer needed.

With enough rules, looking at whether you want to chat, code, summarize, or write at length, that AI could make a per-input determination of the best of gpt-3.5-turbo-1106 (16k context but 4k out), gpt-3.5-turbo-16k, GPT-4 (smarter), or gpt-4-preview (turbo), especially when the last can cost over $1 per call.
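A minimal sketch of that idea (the task labels, classifier prompt, and routing table here are all illustrative assumptions, not a tested recipe):

# Hypothetical routing table: task label -> model (illustrative only)
MODEL_BY_TASK = {
    "chat": "gpt-3.5-turbo-1106",
    "summarize": "gpt-3.5-turbo-16k",
    "code": "gpt-4",
    "long_form": "gpt-4-1106-preview",
}

def classify_task(question, context_snippet):
    # Ask the cheap model to label the request with one word
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content":
                "Classify the user's request as exactly one of: "
                "chat, summarize, code, long_form. Reply with the label only."},
            {"role": "user", "content": question + "\n\nContext:\n" + context_snippet},
        ],
        temperature=0,
        max_tokens=5,
    )
    return response["choices"][0]["message"]["content"].strip()

def pick_model(question, context_snippet):
    label = classify_task(question, context_snippet)
    # Fall back to the cheapest model on an unexpected label
    return MODEL_BY_TASK.get(label, "gpt-3.5-turbo-1106")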