Batch API errors: max_tokens is too large

I fixed the first error by reducing max_tokens from 100K to 86K, and now it says 4096?
I'm confused. This is the 128K model gpt-4-turbo-2024-04-09.
I'm processing 7 papers, each 20K to 70K tokens.
I have one system prompt and two user prompts; the second user prompt is the specific article text.
Looking for a working example with papers. Thanks!

{"error": {"message": "This model's maximum context length is 128000 tokens. However, you requested 141226 tokens (41226 in the messages, 100000 in the completion). Please reduce the length of the messages or completion.", "type": "invalid_request_error", "param": "messages", "code":

max_tokens is too large: 86000. This model supports at most 4096 completion tokens, whereas you provided 86000

Welcome to the Forum!

max_tokens refers to the number of output tokens, which is capped at 4096 for all of the recent models. So the value you set for max_tokens must be 4096 or lower.

128,000, on the other hand, refers to the total context window, which is the sum of input and output tokens.

In order to avoid the error you are experiencing, you have to ensure that your input tokens do not exceed roughly 124k (that is, 128k minus the number of output tokens you are looking to produce).
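To make the arithmetic concrete, here is a minimal sketch in Python (the constants match gpt-4-turbo-2024-04-09; the helper name is just for illustration):

```python
# The 128k context window of gpt-4-turbo-2024-04-09 is shared between
# input (messages) and output (completion); the completion itself is
# capped at 4096 tokens regardless of how much input space is left.
CONTEXT_WINDOW = 128_000
MAX_COMPLETION = 4_096

def max_input_tokens(desired_output: int) -> int:
    """Largest prompt size that still leaves room for the requested completion."""
    desired_output = min(desired_output, MAX_COMPLETION)
    return CONTEXT_WINDOW - desired_output

print(max_input_tokens(4_096))   # reserving the full completion leaves 123904
print(max_input_tokens(2_000))   # a smaller reservation allows 126000
```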

How do I verify that my input tokens do not exceed 124k?
And if max_tokens is capped at 4096 anyway, why do I even have to specify it?

You technically don't have to specify max_tokens; it's only needed if you want to set a hard limit on the output tokens (this comes at the risk of the output being cut off mid-response).
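If you do leave max_tokens set and the model runs out of room, the response records it: finish_reason is "length" when the completion was cut off at the token limit, versus "stop" for a natural ending. A small sketch, using hand-built response dicts for illustration:

```python
def was_truncated(response: dict) -> bool:
    """True if the completion stopped because it hit the token limit."""
    return response["choices"][0]["finish_reason"] == "length"

# Hand-built fragments mimicking the Chat Completions response shape:
cut_off = {"choices": [{"finish_reason": "length"}]}
complete = {"choices": [{"finish_reason": "stop"}]}

print(was_truncated(cut_off))   # True
print(was_truncated(complete))  # False
```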

Check the OpenAI cookbook for the Python script for counting tokens, then just add that step to your script:

Error processing 2404.18616v1.txt: Unknown encoding cl100k_base. Plugins found:
Could not find encoding for model gpt-4. Using 'cl100k_base' as a fallback. Error: Unknown encoding cl100k_base. Plugins found:

import tiktoken

def num_tokens_from_string(string, model="gpt-4"):
    """Returns the number of tokens in a text string according to the model encoding."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except Exception as e:  # catch all exceptions, as encoding_for_model may raise various types
        print(f"Could not find encoding for model {model}. Using 'cl100k_base' as a fallback. Error: {e}")
        encoding = tiktoken.get_encoding("cl100k_base")
    num_tokens = len(encoding.encode(string))
    return num_tokens

This post has a quick script I wrote to count per-example tokens on a JSONL file, in the process validating the file format. It does not deal with functions.
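A per-example counter along those lines can be sketched as follows; it assumes the Batch API JSONL layout (each line wraps its request in a "body" object holding the "messages"), and takes the tokenizer as a callable so it works with any counter, e.g. the num_tokens_from_string function above:

```python
import json

def per_example_tokens(path, count_tokens):
    """Yield (line_number, total message tokens) for each example in a JSONL file.

    json.loads doubles as format validation: a malformed line raises here.
    """
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, 1):
            example = json.loads(line)
            # Batch files nest the request under "body"; plain chat-format
            # JSONL keeps "messages" at the top level, so fall back to that.
            messages = example.get("body", example).get("messages", [])
            total = sum(count_tokens(m.get("content", "")) for m in messages)
            yield line_no, total
```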

You can omit the max_tokens parameter from API calls, but then the AI may produce output up to the maximum, or the call may accept an input so large that little context length remains to form an answer.

A typical max_tokens setting for a response is 2000: the API returns an error if the input plus the output reservation exceeds the model's context length, which ensures enough space is available to form a complete response.


So my small 7-file batch just finished.

The first task did not execute properly, returning only 14 bytes.

How do I make sure the first one executes? Or do I just add a dummy one?
Batch 2 did not complete Task 3: I asked for 20 and it gave me only 10.

Thanks, it worked. Apparently for Pycompiler you need to specify the tokenizer include too.