Why is GPT-4-1106-preview input still limited to 32K? (Tier 2)

Hi, I keep getting this error:

openai.BadRequestError: Error code: 400 - {'error': {'message': '1 validation error for Request\nbody -> content\n ensure this value has at most 32768 characters (type=value_error.any_str.max_length; limit_value=32768)', 'type': 'invalid_request_error', 'param': None, 'code': None}}

I used to get this 32K limit message while I was in Tier-1, but now that my account has been upgraded to Tier-2 I still get the same limit message.

I thought the limit would increase as the tier goes up? Am I missing something?

Hi! Are you absolutely sure you're calling the right model?

Here’s my code that I’ve been using.

I am using the Assistants API, calling an assistant_id that I pre-configured on OpenAI's website. The Assistant uses the base model gpt-4-1106-preview. Any ideas…?

```
import time
import openai

openai.api_key = '...'

assistant_id = '...'


def create_thread(ass_id, prompt):
    # Get the Assistant (not needed here; runs accept the id directly)
    # assistant = openai.beta.assistants.retrieve(ass_id)

    # Create a thread
    thread = openai.beta.threads.create()
    my_thread_id = thread.id

    # Add the user message to the thread
    message = openai.beta.threads.messages.create(
        thread_id=my_thread_id,
        role="user",
        content=prompt,
    )

    # Start a run with the pre-configured assistant
    run = openai.beta.threads.runs.create(
        thread_id=my_thread_id,
        assistant_id=ass_id,
    )

    return run.id, thread.id


def check_status(run_id, thread_id):
    run = openai.beta.threads.runs.retrieve(
        thread_id=thread_id,
        run_id=run_id,
    )
    return run.status


my_run_id, my_thread_id = create_thread(assistant_id, complete_prompt)

# Poll until the run finishes
status = check_status(my_run_id, my_thread_id)
while status != "completed":
    time.sleep(2)
    status = check_status(my_run_id, my_thread_id)

response = openai.beta.threads.messages.list(
    thread_id=my_thread_id
)

if response.data:
    print(response.data[0].content[0].text.value)
```

This limitation (and many others) comes from the Assistants framework itself, not your usage tier: the characters you can place in a single message are counted and capped.

Use Chat Completions instead. There you can place whatever data you want, up to the model's 128k context length (while leaving some of that for the response).

Try gpt-4-0125-preview. You'll get fewer errors when employing functions.

Use functions, not tools. Save tokens, save sanity. Reject tool language injection from OpenAI.
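For reference, the two request shapes differ only in the wrapper around the schema, and the extra wrapper is what costs tokens (the `get_weather` schema below is a made-up example):

```
# A made-up function schema for illustration.
weather_schema = {
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# Legacy "functions" parameter: the schema is passed directly.
functions_kwargs = {"functions": [weather_schema]}

# Newer "tools" parameter: the same schema, wrapped in a type
# envelope that adds extra tokens per tool.
tools_kwargs = {"tools": [{"type": "function", "function": weather_schema}]}
```

Either dict can be unpacked into `chat.completions.create(**kwargs)` alongside `model` and `messages`.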
