Feels a bit off to post a technical question given what’s just happened, but here we are.
As of today, I keep getting an error when using the Assistants API: no more than 32768 characters are supported in a single message/request body in a thread. The specific error is shown below, and it occurs both in my app and in the Playground.
1 validation error for Request
body → content
  ensure this value has at most 32768 characters (type=value_error.any_str.max_length; limit_value=32768)
This seems rather odd, as GPT-4 Turbo has a much larger context window, and such a limit would be a significant constraint on the use of the Assistants API.
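For reference, here is a minimal sketch that reproduces the error (assuming the openai Python SDK v1 beta thread endpoints; the payload is illustrative):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

thread = client.beta.threads.create()

# Any single message body over 32768 characters is rejected
long_text = "x" * 40_000

client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content=long_text,
)
# -> raises the validation error quoted above
```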
There’s a limit in the API itself where a single transmission is capped at 32768 characters. That would seem to make direct interactions that use the full “128k” impossible via chat completion. (Edit: the Chat Completions API works fine; this limit comes from “assistants”.)
One would think this is meant to be overcome by server-side resources, such as uploaded files, or by calling on threads (server-side chat history) with only a new user question. But imposing an API limitation and lifting it only for those willing to give up control of their spending seems not just disingenuous but like malicious promotion.
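For what it’s worth, the per-message cap can apparently be sidestepped by splitting long input across several thread messages, since the thread itself holds history server-side. A sketch, assuming the cap applies per message rather than per thread; `add_long_message` and the chunk size are my own illustrative choices:

```python
from openai import OpenAI

client = OpenAI()

def add_long_message(thread_id: str, text: str, chunk_size: int = 32_000) -> None:
    """Split text that exceeds the 32768-character cap across
    several thread messages, each safely under the limit."""
    for start in range(0, len(text), chunk_size):
        client.beta.threads.messages.create(
            thread_id=thread_id,
            role="user",
            content=text[start:start + chunk_size],
        )

thread = client.beta.threads.create()
add_long_message(thread.id, open("big_document.txt").read())
```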
Like rate limits, this could just be one of those “oh, we should have increased that spec too” oversights that will eventually be lifted, though lifting it would need a new API spec.
Yes, everything the AI must know in order to provide a final answer must be placed within the model’s context window.
I wouldn’t call it “counting toward” the context; it is simply part of the input to the AI model, along with instructions, function definitions, chat history, current messages, past results of tools such as the code interpreter, and the undocumented methods used internally to fill the AI context (as full as possible) with uploaded and attached files.
The character limit reported in this topic applies only to messages you place yourself; the Assistants backend has plenty of other internal ways to make sure the model context is filled to the brim.
It is more instructive to work directly with the models, where your own code makes these decisions: how much chat history is needed for “memory”, how external knowledge is added, and how features external to the model are called across internal turns.
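As a contrast, here is a minimal sketch of that direct approach with Chat Completions, where your own code decides how much history to keep (the model name and truncation policy are illustrative):

```python
from openai import OpenAI

client = OpenAI()
history = [{"role": "system", "content": "You are a helpful assistant."}]

def ask(user_text: str, max_turns: int = 20) -> str:
    """Append the user turn, trim old turns yourself, and call the model.
    You, not a backend, decide how much chat to keep as 'memory'."""
    history.append({"role": "user", "content": user_text})
    # Keep the system message plus only the most recent turns
    del history[1:max(1, len(history) - max_turns)]
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",  # illustrative model choice
        messages=history,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask("How long can a single user message be here?"))
```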