How many tokens is the context window in the OpenAI Assistants API?

I can’t find this figure anywhere in the documentation. I’m developing a chatbot where threads can get very long, so if the token window is small, I need a way to ensure that past messages which drop out of the window don’t get “lost”… some way of feeding them back in (maybe generating a summary and seeding a new thread with it? Sounds tedious, though…)

Thanks

If you are actually using the API’s Assistants feature, which is an agent framework, then the conversation is managed for you, with no possibility of outside control.

The context length varies by model, from 4k with gpt-3.5-turbo-0613, to 128k with gpt-4-turbo-0125.

If you use AI models via the chat completions endpoint, then you are the one in control of what conversation history is sent with every new chat turn, and you can use techniques such as having a background AI summarize the oldest part of the chat to continue the illusion of memory a bit longer.
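That summarization technique can be sketched roughly like this (a minimal sketch: `summarize` is a placeholder standing in for a background chat-completions call, and the message dicts just mirror the API’s role/content shape):

```python
# Sketch: keep recent turns verbatim, fold older turns into one summary message.
# `summarize` is a stub for a background call to a cheap summarization model.

def summarize(messages):
    # Placeholder: in practice, send the old messages to the chat completions
    # endpoint with a "summarize this conversation" instruction.
    return "Summary of %d earlier messages." % len(messages)

def compact_history(history, keep_recent=4):
    """Replace all but the last `keep_recent` messages with one summary message."""
    if len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    summary_msg = {"role": "system",
                   "content": "Earlier conversation summary: " + summarize(old)}
    return [summary_msg] + recent

history = [{"role": "user", "content": "msg %d" % i} for i in range(10)]
compacted = compact_history(history)
print(len(compacted))  # 5: one summary message plus the 4 most recent
```

You’d run `compact_history` before each API call (or whenever the history nears your budget), so the model always sees the recent turns plus a condensed stand-in for everything older.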

Is there more information on how this is managed? Is it indeed a sliding window, whereby old messages are simply dropped?

On chat completions, if you send an input too large for the model (counting the output reservation set by max_tokens), you will simply get an error. Therefore you must do your own token counting and decide on your own technique and budget for limiting chat size.
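That budgeting can be sketched like this (a minimal sketch: the ~4 characters per token figure is only a rough heuristic, and a real implementation would count with an actual tokenizer such as tiktoken; the window and reservation numbers are just illustrative):

```python
# Sketch: drop oldest messages until the prompt fits the token budget,
# after reserving room for the reply (the max_tokens value you will send).
# Token counts use a rough ~4 chars/token heuristic; use a real tokenizer
# (e.g. tiktoken) for accurate counts.

CONTEXT_WINDOW = 16385   # example figure; the real limit varies by model
MAX_TOKENS_REPLY = 1024  # output reservation you plan to pass as max_tokens

def estimate_tokens(message):
    # Very rough: ~4 characters per token, minimum 1 per message.
    return max(1, len(message["content"]) // 4)

def trim_to_budget(system_msg, history, budget):
    """Drop the oldest history messages until everything fits in `budget` tokens."""
    kept = list(history)
    while kept and estimate_tokens(system_msg) + sum(map(estimate_tokens, kept)) > budget:
        kept.pop(0)  # oldest message goes first
    return [system_msg] + kept

budget = CONTEXT_WINDOW - MAX_TOKENS_REPLY
```

The system message is always kept; only the oldest chat turns are sacrificed, which is essentially the sliding-window behavior you’d otherwise get implicitly, but now under your control and error-free.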

Assistants simply drops old chat… and makes sure you are spending the maximum on every API call.

Yeah, I’m talking about Assistants. Are there standard-practice ways to ensure old messages aren’t dropped? Reinjecting summaries?

When using Assistants, you delegate all management to OpenAI. You can, and must, keep adding messages to a thread, but when it is actually run, you have no idea what cutoff threshold will be used, and you have no way to edit, alter, or replace messages (or the unseen tool history). OpenAI has purposefully disabled a method that was discovered to at least delete old messages.
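Given those constraints, about the only lever left is the workaround from the original question: periodically summarize the thread yourself and seed a fresh thread with that summary. A rough sketch of the control flow (pure-Python stubs only: `FakeThread`, `summarize_thread`, and the rollover threshold all stand in for real Assistants SDK calls and are assumptions, not the actual API):

```python
# Sketch: when a thread's message count passes a threshold, summarize it
# and start a new thread seeded with that summary. Everything here is a
# stub standing in for real Assistants API calls.

ROLLOVER_AT = 50  # rotate threads after this many messages (arbitrary choice)

class FakeThread:
    """Stub for an Assistants thread: just a list of message dicts."""
    def __init__(self, messages=None):
        self.messages = list(messages or [])

def summarize_thread(thread):
    # Stub: in practice, run a summarization pass over thread.messages
    # (e.g. via the chat completions endpoint).
    return "Summary of %d messages." % len(thread.messages)

def add_user_message(thread, content):
    """Add a message, rolling over to a summary-seeded thread when too long."""
    if len(thread.messages) >= ROLLOVER_AT:
        seed = {"role": "user",
                "content": "Context from earlier conversation: "
                           + summarize_thread(thread)}
        thread = FakeThread([seed])  # new thread seeded with the summary
    thread.messages.append({"role": "user", "content": content})
    return thread
```

The caller always uses whatever thread `add_user_message` returns, so rollover is transparent; the old thread can be archived or deleted once the summary is in place.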

Reason #160 why Assistants is unsuitable for anyone who can code (that is, if Assistants didn’t already need a whole bunch more code anyway).