How does the Assistants API (ChatGPT system) handle long context (aggregation of prompts & responses) in a Thread?

Hi there! I’m Harsh & I’m training Small Language Models on my own for hands-on, experiential learning.

I was going through the Assistants API & saw a new approach to preserving context (aggregating prompts & responses) & enhancing the model output via an additional ‘system’ message (carrying extra system information/instructions).
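For reference, here’s a minimal sketch of the flow I mean, assuming the `openai` Python SDK (v1.x); the model name and instructions are just placeholders:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The 'system'-style instructions live on the assistant itself
assistant = client.beta.assistants.create(
    model="gpt-4o",  # placeholder model name
    instructions="You are a helpful tutor.",  # the extra system description
)

# A thread accumulates the prompt/response history for me
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="First question..."
)

# Each run sends the aggregated thread (plus instructions) to the model
run = client.beta.threads.runs.create(
    thread_id=thread.id, assistant_id=assistant.id
)
```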

Here’s a situation to explain my question better.

  • Imagine the LLM powering ChatGPT has a context window of 100 tokens, and consider the situation below.

    • The user sends a prompt of 80 tokens & gets a response of 30 tokens.

    • The user follows up with a prompt of 50 tokens.

    • I’m assuming that when the user follows up with those 50 tokens, a master prompt containing the history (prompt_1, response_1 & prompt_2) is sent to the model (see the sketch after this list).
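In other words, I picture the follow-up turn producing an aggregated message list like this (role names follow the Chat Completions convention; the token counts are from my toy scenario):

```python
# Hypothetical aggregated "master prompt" for the follow-up turn
master_context = [
    {"role": "user", "content": "<prompt_1>"},        # 80 tokens
    {"role": "assistant", "content": "<response_1>"}, # 30 tokens
    {"role": "user", "content": "<prompt_2>"},        # 50 tokens
]
total_tokens = 80 + 30 + 50  # 160 tokens, against a 100-token window
```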

It can clearly be seen that the context window has been exceeded (because the previous prompt & response pairs are aggregated): 80 + 30 + 50 = 160 tokens against a 100-token limit.

So what’s happening at the backend? Does the master_context get truncated?
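If truncation is what happens, I’d imagine something like this naive drop-oldest strategy (purely my guess at one possible approach, not anything confirmed about OpenAI’s backend; the token counts are faked for the demo):

```python
messages = [
    {"role": "user", "content": "<prompt_1>"},        # 80 tokens
    {"role": "assistant", "content": "<response_1>"}, # 30 tokens
    {"role": "user", "content": "<prompt_2>"},        # 50 tokens
]

def truncate_to_window(msgs, max_tokens, count_tokens):
    """Drop the oldest turns until the remainder fits the window.

    Purely a guess at one possible backend strategy; count_tokens is
    a hypothetical tokenizer callback (e.g. built on tiktoken).
    """
    kept = list(msgs)
    while kept and sum(count_tokens(m["content"]) for m in kept) > max_tokens:
        kept.pop(0)  # discard the oldest turn first
    return kept

# Toy demo with the numbers from my scenario, faking the token counts:
fake_counts = {"<prompt_1>": 80, "<response_1>": 30, "<prompt_2>": 50}
trimmed = truncate_to_window(messages, 100, fake_counts.get)
# -> prompt_1 dropped; response_1 + prompt_2 = 30 + 50 = 80 tokens, fits.
```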

Looking forward to your insights.
Cheers!