Number of words/tokens to be passed on input prompt

kamleshjha · September 11, 2023, 7:03am

Can you please suggest how I can input 500,000 words at once in a ChatGPT prompt?

Also, the overall input + output + context size is going to be about 1 million words. Is it possible to process such a huge number of words/tokens in one attempt?

_j · September 11, 2023, 7:30am

Why Extremely Large Documents Can’t Be Processed: Chatbots like ChatGPT, and the AI models behind them like GPT-4, have limits on the length of text they can process.

(Don’t worry: I essentially had to teach ChatGPT how to answer about the absolute limit of a language model’s internal memory, so it didn’t save me any time…)

The “context window length” in models like GPT-4 refers to the maximum number of byte-level BPE (BBPE) tokens that the model can consider for input and output combined. Here’s how this works:

Input Text: When you provide input text to the model, it occupies a portion of the context window. The model reads and processes this input text, and it can only consider a certain number of tokens from it, up to the context length limit of 8192 BBPE tokens.
Output Generation: The remaining portion of the context window is reserved for generating the model’s response or output. This means that the model not only uses part of the context window for understanding your input but also needs space to formulate its response. The available space for generating the response is determined by the remaining tokens within the context length limit.

In essence, the context length is a shared resource between input and output. Both the input text you provide and the response the model generates must fit within this context window. If the input text is very long, it leaves less space for the model to generate a response, potentially affecting the quality and comprehensibility of the answer.
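To make the shared budget concrete, here is a rough sketch of the arithmetic for the 500,000-word case. The 8192 limit is the figure quoted above; the 4/3 tokens-per-word ratio and the 1000 tokens reserved for the reply are assumptions for illustration only, and the function names are made up for this example. Real counts depend on the tokenizer and the text.

```python
import math

CONTEXT_LIMIT = 8192  # context length quoted above for GPT-4


def estimate_tokens(word_count: int) -> int:
    """Rough estimate only: assumes about 4 tokens per 3 English words."""
    return math.ceil(word_count * 4 / 3)


def remaining_for_output(input_tokens: int, context_limit: int = CONTEXT_LIMIT) -> int:
    """Tokens left for the reply; a negative value means the input alone does not fit."""
    return context_limit - input_tokens


def windows_needed(input_tokens: int, reserved_output: int = 1000,
                   context_limit: int = CONTEXT_LIMIT) -> int:
    """Number of separate requests needed if each one reserves room for a reply."""
    per_request = context_limit - reserved_output
    return math.ceil(input_tokens / per_request)


tokens = estimate_tokens(500_000)
print(tokens)                        # estimated tokens for 500,000 words
print(remaining_for_output(tokens))  # negative: far over the limit
print(windows_needed(tokens))        # requests needed if the text is split up
```

Under these assumptions the input alone is dozens of times larger than the window, which is why it has to be split or filtered rather than sent in one prompt.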

Therefore, when using models like GPT-4, it’s essential to be mindful of the context length limit. If you have a lengthy input text, you may need to shorten it to ensure there is enough space for the model to generate a coherent and meaningful response within the context window. Managing this balance effectively is crucial for obtaining optimal results in your interactions with the AI model.

udm17 · September 11, 2023, 7:30am

This won’t be possible in one go. The context length for a model is basically the total number of input + output tokens it can handle at one time.

For a text this large, an approach such as embedding the text and passing only the relevant passages, found by semantic matching, may work.
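A minimal sketch of that idea follows. It is not a real embedding setup: the `embed` function here is a toy stand-in that just counts words, where an actual system would call an embedding model, and all function names are hypothetical. The flow is the same, though: split the text into chunks, score each chunk against the question, and send only the best matches in the prompt.

```python
import math
import re
from collections import Counter


def chunk_words(text, chunk_size=50):
    """Split text into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "The invoice total is due in thirty days.",
    "Cats sleep most of the day.",
    "Payment of the invoice can be made by bank transfer.",
]
for chunk in top_chunks("When is the invoice payment due?", chunks):
    print(chunk)
```

Only the selected chunks, plus the question, then need to fit inside the context window.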
