This model's maximum context length is 8191 tokens, even when using gpt-3.5-turbo-16k

from langchain import OpenAI
from llama_index import LLMPredictor, GPTVectorStoreIndex, PromptHelper

llm_predictor = LLMPredictor(llm=OpenAI(model="gpt-3.5-turbo-16k", max_retries=3))
prompt_helper = PromptHelper()
custom_LLM_index = GPTVectorStoreIndex(
    documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper
)

Error
This model’s maximum context length is 8191 tokens, however you requested 16296 tokens (16296 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.

Even though the maximum context size for the 16k model is 16,385 tokens.

How am I supposed to find a workaround for this?

Thanks

If you are getting an error message saying 8192, can you please post the code used and a log of the error returned, as that should not be happening.

The 16k model has a 16k context. For general use, treat this number as roughly 16,000, as the system uses some tokens for internal purposes; if you absolutely need more, let's say 16,100 tokens, then you need to experiment and ensure your application works across use cases.

That being said, the 16k limit covers the prompt and the reply combined, i.e. a prompt of 2k plus a response of 16k would be 18k and would cause an error. In the case of a 2k prompt you should set the reply token limit to 14k at most.
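As a rough sketch of that budgeting (the numbers are purely illustrative; max_tokens is the langchain parameter that caps the completion length):

from langchain import OpenAI

# Illustrative: with a ~2k-token prompt, cap the reply at 14k so that
# prompt + completion stays inside the 16k context window.
llm = OpenAI(model="gpt-3.5-turbo-16k", max_retries=3, max_tokens=14000)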

You can make use of the tiktoken library to calculate token numbers if you need to be precise.
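For example, a minimal count might look like this (the prompt string is just a placeholder; gpt-3.5-turbo models use the cl100k_base encoding):

import tiktoken

# gpt-3.5-turbo(-16k) uses the cl100k_base encoding
encoding = tiktoken.get_encoding("cl100k_base")

prompt = "Question Related to CSV"
prompt_tokens = len(encoding.encode(prompt))

# whatever is left of the ~16k context can go to the reply
max_reply_tokens = 16000 - prompt_tokens
print(prompt_tokens, max_reply_tokens)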

import os
from pathlib import Path

from langchain import OpenAI
from llama_index import download_loader


os.environ['OPENAI_API_KEY'] = "..."


# load the CSV into llama_index documents
PandasCSVReader = download_loader("PandasCSVReader")
loader = PandasCSVReader()
documents = loader.load_data(file=Path('trial.csv'))

from llama_index import LLMPredictor, GPTVectorStoreIndex, PromptHelper

# define LLM
llm_predictor = LLMPredictor(llm=OpenAI(model="gpt-3.5-turbo-16k", max_retries=3))

prompt_helper = PromptHelper()

custom_LLM_index = GPTVectorStoreIndex(
    documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper
)

query_engine = custom_LLM_index.as_query_engine()

response = query_engine.query("Question Related to CSV")

print(response)

Yeah, I basically need to query this CSV file daily (it updates with new data), so 8k can't handle that amount of data; even 16k wouldn't be enough, but it's still a bit better.

Ok, well, you need to leave room for the reply and the data you send to it, so you will have to truncate any input down such that you leave enough room for the response.
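As a minimal sketch of that truncation, assuming you do it with tiktoken (truncate_to_tokens is a hypothetical helper here, not part of langchain or llama_index):

import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

def truncate_to_tokens(text, max_tokens):
    # keep only the first max_tokens tokens of the text
    tokens = encoding.encode(text)
    return encoding.decode(tokens[:max_tokens])

# e.g. keep ~14k tokens of input so ~2k are left for the response
truncated = truncate_to_tokens(open('trial.csv').read(), 14000)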

Yeah, I was going to try that as well, but I still don't know why it has been giving me the 8k maximum context size error.

I think that is a bug in the error message text, i.e. it got copy-pasted from an 8k model and just not updated yet. Myself and others have done extensive testing on the 16k context and it is absolutely 16k :smiley: OpenAI’s offerings are always looked over with a fine-toothed comb.
