I’ve divided a text into 25,000-word segments using Python, since 25,000 words is the capacity GPT-4 is advertised to handle. However, I receive an error message: “The message you submitted was too long, please reload the conversation and submit something shorter.”
I executed the Python script again to break the text into 20,000-word segments. Yet, GPT-4 produced the same error message. I then ran the Python script once more to split the text into 15,000-word segments - still receiving the same error!
Could someone kindly assist me in resolving this issue or offer some insight into why GPT-4 is unable to process the advertised 25,000-word limit?
I don’t doubt the claim itself. I think the version we currently have simply doesn’t have the higher limit yet, just as it doesn’t have image/video input yet. This is just the text version of the new model, not the complete version.
If you’ll be working with that much text anyway, it’s better to use OpenAI through MS Azure Services, or OpenAI’s paid API (when it becomes available to you)… As an aside: all these questions from freshly created accounts opening with “Dear GPT Community”… only ChatGPT writes like that, sorry, but it’s true… same with writing “20,000” instead of “20k”…
Have they rolled out the 32k token model yet in the API? Anything seen in the wild on this?
At the announcement, they said they would roll out the 8k token version first, then the 32k version later.
Also, you have to pick the version in the API call; it doesn’t auto-scale for you if you have more than 8k tokens. So if the OP or others are claiming you can’t get 25k words in, first show me the API call you used, in particular the model name.
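To illustrate, the model is selected explicitly per request. A minimal sketch of the request payload, assuming the 32k model is exposed under the name “gpt-4-32k” (the exact names available depend on what your account has been granted):

```python
# Sketch of a chat-completions request body. The model name must be
# chosen explicitly; requests are not auto-routed to a larger context
# window when the prompt exceeds 8k tokens.
payload = {
    "model": "gpt-4-32k",  # assumed name; "gpt-4" is the 8k version
    "messages": [
        {"role": "user", "content": "Summarize the following text: ..."},
    ],
}
```

You would send this payload via the OpenAI client library or a plain HTTPS POST; if your account only has 8k access, a “gpt-4-32k” request is simply rejected rather than truncated.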
And since it uses tokens, not words, you need to count the input and output tokens with the tiktoken library, using the byte-pair encoding that matches the specific model.
If you aren’t using the API, but using GPT-4 (through Playground), I believe you are only using the 8k token model.
Remember the rough rule of thumb: X tokens is about 0.75·X English words. This is only approximate, though, which is why tiktoken is used to get the exact token count.
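As a sketch of that rule of thumb (the 0.75 factor is a rough average for English, not a guarantee; tiktoken gives exact counts for a given model):

```python
# Rough word-count to token-count estimate using the 0.75 rule of thumb.
# For exact counts, use tiktoken instead, e.g.:
#   enc = tiktoken.encoding_for_model("gpt-4"); len(enc.encode(text))
import math

WORDS_PER_TOKEN = 0.75  # rough average for English text

def estimated_tokens(word_count: int) -> int:
    """Estimate how many tokens a given number of English words uses."""
    return math.ceil(word_count / WORDS_PER_TOKEN)

# The segment sizes tried in this thread, against the context windows:
for words in (25_000, 20_000, 15_000, 5_500):
    print(words, "words ~", estimated_tokens(words), "tokens")
```

By this estimate, even 5,500 words is roughly 7,334 tokens, which only fits an 8,192-token window if the reply and any system text stay small, and 25,000 words overshoots even a 32k window.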
this is from the waitlist page… are you sure you already have access to the 32k model?
note, in particular, the sentence about the 8K and 32K engines:

“During the gradual rollout of GPT-4, we’re prioritizing API access to developers that contribute exceptional model evaluations to [OpenAI Evals] to learn how we can improve the model for everyone. We are processing requests for the 8K and 32K engines at different rates based on capacity, so you may receive access to them at different times. Researchers studying the societal impact of AI or AI alignment issues can also apply for subsidized access via our [Researcher Access Program].”
you can ask ChatGPT
As an AI language model, I can’t access the web in real time. My knowledge is based on the information available up until September 2021, so I don’t have the specific details regarding the current GPT-4 API waitlist or model access. However, OpenAI tends to provide different tiers of access depending on the user’s needs. Some users might have access to the 32k model, while others might not. Access to models can be determined by factors such as subscription plans, usage limits, and user agreements.
For the most up-to-date information on GPT-4 API access and the available models, please visit OpenAI’s website or contact their support team.
I’ve just adjusted the word limit to a mere 5,500 words to align with the 8K token model, following the adage “X tokens is about 0.75*X English words” kindly shared by @curt.kennedy. However, I’m still encountering the same pesky error message.
Currently, I’m using GPT-4 via my ChatGPT Plus subscription while I wait on the API waitlist. Would anybody happen to have any further suggestions to attempt? If not, I suppose I shall keep reducing the word limit until the prompt is successfully processed.
it seems the limit there is 4k… You wrote about using Python, so I assumed you were actually using the ChatGPT API… are you using Python only to split the text? Try with less than 4k… and then the next question is what happens when you send the following message, because the conversation will resend the first one along with it…
Now, this is interesting, because it shows how UX can solve all of this (I’m self-promoting here, but I won’t mention any names). The forum has a channel called “ChatGPT”, but there is no distinction between the app and the API, so all the messages land in the same channel. Also, if you receive a summary of the forum, it jumps to the specific message, and the channel name is set in a really small font (you can read it, but it clearly wasn’t done by a designer).
Now, the app could easily “know”, without using any AI, just with a simple len(text) and an assumed average number of characters per word, that a 30k-word prompt has no chance of being processed, and tell you so up front (or use the JS token counter to calculate it…).
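A minimal sketch of that pre-check, where the average word length and the word limit are both assumed values, not anything the app actually uses:

```python
# Cheap client-side check before sending a prompt: no AI involved,
# just string length and an assumed average characters per word.
AVG_CHARS_PER_WORD = 6   # assumed average, including the trailing space
MAX_WORDS = 5_500        # assumed word limit for an ~8k-token context

def prompt_probably_too_long(text: str) -> bool:
    """Return True if the prompt almost certainly exceeds the limit."""
    approx_words = len(text) / AVG_CHARS_PER_WORD
    return approx_words > MAX_WORDS

sample = "word " * 30_000  # a ~30k-word prompt
print(prompt_probably_too_long(sample))  # → True
```

The point is only to reject obvious cases instantly in the UI; a borderline prompt would still need a real token count (e.g. tiktoken) to decide.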
That is what I see everywhere in the hundreds of things released in the last few days. GPT-4 is amazing; testing it, it’s way better than 3.5 (more expensive, at least at the moment)… but it will be used by humans, so you need good UX, even if all you do is talk/write/draw what you want it to do.
There are other posts about this: it is the gpt-4 model, but it doesn’t know that about itself. Ask it a very complex question and it will answer it. You can also see in the browser’s network tab that it is sending model “gpt-4” (or something like that).
If you’re using Python, there is a Python library that counts tokens for you. I can’t remember what it’s called, but I was playing with it over the weekend; I had set up a Python server to count tokens from my Node app. I’m not on the machine I was playing on, though, so I can’t remember the name. You could find it.