How to use Assistants with 128k?

I have tier 4 usage, so my API key has access to the full 128k token limit, but I can't seem to access that with my assistant bot: I keep getting the error "ensure this value has at most 32768 characters", and I have ensured it's using gpt-4-1106-preview in the Assistant's model selection. Is there a hard cap for Assistants at the moment, or is there something else I have to do?
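For what it's worth, an error phrased as "at most 32768 characters" is usually a per-field *character* limit (e.g. on the assistant's instructions), not the model's 128k *token* context window — characters and tokens are different units. A minimal sketch of a client-side guard, assuming the limit applies to a text field you control (the field name and the SDK call in the comment are assumptions, not confirmed from this thread):

```python
# Hypothetical guard: check a text field against the 32,768-character
# limit reported in the error message before sending it to the API.

MAX_FIELD_CHARS = 32768  # per-field character cap from the error text

def fits_field_limit(text: str, limit: int = MAX_FIELD_CHARS) -> bool:
    """Return True if `text` fits within the per-field character limit."""
    return len(text) <= limit

# A 40,000-character prompt would be rejected before ever hitting the API:
long_instructions = "x" * 40_000
print(fits_field_limit(long_instructions))  # False: trim it, or move data to files

# If it fits, create the assistant as usual (requires the `openai` package
# and an API key; shown here only as a sketch, not executed):
# from openai import OpenAI
# client = OpenAI()
# assistant = client.beta.assistants.create(
#     model="gpt-4-1106-preview",
#     instructions=long_instructions[:MAX_FIELD_CHARS],
# )
```

The practical takeaway: the 128k context applies to tokens in the conversation, while individual request fields can have their own, much smaller character caps.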


Did you end up figuring this out? I’m running into the same thing

Same question. Need 128k tokens.

With Assistants, your data is chunked into smaller pieces that keep related content together, and relevant chunks are picked up as references automatically. In most cases this works better than loading everything into the context window: the longer the context becomes, the less reliable the output tends to be.
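To illustrate the kind of chunking described above, here is a rough sketch of splitting a document into overlapping chunks so related passages stay together. The chunk size and overlap are made-up values for illustration, not OpenAI's actual retrieval settings:

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks so neighboring content is shared."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last chunk already reaches the end of the text
    return chunks

doc = "A" * 2000
parts = chunk_text(doc)
print(len(parts))  # 3 chunks: [0:800], [600:1400], [1200:2000]
```

Each chunk would then be embedded and retrieved individually, so only the relevant slices of a large document need to enter the context window.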

Did you work this out, I’m having the same issue?

Sorry for the dumb question, but what is "tier 4 usage"?

The tier level refers to the number of tokens per minute and requests per minute/day. Tier 4 means the user has already paid a good amount for credits.

You can learn more here:

Yes, 128k is available for me now via API.

Thanks, makes sense.

But an option to simply pay to jump up tiers might be useful. This is on the assumption that a higher tier increases the speed of the model's responses; at least that way I would get a sense of the difference in speed between tiers.

That approach may be silly, as the cost could be very high; I guess what I ideally want is to see what the best available performance looks like and what it might cost.

Maybe you can find some experience reports from users with Enterprise-level accounts and learn whether paying a six-figure amount actually gets faster response times.

Otherwise, don’t expect anything to change, for now.

Ouch! But thanks :slight_smile:
Everything OK on your end?

It's ultimately just a few days of waiting time to move up a tier level. If you have the demand, I suggest going for it.
Alternatively, you can also check out Microsoft’s Azure offering.

I believe many of us are waiting and hoping for OpenAI to deploy tons of GPUs or specialized new processor generations for faster and more stable inference times.
Overall, I consider the completion endpoints to be OK, given the current situation.

Appreciate your feedback. I certainly share that hope, but yes, I will look at the M$ offering. Just for info, I'm using the Assistants API.