When I try to use the Assistants API, I keep getting this error:
"last_error": {
  "code": "rate_limit_exceeded",
  "message": "You exceeded your current quota, please check your plan and billing details."
},
But it works fine when I use the chat API.
Why?
rate_limit_exceeded → the assistant is taking up too much context length, looping, and burning tokens until it maxes out your account’s tokens-per-minute limit.
The rate limit saved your account from being emptied by this thing.
Not for experimentation. No success stories. Only $5 questions and $200+ an hour. Do not use.
What tier level is your account? And what are you attempting to do with the API?
The Assistants API is being used from another account; I just want to test the new features.
Not sure I follow. What do you mean by “another account”? Do you own both accounts?
Yes. When I use the Assistants API with another account, it tells me “You exceeded your current quota, please check your plan and billing details.” But the chat API works normally.
Are you saying the Assistants API is flawed because it is too expensive and can too easily run away with itself causing too much cost?
So in fact, I did not understand the meaning of his answer.
Are you using multiple API keys within the same application?
Basically yes. Unlike ChatGPT, where the conversation is clipped to the point that people complain it doesn’t remember anything, assistants will keep growing the conversation up to the model’s maximum context length as you continue to chat: 128k tokens.
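To make the "clipped conversation" idea concrete, here is a minimal sketch of trimming chat history to a token budget before each call when you manage the conversation yourself. The names estimate_tokens and clip_history are made up for illustration, and the 4-characters-per-token estimate is only a rough assumption; a real tokenizer would be more accurate.

```python
def estimate_tokens(text: str) -> int:
    # Rough assumption: about 4 characters per token.
    return max(1, len(text) // 4)


def clip_history(messages, max_tokens):
    """Keep the most recent messages whose estimated total fits in max_tokens."""
    kept = []
    used = 0
    # Walk backwards so the newest messages are kept first.
    for msg in reversed(messages):
        cost = estimate_tokens(msg["content"])
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

With a budget like this, the cost of each call stays bounded no matter how long the chat runs, at the price of the model forgetting older turns.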
Also, when running your own vector database, for example with 1MB of your company’s tech support knowledge base and product offerings, you might have a threshold where only the top 5 chunks are fed to the AI, and only if they meet a semantic similarity threshold. Not the case with assistants - if you ask “how’s your day going”, the AI gets maximum retrieval placed into the context window.
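As a rough sketch of the "top 5 chunks above a similarity threshold" approach when you run retrieval yourself: the function names, the 0.75 threshold, and the tiny two-dimensional vectors below are illustrative assumptions, not anything the Assistants API exposes. Real embeddings would come from an embedding model.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_chunks(query_vec, chunks, top_k=5, min_similarity=0.75):
    """Return up to top_k chunk texts whose similarity meets the threshold.

    chunks is a list of (text, embedding_vector) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    # Drop anything below the threshold, so small talk retrieves nothing.
    scored = [item for item in scored if item[0] >= min_similarity]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [text for _, text in scored[:top_k]]


# Toy knowledge base with hand-made vectors (assumed for illustration).
chunks = [
    ("refund policy", [1.0, 0.0]),
    ("shipping times", [0.9, 0.1]),
    ("unrelated note", [0.0, 1.0]),
]
relevant = select_chunks([1.0, 0.0], chunks)
```

A query that matches nothing above the threshold yields an empty list, so no knowledge-base tokens are added to the context for it.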
Those are prices and anecdotes taken right from the forum: the AI looping until it hits your API rate limit and you get no answer, or the AI looping and calling your API over and over with the same query.
Until they offer transparency about billing and real-time per-call token usage, and allow controls over data and iterations similar to what a reasonable person may program themselves, I would have to say “program yourself”.
No, I’ve only used that one key in the same application
Thanks, useful info.
You nailed it - my “home grown” bot currently picks a limited set of the semantic results, and that limits the context/cost impact. It also has a failsafe so it doesn’t loop more than a certain number of times. I’m surprised there isn’t this safeguard?! That’s a showstopper for Production adoption of the Assistants API, surely?!
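For anyone wondering what such a loop failsafe might look like, here is a minimal sketch. The names run_with_failsafe and IterationLimitExceeded, and the default cap of 5, are hypothetical; step stands in for whatever your bot does on each pass (a model call, a tool call, and so on) and returns None while it has no final answer yet.

```python
class IterationLimitExceeded(Exception):
    """Raised when the loop gives up instead of burning more tokens."""


def run_with_failsafe(step, max_iterations=5):
    """Call step(attempt) until it returns a non-None answer.

    Stops and raises after max_iterations attempts, so a confused
    model cannot loop indefinitely.
    """
    for attempt in range(1, max_iterations + 1):
        result = step(attempt)
        if result is not None:
            return result
    raise IterationLimitExceeded(
        f"no answer after {max_iterations} iterations"
    )
```

Raising an error rather than returning silently makes the runaway case visible in logs, which is the kind of control the thread is asking for.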
Sorry to take the Topic off on a tangent … but that’s critical information
Clearly, Assistants API service needs more work and more thought applied to it - but I guess that’s the point of the “preview” phase …
Ok, I see. So how are you accessing another account’s assistants?
Fair enough (but only for two more days?!).
But this is basic stuff. Come on, OpenAI!
I log in to the other account to perform the operation. I’m not sure that’s what you meant to ask.
So we need to limit the search before we do it? What should I do?
Ok, there is no logging in with the API; you specify an API key, and that key is your credentials. Are you making API calls, or using GPTs, or… Can you post a code snippet of your API calls, please?
Wait until OpenAI fixes this and improves the approach/algorithm before pushing this live.
I followed the API documentation, and the code is pretty much the same.