Assistant Started Hitting TPM Limit With No Changes to Implementation

v_for_vasquez · October 31, 2024, 3:51pm

Hi all! I’ve been working on a RAG app that takes a ticket, then reads through some documents to find the proper billing code. It has been working well, but yesterday I stopped getting responses on one of my test accounts. I looked up the thread and found that the run was now failing with this error:
"Run failed

Rate limit reached for gpt-4-turbo-preview in organization org-[org-id] on tokens per min (TPM): Limit 30000, Used 4811, Requested 28049. Please try again in 5.72s. Visit https://platform.openai.com/account/rate-limits to learn more."
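
For what it's worth, the numbers in that message can be checked by hand. Used plus Requested is 4811 + 28049 = 32860, which is 2860 over the 30000 limit, and the single request accounts for 28049 of that. Here is a small sketch that reproduces the 5.72s wait, assuming the budget refills evenly at limit/60 tokens per second. That refill model is my assumption, and `retry_delay_seconds` is a made-up helper name, not something from the API.

```python
def retry_delay_seconds(limit_tpm: int, used: int, requested: int) -> float:
    """Seconds to wait until the request fits, assuming an even per-second refill."""
    overflow = used + requested - limit_tpm  # tokens beyond the per-minute budget
    if overflow <= 0:
        return 0.0  # the request already fits
    refill_per_second = limit_tpm / 60  # assumption: budget frees up linearly
    return overflow / refill_per_second


# Values copied from the error message above.
delay = retry_delay_seconds(limit_tpm=30000, used=4811, requested=28049)
print(f"{delay:.2f}s")  # prints "5.72s"
```

So one run is asking for nearly the whole per-minute budget on its own, which would fit with only the longer threads breaking.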

I haven’t changed anything in my implementation, and it is only broken on a few threads, so I am confused about why this is happening.

For clarity, the prompt for the test is: “Explain [billing code]”. If I just ask something that doesn’t require a lookup, e.g. “Hello!”, it seems to work as expected.

Has anyone else run into this? Do I need to stop reusing threads for a user, or stop using them for so long? I thought the docs said threads were auto-trimmed, but maybe I’m mistaken.

Thanks for any help or clarification!

v_for_vasquez · October 31, 2024, 3:53pm

Additional note

My implementation is:

1. User clicks a code button.
2. App automatically creates a message of “Explain the [code number/name] code”.
3. I append the message to the existing thread.
4. I run the thread.
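
In case it helps anyone reproduce this, steps 2 to 4 look roughly like the sketch below. It assumes the official `openai` Python SDK, with `client` being an `OpenAI()` instance. The `explain_code` name and the default values are made up for illustration. `max_prompt_tokens` and `truncation_strategy` are optional parameters on run creation in the Assistants API; I have not confirmed that setting them resolves the error above.

```python
from typing import Any


def explain_code(
    client: Any,  # expected to be an openai.OpenAI() instance
    thread_id: str,
    assistant_id: str,
    code: str,
    max_prompt_tokens: int = 20000,  # illustrative cap, below the 30000 TPM limit
    last_messages: int = 10,  # illustrative: only send the most recent messages
) -> Any:
    """Append the auto-generated question to the thread, then start a capped run."""
    # Steps 2 and 3: create the message and append it to the existing thread.
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=f"Explain the {code} code",
    )
    # Step 4: run the thread, limiting how much history goes into the prompt.
    return client.beta.threads.runs.create(
        thread_id=thread_id,
        assistant_id=assistant_id,
        max_prompt_tokens=max_prompt_tokens,
        truncation_strategy={"type": "last_messages", "last_messages": last_messages},
    )
```

A call would look like `explain_code(client, thread_id, assistant_id, "[code number/name]")`. Leaving out the two optional arguments to `runs.create` gives the flow exactly as I described it, with whatever default trimming the API applies.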
