Tokens are depleted after only two or three requests

We use GPT-4 at Tier 5 to respond to user inquiries within our application. However, after processing two to three requests, our tokens are depleted and we have to restart the bot. We are looking for a solution to this issue to ensure more stable performance.

Welcome to the community @nagima.semchukova

These are the tier-5 rate limits per the docs:

| MODEL | RPM    | TPM     |
| ----- | ------ | ------- |
| gpt-4 | 10,000 | 300,000 |

The context length of gpt-4 is 8k. Even if you were making 30 requests/min, each consuming the full context (i.e., 10x your reported load), you'd still have 60k of your TPM left.
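A quick back-of-the-envelope sketch of that arithmetic (the figures are the ones quoted in this thread, with "8k" taken as 8,000 tokens):

```python
# Tier-5 TPM headroom check, using the numbers from this thread.
TPM_LIMIT = 300_000        # gpt-4 tokens-per-minute limit at tier 5 (per the docs)
CONTEXT_LENGTH = 8_000     # gpt-4 context window, "8k"
REQUESTS_PER_MIN = 30      # hypothetical worst case: ~10x the reported load

tokens_used = REQUESTS_PER_MIN * CONTEXT_LENGTH   # 240,000
headroom = TPM_LIMIT - tokens_used                # 60,000
print(f"tokens used: {tokens_used:,}; headroom: {headroom:,}")
```

So even at that exaggerated rate, normal usage shouldn't come anywhere near the limit.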

Thus, I’d recommend checking your usage costs and activity.

See if they align with your usage patterns. If they don’t, immediately create a new API key and revoke all previous API keys.
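One way to cross-check the dashboard numbers against your actual traffic is to log the `usage` object that each chat completions response includes. A minimal sketch (the `sample` response body below is a hypothetical example, not real data):

```python
# Audit per-request token consumption from the `usage` object
# included in each chat completions response body.
def log_usage(response: dict) -> int:
    """Print and return the total tokens consumed by one request."""
    usage = response.get("usage", {})
    total = usage.get("total_tokens", 0)
    print(f"prompt={usage.get('prompt_tokens', 0)} "
          f"completion={usage.get('completion_tokens', 0)} "
          f"total={total}")
    return total

# Hypothetical response body for illustration:
sample = {"usage": {"prompt_tokens": 120, "completion_tokens": 80,
                    "total_tokens": 200}}
log_usage(sample)
```

Summing these per-request totals over a day and comparing them to the activity page will quickly show whether someone else is burning your quota.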


Thank you very much for such a prompt response! I’ll try to check our usage costs and activity and come back here with the information.