That gave me the impression that you could be naively sending huge texts directly to the embeddings endpoint. The endpoint estimates the token count of a request and denies any single request over the rate limit before the tokens are actually counted, or accepted or denied, by the AI model.
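To make that concrete, here is a minimal sketch of what such a pre-flight check might look like. The `preflight_ok` helper, the 150,000 cap, and the 3-characters-per-token ratio are assumptions from this discussion, not a documented server-side formula:

```python
# Hypothetical mirror of the endpoint's pre-flight check: estimate
# tokens from raw character count and reject an oversized single
# request before it ever reaches the model.
TPM_LIMIT = 150_000       # assumed tokens-per-minute cap
CHARS_PER_TOKEN = 3       # assumed rough ratio, not a real tokenizer

def preflight_ok(text: str) -> bool:
    """Return True if the estimated token count fits under the limit."""
    estimated_tokens = len(text) // CHARS_PER_TOKEN + 1
    return estimated_tokens <= TPM_LIMIT
```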
The first odd thing is that "limit 150,000" on embeddings. My advice:
You could indeed be hitting that limit if you let your software batch an entire document into a single request.
And because the rate limiter doesn't rely on exact token counting, you don't need to be elaborate and count real tokens either.
Just put in your own character-based rate limit: hold back chunks until the next minute whenever a formula like 3 characters ≈ 1 token says you are approaching the cap. It's also possible that the string of the returned vector is being counted against the limit.
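Here is a minimal sketch of that client-side limiter in Python, assuming the 150,000 tokens-per-minute cap and the 3-characters-per-token estimate from above. The class and function names are mine; if the returned vector string really does count against the limit, you could simply pad the per-chunk estimate to compensate:

```python
import time

TPM_LIMIT = 150_000       # assumed tokens-per-minute cap
CHARS_PER_TOKEN = 3       # assumed rough ratio: 3 characters ~= 1 token

def estimate_tokens(text: str) -> int:
    """Character-based token estimate; deliberately not a real tokenizer."""
    return len(text) // CHARS_PER_TOKEN + 1

class CharBasedRateLimiter:
    """Holds back chunks until the next minute once the estimated
    token budget for the current one-minute window is spent."""

    def __init__(self, tpm_limit: int = TPM_LIMIT):
        self.tpm_limit = tpm_limit
        self.window_start = time.monotonic()
        self.tokens_used = 0

    def wait_for_budget(self, text: str) -> None:
        needed = estimate_tokens(text)
        elapsed = time.monotonic() - self.window_start
        if elapsed >= 60:
            # A full minute has passed; start a fresh budget window.
            self.window_start = time.monotonic()
            self.tokens_used = 0
        elif self.tokens_used + needed > self.tpm_limit:
            # Approaching the cap: sleep out the rest of the minute.
            time.sleep(60 - elapsed)
            self.window_start = time.monotonic()
            self.tokens_used = 0
        self.tokens_used += needed

if __name__ == "__main__":
    limiter = CharBasedRateLimiter()
    chunks = ["some chunk of document text ..."] * 10  # stand-in chunks
    for chunk in chunks:
        limiter.wait_for_budget(chunk)
        # call the embeddings endpoint with `chunk` here
```

Since the server appears to screen by estimate rather than by real token count, an estimate on your side is all the precision this needs; the only design choice is how conservative to make the ratio.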