Hi everyone,
I’m building a SaaS product and currently working on my MVP. In my tests, everything works fine under normal or light usage — for example, simulating many users at the same time with low activity runs smoothly.
However, when I simulate real-world heavy usage (the kind I expect as soon as I get a serious client), I immediately hit a problem: rate limit errors (429).
One of the most API-intensive features in my SaaS is analyzing the text content of web pages. This morning, I triggered a 429 rate limit after launching a bulk job of 200 pages (requests sent one by one, not in parallel).
I’d love to know:
-
How do you manage this when scaling your product?
-
For those who’ve already scaled successfully, what strategies worked for you?
-
Are there recommended architectural patterns or quota management techniques for these kinds of workloads?
Thanks in advance for sharing your experience!