How to Build API Rate Limiting in Your SaaS

Without rate limiting, a single aggressive user can consume resources meant for all your customers or, worse, run up your AI API bill by thousands of dollars. Rate limiting is not optional for a production SaaS. Here's how to implement it correctly.

What to Rate Limit

AI features: the most critical. A single user making 1,000 GPT-4 calls can consume a significant share of your budget.
Export/download operations: large data exports are computationally expensive.
Bulk operations: importing 10,000 records costs far more than importing 10.
API endpoints: especially any public API you expose to customers.
Authentication: rate limit login attempts to prevent brute-force attacks.

The Sliding Window Algorithm

The most practical rate limiting algorithm for most SaaS products is the sliding window counter. For each user, track how many requests they have made in the last X minutes. If a user is over the limit, reject new requests with HTTP status 429 (Too Many Requests) until older requests age out of the window.
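To make the idea concrete, here is a minimal in-memory sketch. It stores one timestamp per request, which is the simplest exact form of a sliding window; production limiters often approximate this with counters to save memory. The class name `SlidingWindowLimiter` and its method names are illustrative, not from any library.

```typescript
// In-memory sliding window limiter (illustrative only: state is lost on
// restart and is not shared across server instances).
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it should get a 429.
  allow(userId: string, now: number = Date.now()): boolean {
    const windowStart = now - this.windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits.get(userId) ?? []).filter((t) => t > windowStart);
    if (recent.length >= this.maxRequests) {
      this.hits.set(userId, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(userId, recent);
    return true;
  }
}
```

For example, `new SlidingWindowLimiter(10, 10_000)` would allow each user 10 requests in any 10-second span. Because the state lives in process memory, this only works on a single server, which is why a shared store such as Redis is the usual choice in production.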

Implementation with Upstash Redis

Upstash Redis is a serverless Redis service perfect for rate limiting in Next.js and Replit applications. Use the @upstash/ratelimit library:

Initialize: const ratelimit = new Ratelimit({ redis: Redis.fromEnv(), limiter: Ratelimit.slidingWindow(10, '10 s') }) (this allows 10 requests per 10 seconds)
Check: const { success } = await ratelimit.limit(userId)
Respond: return a 429 status if success is false
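The three steps above can be sketched as a small guard function. To keep the example runnable without network access, the limiter is passed in as a parameter; an @upstash/ratelimit `Ratelimit` instance is expected to fit the `Limiter` shape, since its `limit()` call resolves to an object that includes `success`. The names `Limiter`, `GuardResult`, and `guardRequest` are hypothetical helpers, not part of the library.

```typescript
// Minimal shape this sketch needs from a rate limiter. A `Ratelimit`
// instance from @upstash/ratelimit is assumed to satisfy it.
type Limiter = {
  limit(identifier: string): Promise<{ success: boolean }>;
};

// With Upstash, the limiter would be created as described in the text:
//   import { Ratelimit } from "@upstash/ratelimit";
//   import { Redis } from "@upstash/redis";
//   const ratelimit = new Ratelimit({
//     redis: Redis.fromEnv(), // reads the Upstash REST URL and token from env
//     limiter: Ratelimit.slidingWindow(10, "10 s"),
//   });

type GuardResult =
  | { ok: true }
  | { ok: false; status: 429; body: { error: string } };

// Checks the limit for one user and describes the response to send.
async function guardRequest(
  limiter: Limiter,
  userId: string,
): Promise<GuardResult> {
  const { success } = await limiter.limit(userId);
  if (!success) {
    return { ok: false, status: 429, body: { error: "Too many requests" } };
  }
  return { ok: true };
}
```

In a route handler you would call `guardRequest(ratelimit, userId)` before doing any expensive work and, when `ok` is false, send the 429 status and body back to the client.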

Tier-Based Rate Limits

Different subscription tiers should have different limits. Free plan: 100 API calls/day. Pro plan: 1,000/day. Business plan: unlimited. Store the user's current plan in your database, and select the appropriate rate limit based on their tier.
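A minimal sketch of tier selection, using the daily quotas listed above. The `Plan` type, the `DAILY_LIMITS` table, and `isWithinDailyLimit` are illustrative names; how you load the plan and count today's calls depends on your own database.

```typescript
type Plan = "free" | "pro" | "business";

// Daily API call quotas from the text; null means unlimited.
const DAILY_LIMITS: Record<Plan, number | null> = {
  free: 100,
  pro: 1000,
  business: null,
};

// `plan` would come from the user's record in your database, and
// `callsToday` from whatever counter you keep per user.
function isWithinDailyLimit(plan: Plan, callsToday: number): boolean {
  const limit = DAILY_LIMITS[plan];
  return limit === null || callsToday < limit;
}
```

Keeping the limits in one table means a pricing change is a one-line edit rather than a search through request handlers.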

Communicating Limits to Users

Always return useful headers with rate-limited responses: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset. On 429 responses, consider also sending the standard Retry-After header so clients know how long to wait. Show usage in your dashboard UI so users can see how many requests they've used and when their limit resets.
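As a rough sketch, the three headers can be built from the limiter's state with a small helper. The `RateLimitState` type and `rateLimitHeaders` function are hypothetical names, and reporting the reset time as Unix seconds is an assumption: APIs differ on this, so document whichever format you choose.

```typescript
type RateLimitState = {
  limit: number; // maximum requests allowed in the window
  remaining: number; // requests left in the current window
  resetAtMs: number; // when the window resets, Unix time in milliseconds
};

// Builds the X-RateLimit-* headers named in the text.
function rateLimitHeaders(state: RateLimitState): Record<string, string> {
  return {
    "X-RateLimit-Limit": String(state.limit),
    // Never report a negative number of remaining requests.
    "X-RateLimit-Remaining": String(Math.max(0, state.remaining)),
    // Assumption: reset is reported as Unix time in whole seconds.
    "X-RateLimit-Reset": String(Math.ceil(state.resetAtMs / 1000)),
  };
}
```

The returned object can be spread into the headers of whatever response your framework builds, for both successful and rejected requests.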

Graduated Rate Limits by Subscription Tier

Rate limits should reflect the value of each subscription tier, not just protect your infrastructure. Give free users strict limits (for example, 10 API calls per minute), pro users comfortable limits (100 per minute), and enterprise users negotiated limits (1,000+ per minute). This makes your pricing feel fair and gives users a concrete reason to upgrade. Display each user's current usage in their dashboard so they can see when they are approaching their ceiling. Visible limits prompt upgrade conversations more naturally than an unexpected wall does.
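One way to feed such a dashboard widget is a small summary helper, sketched below with the per-minute limits from the paragraph above. The names `Tier`, `PER_MINUTE_LIMITS`, and `summarizeUsage` are hypothetical, the enterprise value is only a placeholder for a negotiated limit, and the 80% warning threshold is an arbitrary choice.

```typescript
type Tier = "free" | "pro" | "enterprise";

// Per-minute limits from the text. Enterprise limits are negotiated,
// so 1000 here is only a placeholder default.
const PER_MINUTE_LIMITS: Record<Tier, number> = {
  free: 10,
  pro: 100,
  enterprise: 1000,
};

type UsageSummary = {
  used: number;
  limit: number;
  percent: number; // 0 to 100, capped at 100
  nearLimit: boolean; // true once usage reaches the warning threshold
};

// Summarizes usage for display; warnAt is the fraction of the limit at
// which the UI should start warning the user (assumed 0.8 by default).
function summarizeUsage(tier: Tier, used: number, warnAt: number = 0.8): UsageSummary {
  const limit = PER_MINUTE_LIMITS[tier];
  const ratio = Math.min(1, used / limit);
  return {
    used,
    limit,
    percent: Math.round(ratio * 100),
    nearLimit: ratio >= warnAt,
  };
}
```

The `nearLimit` flag is a natural place to hook an upgrade prompt, so users see the option before they are blocked.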