Rate limiting is one of those problems that looks simple until you try to implement it correctly at scale. Fixed window counting is the obvious first approach. Sliding window is what you actually want. Here's what we built and why.
Fixed window vs sliding window
Fixed window counting resets at a hard boundary, such as midnight or the start of each minute. The problem: a user can spend their full quota in the last second of one window and again in the first second of the next, briefly doubling their effective rate. At scale, these bursts show up as traffic spikes.
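To make the boundary problem concrete, here is a minimal in-memory sketch. `FixedWindowCounter` and `burst` are hypothetical names made up for this illustration, and the limit of 100 requests per 60 seconds is an assumed example value, not one of our plan limits.

```typescript
// Minimal in-memory fixed-window counter, for illustration only.
// FixedWindowCounter is a hypothetical name, not a library API.
class FixedWindowCounter {
  private counts = new Map<string, number>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if a request from `userId` at time `nowMs` is allowed.
  allow(userId: string, nowMs: number): boolean {
    // Every timestamp inside the same window maps to the same bucket index.
    const bucket = Math.floor(nowMs / this.windowMs);
    const key = `${userId}:${bucket}`;
    const used = this.counts.get(key) ?? 0;
    if (used >= this.limit) return false;
    this.counts.set(key, used + 1);
    return true;
  }
}

// Sends `n` requests at the same instant and counts how many are allowed.
function burst(counter: FixedWindowCounter, userId: string, nowMs: number, n: number): number {
  let allowedCount = 0;
  for (let i = 0; i < n; i++) {
    if (counter.allow(userId, nowMs)) allowedCount++;
  }
  return allowedCount;
}

const counter = new FixedWindowCounter(100, 60_000);
// 100 requests at 59.9 s (end of window 0), then 100 more at 60.1 s (start of window 1).
const endOfWindow = burst(counter, "user-1", 59_900, 100);
const startOfNext = burst(counter, "user-1", 60_100, 100);
// Both bursts are fully allowed: 200 requests within 200 ms against a limit of 100 per minute.
const totalAllowed = endOfWindow + startOfNext;
```

The counter is behaving exactly as specified here. The doubling comes from the window boundary itself, which is why a different algorithm is needed rather than a bug fix.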
Sliding window counting looks back over a rolling period, e.g. “how many requests in the last 60 seconds from this user?” An exact implementation eliminates boundary spikes but requires storing a timestamp for every request, which is expensive at high throughput.
@upstash/ratelimit's sliding window implementation uses a two-bucket approximation: it keeps the current window count and the previous window count, then interpolates based on how far through the current window we are. It's not perfectly accurate but gives a smooth rate limit with O(1) Redis operations.
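The interpolation can be sketched in a few lines. This is our reading of the general two-bucket technique, not the library's actual Redis script; `estimateSlidingCount` and the example numbers are hypothetical.

```typescript
// Weighted two-bucket estimate of how many requests fall in the trailing window.
// A sketch of the general technique; the library's internals may differ in detail.
function estimateSlidingCount(
  previousCount: number,
  currentCount: number,
  nowMs: number,
  windowMs: number,
): number {
  // Fraction of the current fixed window that has already elapsed (0 up to, not including, 1).
  const elapsedFraction = (nowMs % windowMs) / windowMs;
  // The trailing window still overlaps the previous bucket by (1 - elapsedFraction),
  // so that share of the previous count is assumed to still be "in view".
  return currentCount + previousCount * (1 - elapsedFraction);
}

// Example: limit of 100 per 60 s, and we are 15 s into the current window.
// The previous window saw 80 requests; the current one has 30 so far.
const estimate = estimateSlidingCount(80, 30, 75_000, 60_000);
// 30 + 80 * 0.75 = 90, so the next request is still under the limit of 100.
const underLimit = estimate < 100;
```

The approximation assumes requests in the previous window were spread evenly. If they were all clustered at the very end, the true trailing count is higher than the estimate, which is the accuracy we trade for storing two counters instead of one timestamp per request.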
Implementation
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
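// Assumed definition of Plan (not shown in the original); adjust to match your billing model.
type Plan = "free" | "starter" | "growth" | "scale" | "enterprise";
// Reads UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN from the environment.
const redis = Redis.fromEnv();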
export const rateLimiters: Record<Plan, Ratelimit> = {
  free: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(500, "30 d") }),
  starter: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(10_000, "30 d") }),
  growth: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(100_000, "30 d") }),
  scale: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(1_000_000, "30 d") }),
  enterprise: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(999_999_999, "30 d") }),
};
The rate limiter key is the user's UUID, not their API key. This means all keys for the same user share a quota, which is correct: a user shouldn't be able to bypass their plan limit by creating multiple API keys.
The auth middleware in one call
Every API route calls authenticateApiRequest(req), which does the following in sequence: look up the API key in Supabase, check that the key is active, look up the user's subscription plan, run the rate limit check against the plan's Upstash limiter, and check the monthly quota against the usage log count.
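The sequence above can be sketched as follows. This is not our production code: `authenticateSketch`, `AuthDeps`, the `Plan` union, the result shape, and the 401/429 status codes are all assumptions made for illustration, and the Supabase and Upstash calls are hidden behind injected functions so the control flow stands on its own.

```typescript
type Plan = "free" | "starter" | "growth" | "scale" | "enterprise";

interface ApiKeyRow {
  id: string;
  user_id: string;
  is_active: boolean;
}

// Hypothetical dependencies; in a real system these would wrap Supabase and Upstash.
interface AuthDeps {
  findApiKey(key: string): Promise<ApiKeyRow | null>;
  getPlan(userId: string): Promise<Plan>;
  checkRateLimit(plan: Plan, userId: string): Promise<{ success: boolean }>;
  countUsageThisMonth(userId: string): Promise<number>;
  monthlyQuota: Record<Plan, number>;
}

type AuthResult =
  | { ok: true; userId: string; apiKeyId: string; plan: Plan }
  | { ok: false; status: 401 | 429; error: string };

async function authenticateSketch(apiKey: string | null, deps: AuthDeps): Promise<AuthResult> {
  if (!apiKey) return { ok: false, status: 401, error: "Missing API key" };

  // Steps 1 and 2: look up the key and confirm it is active.
  const row = await deps.findApiKey(apiKey);
  if (!row || !row.is_active) return { ok: false, status: 401, error: "Invalid API key" };

  // Step 3: resolve the user's subscription plan.
  const plan = await deps.getPlan(row.user_id);

  // Step 4: rate limit keyed on the user ID, so all of a user's keys share one quota.
  const { success } = await deps.checkRateLimit(plan, row.user_id);
  if (!success) return { ok: false, status: 429, error: "Rate limit exceeded" };

  // Step 5: compare this month's logged usage with the plan's quota.
  const used = await deps.countUsageThisMonth(row.user_id);
  if (used >= deps.monthlyQuota[plan]) {
    return { ok: false, status: 429, error: "Monthly quota exceeded" };
  }

  return { ok: true, userId: row.user_id, apiKeyId: row.id, plan };
}
```

Ordering the checks from cheapest rejection to most expensive means an invalid key never touches Redis or the usage log table.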
This is 3 Supabase queries + 1 Redis operation per request. It's fast enough in practice (the Supabase queries take under 10 ms when the database is in the same region), but it's the obvious place to add caching. A 60-second in-memory cache of the API key lookup would eliminate 2 of the 3 Supabase calls for hot keys.
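We haven't shipped that cache, but a minimal version might look like the sketch below. `TtlCache` and `getOrLoad` are hypothetical helpers written for this post, and the injectable clock exists only to make expiry testable.

```typescript
// Minimal in-memory cache with a fixed time-to-live per entry.
// TtlCache is a hypothetical helper, not from any library.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Returns the cached value if present, otherwise calls `load` and caches its result.
async function getOrLoad<V>(cache: TtlCache<V>, key: string, load: () => Promise<V>): Promise<V> {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = await load();
  cache.set(key, value);
  return value;
}
```

Two trade-offs are worth stating. A key that is deactivated keeps working for up to the TTL (60 seconds in our case) on any process that has it cached, and the cache is per process, so separate server instances each warm their own copy.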
The Supabase schema
Two tables drive the system:
lapi_api_keys: id, user_id, key (unique, indexed), is_active, last_used_at, created_at
lapi_usage_logs: id, user_id, api_key_id, namespace, endpoint, method, status_code, response_ms, ip_address, created_at
Usage logging is fire-and-forget. We call logUsage(...).catch(() => {}); after returning the API response so it never blocks the request path. If a log write fails, we swallow it — the developer still gets their response.
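Here is a small sketch of the fire-and-forget pattern. `logUsageInBackground` and the trimmed-down `UsageEntry` shape are hypothetical; the real `logUsage` writes more columns. The wrapper goes one step beyond a bare `.catch(() => {})` by also absorbing synchronous throws from the writer.

```typescript
// Subset of the usage log columns, for illustration only.
interface UsageEntry {
  user_id: string;
  api_key_id: string;
  endpoint: string;
  method: string;
  status_code: number;
  response_ms: number;
}

// Starts the write without awaiting it and swallows any failure.
// `write` stands in for the real database insert.
function logUsageInBackground(
  write: (entry: UsageEntry) => Promise<void>,
  entry: UsageEntry,
): void {
  // Promise.resolve().then(...) defers the call to a microtask, so even a
  // synchronous throw inside `write` becomes a rejection handled by the catch below.
  Promise.resolve()
    .then(() => write(entry))
    .catch(() => {
      // Intentionally ignored: a failed log write must not affect the response.
    });
}
```

One consequence to keep in mind: because the monthly quota is checked against the usage log count, any swallowed write slightly undercounts that user's usage. Some serverless runtimes may also stop work once the response has been sent, so a platform-specific hook for background tasks may be needed there.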
What we learned
The sliding window model feels fair to users — there are no hard resets that suddenly block them mid-day. The Upstash free tier is generous enough to run the entire platform in development and early production without paying for Redis. And keeping the rate limiter in a single middleware function means every new endpoint gets rate limiting for free — no per-route implementation required.