Protecting Auth Endpoints: Rate Limiting and Brute-Force Defense
How to design rate limiting for login, OTP, and reset endpoints — algorithms, what to key on, lockout strategies, and avoiding the traps that let attackers through or lock out real users.
Every authentication endpoint is a guessing game an attacker would love to play at machine speed. Rate limiting is how you slow that game down to the point where it isn't worth playing. Done well, it blunts brute-force and credential-stuffing attacks while rarely inconveniencing a real user. Done badly, it either does nothing or locks out the very people you're trying to protect.
What you're defending against
- Password brute-forcing — trying many passwords against one account.
- Credential stuffing — trying many leaked email/password pairs across many accounts.
- OTP / code brute-forcing — a six-digit code is only a million possibilities; without limits it falls in seconds.
- Reset/enumeration abuse — hammering the "forgot password" or "resend code" endpoints to spam users or probe for valid accounts.
Notice these have different shapes. Per-account limits stop brute-forcing one account; per-IP limits stop one machine hitting many accounts. You need both.
Choosing a rate-limit algorithm
Fixed window
Count requests in each clock-aligned window (e.g. per minute). Simple, but vulnerable to bursts at the window boundary — an attacker can send a full quota at 0:59 and another at 1:00.
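To make the boundary problem concrete, here is a minimal in-memory sketch of a fixed-window counter. The class name, the limit of 5, and the 60-second window are illustrative assumptions, and time is passed in explicitly so the behaviour is easy to reproduce.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Counts requests per key in clock-aligned windows (illustrative sketch)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        # (key, window index) -> requests seen; old windows are never purged here
        self.counts = defaultdict(int)

    def allow(self, key, now):
        # Integer division aligns every timestamp to its clock window
        window_index = int(now // self.window_seconds)
        bucket = (key, window_index)
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True


limiter = FixedWindowLimiter(limit=5, window_seconds=60)

# Five attempts at 0:59 and five more at 1:00 land in different windows,
# so all ten are accepted within roughly one second.
late_burst = [limiter.allow("alice", now=59.0) for _ in range(5)]
early_burst = [limiter.allow("alice", now=60.0) for _ in range(5)]
accepted_in_one_second = sum(late_burst) + sum(early_burst)
```

In this run the limiter accepts twice the nominal quota across the boundary, which is the burst the paragraph above describes.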
Sliding window
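Smooths the boundary problem by counting a weighted share of the previous window alongside the current one, in proportion to how much of the previous window still overlaps the last full window length. More accurate than a fixed window, at the cost of slightly more state.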
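One common way to implement this is a sliding-window counter that keeps just two numbers per key. The sketch below is an in-memory illustration under assumed names and limits, not a production limiter.

```python
class SlidingWindowCounter:
    """Estimates the request count over the last full window length."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        # key -> (window index, count in that window, count in the window before)
        self.windows = {}

    def allow(self, key, now):
        index = int(now // self.window_seconds)
        stored_index, current, previous = self.windows.get(key, (index, 0, 0))
        if index == stored_index + 1:
            # Rolled into the next window: the old current becomes previous
            previous, current = current, 0
        elif index > stored_index + 1:
            # More than one full window passed with no traffic
            previous, current = 0, 0
        # Fraction of the current window that has already elapsed
        elapsed = (now % self.window_seconds) / self.window_seconds
        # The previous window counts only for the part that still overlaps
        estimated = previous * (1.0 - elapsed) + current
        allowed = estimated < self.limit
        if allowed:
            current += 1
        self.windows[key] = (index, current, previous)
        return allowed


limiter = SlidingWindowCounter(limit=5, window_seconds=60)
late_burst = [limiter.allow("alice", now=59.0) for _ in range(5)]
early_burst = [limiter.allow("alice", now=60.0) for _ in range(5)]
half_window_later = [limiter.allow("alice", now=90.0) for _ in range(5)]
```

Here the second burst at 1:00 is rejected in full because the previous window still carries its whole weight, and only part of the quota has come back by 1:30.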
Token bucket
A bucket refills at a steady rate up to a maximum. Each request spends a token; an empty bucket means rejection. This allows short legitimate bursts while enforcing a sustained average — usually the best fit for auth endpoints.
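A token bucket needs only two pieces of state per key: the current token count and the time of the last refill. The following is a minimal sketch with assumed names and numbers (capacity 5, one token every two seconds); a real deployment would keep this state in a shared store rather than in process memory.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float
    refill_per_second: float
    tokens: float
    last_refill: float

    @classmethod
    def full(cls, capacity, refill_per_second, now):
        """Create a bucket that starts with every token available."""
        return cls(capacity, refill_per_second, capacity, now)

    def allow(self, now, cost=1.0):
        # Refill lazily, based on the time elapsed since the last call
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket.full(capacity=5, refill_per_second=0.5, now=0.0)
burst = [bucket.allow(now=0.0) for _ in range(6)]      # short burst, then empty
after_two_seconds = [bucket.allow(now=2.0) for _ in range(2)]  # one token refilled
```

The capacity sets how large a legitimate burst can be, and the refill rate sets the sustained average, so the two can be tuned independently.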
For most applications, a token-bucket or sliding-window limiter backed by a fast shared store (Redis or equivalent) is the right default. The store must be shared across all your servers, or attackers simply spread requests across instances to dodge per-node counters.
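As a rough sketch of what "shared" means in practice, the limiter below talks to its store only through two operations, `incr` and `expire`. These mirror the commands of the same name in Redis, and the sketch assumes a client object exposing them with these shapes could be passed in instead of the in-memory stand-in. `CounterStore`, `InMemoryStore`, and the key format are hypothetical names for illustration.

```python
from typing import Protocol


class CounterStore(Protocol):
    """The two operations this sketch needs from a shared store."""

    def incr(self, name: str) -> int: ...
    def expire(self, name: str, time: int) -> object: ...


class InMemoryStore:
    """Stand-in for local runs; it records TTLs but does not enforce them."""

    def __init__(self):
        self.values = {}
        self.ttls = {}

    def incr(self, name):
        self.values[name] = self.values.get(name, 0) + 1
        return self.values[name]

    def expire(self, name, time):
        self.ttls[name] = time
        return True


def allow(store, key, limit, window_seconds):
    # Every app server increments the same counter, so spreading requests
    # across instances does not reset the count.
    count = store.incr(key)
    if count == 1:
        # First hit in this window: start the expiry clock.
        # Caveat: incr and expire are separate calls here, so a crash between
        # them could leave a key with no expiry; an atomic variant avoids that.
        store.expire(key, window_seconds)
    return count <= limit


shared = InMemoryStore()
# Imagine these four calls arriving at four different app servers
results = [allow(shared, "login:acct:alice", limit=3, window_seconds=60) for _ in range(4)]
```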
What to key the limit on
Choosing the right key is more important than the algorithm:
- Per IP — catches a single source hammering you. But beware: many legitimate users share an IP (corporate NAT, mobile carriers), and attackers rotate IPs cheaply. IP alone is necessary but not sufficient.
- Per account / username — catches brute-forcing of a specific account regardless of source IP. Essential for login.
- Per IP + account combination — fine-grained and useful for distinguishing targeted attacks.
- Global / endpoint-wide — a circuit breaker for the whole endpoint under a distributed attack.
A layered approach — combining per-account, per-IP, and global limits — is far more robust than any single key.
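The layering above can be sketched as a single check that builds one key per layer and reports which layers are exhausted. The limits, key formats, and function names below are assumptions for illustration, and window expiry is left out to keep the example short.

```python
from collections import Counter

# Hypothetical per-window limits for each layer
LIMITS = {"account": 5, "ip": 20, "ip_account": 3, "global": 1000}


def limiter_keys(endpoint, ip, account):
    # Normalise the account so "Alice@x.com" and "alice@x.com" share a counter
    normalized = account.strip().lower()
    return {
        "account": f"{endpoint}:acct:{normalized}",
        "ip": f"{endpoint}:ip:{ip}",
        "ip_account": f"{endpoint}:ip-acct:{ip}:{normalized}",
        "global": f"{endpoint}:global",
    }


def check_attempt(counts, endpoint, ip, account):
    """Return the layers that are exhausted; an empty list means allowed."""
    keys = limiter_keys(endpoint, ip, account)
    tripped = [layer for layer, key in keys.items() if counts[key] >= LIMITS[layer]]
    if not tripped:
        # Only count attempts that were actually let through
        for key in keys.values():
            counts[key] += 1
    return tripped


# One account attacked from rotating IPs: the per-account layer catches it
counts = Counter()
rotating = [check_attempt(counts, "login", f"10.0.0.{i}", "Alice@Example.com") for i in range(6)]

# One IP spraying many accounts: the per-IP layer catches it
counts2 = Counter()
spraying = [check_attempt(counts2, "login", "203.0.113.7", f"user{i}@example.com") for i in range(21)]
```

Returning the tripped layer names rather than a bare boolean also gives the logging and alerting code something specific to record.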
Lockout strategies
When a threshold is crossed, you have options, from gentle to strict:
- Throttling / exponential backoff — each failed attempt increases the required wait. Smooth and self-correcting; usually the best user experience.
- CAPTCHA challenge — after a few failures, require proof of humanity. Stops bots while letting real users continue.
- Temporary lockout — block further attempts for a fixed period. Effective, but be careful: a naive per-account lockout becomes a denial-of-service tool — an attacker can lock a victim out of their own account on purpose.
Prefer throttling and CAPTCHA over hard account lockouts. If you do lock accounts, lock the attempt source rather than the target account where possible, and always give the legitimate user a recovery path.
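Exponential backoff is simple to compute from a failure counter. The sketch below assumes the wait starts at one second on the fifth consecutive failure, doubles with each further failure, and is capped at 15 minutes; all of these numbers are illustrative rather than recommendations.

```python
def backoff_seconds(failures, threshold=5, base=1.0, cap=900.0):
    """Required wait after `failures` consecutive failed attempts."""
    if failures < threshold:
        return 0.0
    # Clamp the exponent so a huge failure count cannot overflow the float
    exponent = min(failures - threshold, 32)
    return min(cap, base * 2 ** exponent)


def may_attempt(failures, last_failure_at, now):
    """True once the backoff period since the last failure has passed."""
    return now - last_failure_at >= backoff_seconds(failures)


waits = [backoff_seconds(n) for n in (4, 5, 6, 9, 100)]
```

Keying the failure counter on the attempt source, or on the source and account together, keeps an attacker's failures from stretching the wait for the legitimate owner of the account.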
Don't forget the secondary endpoints
The single most common mistake: rate-limiting the login form and nothing else. Apply limits consistently to every auth-adjacent endpoint:
- Sign-in
- OTP verification (critical — this is the most brute-forceable)
- OTP / magic-link request (to prevent email/SMS flooding and cost abuse)
- Password reset request and submission
- Token refresh
- Sign-up (to limit automated account creation)
An unprotected OTP-verify endpoint quietly undoes all the work you did protecting the login form.
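For OTP verification, the strictest useful limit is per code rather than per time window. A minimal sketch, assuming a hypothetical cap of five failed attempts per issued code and an in-memory challenge object:

```python
import hmac
from dataclasses import dataclass

MAX_OTP_ATTEMPTS = 5  # hypothetical cap per issued code


@dataclass
class OtpChallenge:
    code: str
    failed_attempts: int = 0
    burned: bool = False


def verify_otp(challenge, submitted):
    if challenge.burned:
        # A burned code never verifies, even if the guess is correct
        return False
    # Constant-time comparison avoids leaking how many characters matched
    if hmac.compare_digest(challenge.code.encode("utf-8"), submitted.encode("utf-8")):
        challenge.burned = True  # codes are single-use
        return True
    challenge.failed_attempts += 1
    if challenge.failed_attempts >= MAX_OTP_ATTEMPTS:
        challenge.burned = True  # force the user to request a new code
    return False


attacked = OtpChallenge(code="123456")
wrong_guesses = [verify_otp(attacked, f"{guess:06d}") for guess in range(5)]
late_correct_guess = verify_otp(attacked, "123456")

normal = OtpChallenge(code="654321")
first_use = verify_otp(normal, "654321")
replay = verify_otp(normal, "654321")
```

With a cap of five attempts against a six-digit code, the chance of guessing any single issued code is 5 in 1,000,000, which is why the request endpoint needs its own limit too: otherwise an attacker just asks for more codes.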
Make responses safe, too
Rate limiting interacts with information disclosure. Keep responses uniform so the limiter itself doesn't leak which accounts exist or which codes were "closer." Return a clear, generic error and the standard 429 Too Many Requests status with a Retry-After header so well-behaved clients can back off gracefully.
HTTP/1.1 429 Too Many Requests
Retry-After: 60
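A small helper can keep every rate-limited response identical apart from the wait time. This sketch is framework-neutral and returns a plain status, headers, and body tuple; the function name and the JSON field names are assumptions.

```python
import json
import math


def too_many_requests(retry_after_seconds):
    """Build a uniform 429 response as (status, headers, body)."""
    # Retry-After takes whole seconds; round up so clients never retry early
    seconds = max(1, math.ceil(retry_after_seconds))
    headers = {
        "Retry-After": str(seconds),
        "Content-Type": "application/json",
        "Cache-Control": "no-store",
    }
    # The body never mentions the account, the limiter layer, or the guess
    body = json.dumps({
        "error": "too_many_requests",
        "message": "Too many attempts. Please try again later.",
    }).encode("utf-8")
    return 429, headers, body


status, headers, body = too_many_requests(59.2)
_, _, other_body = too_many_requests(300)
```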
Observability
Rate limiting is also a detection surface. Spikes in 429s, a flood of failed logins across many accounts, or unusual geographic patterns are early warnings of an attack in progress. Log limiter decisions, alert on anomalies, and feed the data back into tuning your thresholds.
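As a starting point, limiter decisions can be logged as structured records and rejections counted over a rolling window to drive an alert. The logger name, field names, and thresholds below are assumptions, and the detector is an in-memory sketch rather than a replacement for a metrics system.

```python
import hashlib
import json
import logging
from collections import deque

logger = logging.getLogger("auth.ratelimit")


def log_decision(endpoint, layer, allowed, account):
    # Log a truncated hash rather than the raw identifier; a keyed hash
    # would be a stronger choice than this plain SHA-256.
    account_hash = hashlib.sha256(account.lower().encode("utf-8")).hexdigest()[:16]
    record = {
        "event": "rate_limit_decision",
        "endpoint": endpoint,
        "layer": layer,
        "allowed": allowed,
        "account_hash": account_hash,
    }
    logger.info(json.dumps(record))
    return record


class RejectionSpikeDetector:
    """Signals when rejections within a rolling window reach a threshold."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.events = deque()

    def record_rejection(self, now):
        self.events.append(now)
        # Drop rejections that have aged out of the rolling window
        while self.events and self.events[0] <= now - self.window_seconds:
            self.events.popleft()
        return len(self.events) >= self.threshold


record = log_decision("login", "account", False, "Alice@Example.com")
detector = RejectionSpikeDetector(threshold=3, window_seconds=60)
spike = [detector.record_rejection(t) for t in (0.0, 10.0, 20.0)]
slow = RejectionSpikeDetector(threshold=3, window_seconds=60)
no_spike = [slow.record_rejection(t) for t in (0.0, 100.0, 200.0)]
```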
A sensible default configuration
- Login: per-account limit (e.g. 5 failures → exponential backoff, CAPTCHA after 10), plus a per-IP limit, plus a global circuit breaker.
- OTP verify: strict — a handful of attempts per code, then the code is burned and a new one required.
- OTP/magic-link/reset requests: a few per account and per IP per hour to prevent flooding.
- Refresh: modest per-token limits, with rotated-token-reuse detection layered on top.
- Shared store across all instances; 429 with Retry-After; uniform error messages; full logging.
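These defaults can be written down as plain configuration so they are reviewed and tuned in one place. In the sketch below only the login thresholds (5 and 10) come from the list above; every other number, and all of the class and field names, are placeholders to be replaced with values that fit your traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginPolicy:
    backoff_after_failures: int = 5    # from the list above
    captcha_after_failures: int = 10   # from the list above
    per_ip_per_minute: int = 30        # assumed placeholder
    global_per_minute: int = 5000      # assumed placeholder


@dataclass(frozen=True)
class OtpVerifyPolicy:
    max_attempts_per_code: int = 5     # assumed value for "a handful"


@dataclass(frozen=True)
class RequestPolicy:
    per_account_per_hour: int = 3      # assumed value for "a few"
    per_ip_per_hour: int = 10          # assumed placeholder


def login_step(policy, failures):
    """Map a consecutive-failure count to the control that applies next."""
    if failures >= policy.captcha_after_failures:
        return "captcha"
    if failures >= policy.backoff_after_failures:
        return "backoff"
    return "allow"


policy = LoginPolicy()
steps = [login_step(policy, n) for n in (0, 4, 5, 9, 10, 25)]
```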
Rate limiting isn't a feature you bolt on after a breach — it's a structural property of a healthy auth system. The goal is asymmetry: make each guess so cheap for you to reject and so expensive for an attacker to attempt that brute force simply stops being viable.
Written by
Emilian Gheonea, Senior Blockchain & Full-Stack Software Engineer. I build EmbedAuth, an embeddable authentication platform for SaaS, and write about the auth problems most teams hit too late.