Engineering
Updated: Apr 15, 2026

Rate limiting

Limiting requests per time period to protect APIs from abuse and overload. Common algorithms: token bucket, leaky bucket, fixed window, sliding window. Critical for external API integrations.

Rate limiting caps the number of requests a client may make in a given time period. It is a fundamental technique for protecting APIs from abuse, overload, and unfair usage, and most public APIs enforce it.

Main algorithms:

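  • Token bucket: a bucket holds up to N tokens, each request costs 1 token, and the bucket refills at R tokens/sec. Allows bursts of up to N requests while holding the long-term average rate to R.
  • Leaky bucket: requests enter the bucket (a queue) and "leak" out at a constant rate. Output is smooth; requests that arrive when the bucket is full are dropped.
  • Fixed window: one counter per minute/hour, reset at the start of each window. Simple, but has the "burst at boundary" problem: a client can spend one full quota just before the reset and another just after it, sending up to twice the limit in a short span.
  • Sliding window: counts requests over a rolling window that ends at the current moment. The most accurate, but more expensive in memory and computation.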
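To make the token bucket concrete, here is a minimal single-threaded sketch in Python. The class name `TokenBucket`, its parameters, and the injectable `clock` argument are illustrative assumptions, not part of any particular library. A production limiter would also need locking, or shared storage if several processes enforce the same limit.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_rate = float(refill_rate)
        self.clock = clock  # injectable (assumption) so tests need not sleep
        self.tokens = float(capacity)  # start full, which permits an initial burst
        self.last_refill = clock()

    def _refill(self):
        # Add the tokens earned since the last call, never exceeding capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def try_acquire(self, cost=1.0):
        """Spend `cost` tokens and return True if available, otherwise return False."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `TokenBucket(capacity=3, refill_rate=1.0)`, three requests in a row succeed, a fourth is rejected, and one more request is admitted for each second that passes after that. The cap in `_refill` is what bounds the burst size: idle time never accumulates more than `capacity` tokens.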

Typical limits (2026, indicative; providers change these, so check their current documentation):

  • Stripe API: 100 req/sec (live mode)
  • OpenAI API: 500-10,000 req/min depending on usage tier and model
  • Slack API: about 1 req/min per method (Tier 1); higher tiers allow more
  • LinkedIn: 500 actions/day (very restrictive)

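Headers to recognize: X-RateLimit-Limit (the quota for the current window), X-RateLimit-Remaining (requests left in it), X-RateLimit-Reset (when the quota resets), and Retry-After (how long to wait before retrying, typically sent with an HTTP 429 response). The X-RateLimit-* names are a widespread convention rather than a standard, so exact names and the format of the reset value vary by provider. Production integrations should respect these headers, not rely on hardcoded limits.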
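As a rough illustration of respecting these headers, the sketch below turns a response's headers into a suggested pause. The function name `seconds_to_wait` is hypothetical, and the code assumes X-RateLimit-Reset carries a Unix timestamp in seconds. Some providers send seconds-until-reset instead, so that branch needs adjusting per API. Retry-After is handled in both forms HTTP allows: a number of seconds or an HTTP date.

```python
import time
from email.utils import parsedate_to_datetime


def seconds_to_wait(headers, now=None):
    """Suggest how long to pause before the next request, given response headers."""
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalize them first.
    h = {k.lower(): v for k, v in headers.items()}

    retry_after = h.get("retry-after")
    if retry_after is not None:
        value = retry_after.strip()
        if value.isdigit():
            return float(value)  # delay given directly in seconds
        try:
            # Otherwise expect an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
            return max(0.0, parsedate_to_datetime(value).timestamp() - now)
        except (TypeError, ValueError):
            pass  # unparseable value: fall back to the X-RateLimit-* headers

    remaining = h.get("x-ratelimit-remaining")
    reset = h.get("x-ratelimit-reset")
    if remaining is not None and reset is not None:
        try:
            if int(remaining) <= 0:
                # ASSUMPTION: reset is Unix epoch seconds; this varies by provider.
                return max(0.0, float(reset) - now)
        except ValueError:
            pass
    return 0.0  # no signal that a pause is needed
```

For example, `seconds_to_wait({"Retry-After": "120"})` returns `120.0`, and headers that still report remaining quota yield `0.0`.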

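What to do when you hit a limit (typically signaled by an HTTP 429 response): retry with exponential backoff, ideally with random jitter so that many clients do not retry in lockstep. Do NOT retry immediately; that escalates the problem. For bulk operations: spread requests evenly, plan around the limits, and use batch endpoints if available.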
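One way to sketch exponential backoff in Python follows. The `RateLimited` exception, `backoff_delay`, and `call_with_backoff` are hypothetical names introduced for illustration. A real client would raise or detect its own rate-limit error, usually from an HTTP 429 response. The delay uses "full jitter" (a random value between zero and the exponential ceiling), which is one common variant rather than the only reasonable choice.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error a client raises when the API answers HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Full-jitter delay for 0-based retry `attempt`: random share of min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(request, max_attempts=5, base=1.0, cap=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call `request()` and retry on RateLimited with growing, jittered pauses."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = backoff_delay(attempt, base, cap, rng)
            if exc.retry_after is not None:
                # Never wait less than the server explicitly asked for.
                delay = max(delay, exc.retry_after)
            sleep(delay)
```

The `sleep` and `rng` parameters exist only so the behavior can be tested without real waiting or randomness. The ceiling doubles on each retry (1 s, 2 s, 4 s, ...) up to `cap`, so repeated failures back off quickly instead of hammering the API.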