Rate Limiting

The API applies token bucket rate limiting: each API key can make 100 requests per minute, and rate-limited responses include a retry header.

Limits

Metric	Value
Max requests	100 per minute
Window	Sliding window (token bucket)
Scope	Per API key (per environment)

Rate limit headers

Every API response includes rate limit headers:

Header	Description
`X-RateLimit-Limit`	Maximum requests allowed per window (always `100`)
`X-RateLimit-Remaining`	Requests remaining in the current window
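
As an illustration, a client could read these headers after each response to see how close it is to the limit. The sketch below assumes the headers arrive as a plain name-to-value mapping. The names `RateLimitStatus`, `parse_rate_limit`, and `is_near_limit`, and the 10% threshold, are hypothetical and not part of the API or SDK.

```python
from typing import Mapping, NamedTuple, Optional


class RateLimitStatus(NamedTuple):
    limit: int      # value of X-RateLimit-Limit
    remaining: int  # value of X-RateLimit-Remaining


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimitStatus]:
    """Return the parsed rate limit headers, or None if missing or malformed."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitStatus(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
        )
    except (KeyError, ValueError):
        return None


def is_near_limit(status: RateLimitStatus, threshold: float = 0.1) -> bool:
    """True when the remaining requests are at or below threshold * limit.

    The default threshold of 10% is an arbitrary choice for this sketch.
    """
    return status.remaining <= status.limit * threshold
```

A caller might log a warning or slow down its request rate whenever `is_near_limit` returns true.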

When rate limited

When you exceed the limit, the API returns 429 Too Many Requests:

{
  "error": "Too many requests"
}

The response includes an additional header:

Header	Description
`Retry-After`	Seconds to wait before retrying
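
One way to handle a 429 is to wait for the number of seconds given in `Retry-After` and then try again. The sketch below is a minimal example, not the SDK's behavior. It assumes a hypothetical `send` callable that performs the request and returns a `(status, headers, body)` tuple, and the helper names, the attempt cap, and the 1-second fallback delay are all assumptions made for illustration.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def retry_delay(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as seconds; fall back to `default` if absent or invalid.

    Assumes the mapping uses the exact key "Retry-After"; most HTTP libraries
    expose case-insensitive header mappings, which also work here.
    """
    try:
        return max(0.0, float(headers.get("Retry-After")))
    except (TypeError, ValueError):
        return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting and retrying while the response status is 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, _body = response
        # Return on any non-429 status, or when the attempts are used up.
        if status != 429 or attempt == max_attempts:
            return response
        sleep(retry_delay(headers))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. Capping the number of attempts avoids retrying forever if the key stays rate limited.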

How it works

The rate limiter uses a token bucket algorithm:

Each API key starts with 100 tokens
Each request consumes 1 token
Tokens refill proportionally over the 60-second window
If no tokens remain, the request is rejected with 429

Tokens refill continuously, so you don't have to wait for the full window to reset. At 100 tokens per 60 seconds (about 1.67 tokens per second), if you've used all 100 tokens and wait 6 seconds, you'll get ~10 tokens back.
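
The steps above can be sketched as a small in-memory token bucket. This is an illustrative model of the algorithm as described here, not the server's actual implementation. The class name `TokenBucket` and its parameters are hypothetical, and the caller supplies the current time in seconds so the behavior is easy to follow.

```python
class TokenBucket:
    """Illustrative token bucket: 100 tokens, refilled evenly over 60 seconds."""

    def __init__(self, capacity: int = 100, window_seconds: float = 60.0,
                 now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_rate = capacity / window_seconds  # tokens per second
        self.tokens = float(capacity)                 # each key starts full
        self.last_refill = now

    def _refill(self, now: float) -> None:
        # Add tokens in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def allow(self, now: float) -> bool:
        """Consume one token if available; False corresponds to a 429."""
        self._refill(now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket()
used = sum(bucket.allow(now=0.0) for _ in range(100))  # drain the bucket
rejected = not bucket.allow(now=0.0)                   # no tokens left
regained = sum(bucket.allow(now=6.0) for _ in range(20))
print(used, rejected, regained)
```

In this model, 100 requests at time 0 empty the bucket, the next request is rejected, and after 6 seconds roughly 10 more requests succeed, matching the refill example above.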

Best practices

Use the SDK: it caches flag configs in memory, so you only hit the API once per flag key (standard) or once per 60 seconds (edge).
Check `X-RateLimit-Remaining`: monitor this header to detect when you're approaching the limit.
Respect `Retry-After`: when rate limited, wait the specified number of seconds before retrying.
