Rate Limiting
Token bucket rate limiting at 100 requests per minute per API key, with retry headers.
Limits
| Metric | Value |
|---|---|
| Max requests | 100 per minute |
| Window | Sliding window (token bucket) |
| Scope | Per API key (per environment) |
Rate limit headers
Every API response includes rate limit headers:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed per window (always 100) |
| X-RateLimit-Remaining | Requests remaining in the current window |
When rate limited
When you exceed the limit, the API returns 429 Too Many Requests:
{
  "error": "Too many requests"
}
The response includes an additional header:
| Header | Description |
|---|---|
| Retry-After | Seconds to wait before retrying |
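A client can handle the 429 response by waiting for the advertised number of seconds and then retrying. The following is a minimal sketch rather than an official client: `call_with_retry`, the `send` callable (assumed to return a `(status_code, headers, body)` tuple), the `max_attempts` default, and the 1-second fallback are all hypothetical choices made for illustration.

```python
import time


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers, body),
    where `headers` is a plain dict keyed by the header names shown above.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        # Return on success, on any non-429 error, or when out of attempts.
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        # Retry-After is documented as a number of seconds. Falling back to
        # 1 second when it is missing or malformed is an assumption.
        try:
            delay = float(headers.get("Retry-After", 1))
        except (TypeError, ValueError):
            delay = 1.0
        sleep(max(delay, 0.0))
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. Real HTTP libraries often expose case-insensitive header mappings, so the exact lookup may need adjusting.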
How it works
The rate limiter uses a token bucket algorithm:
- Each API key starts with 100 tokens
- Each request consumes 1 token
- Tokens refill proportionally over the 60-second window
- If no tokens remain, the request is rejected with 429
The token refill is continuous — you don't have to wait for the full window to reset. If you've used 100 tokens and wait 6 seconds, you'll get ~10 tokens back.
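The behavior described above can be modeled in a few lines. This is an illustrative sketch of a generic token bucket using the documented numbers (100 tokens, 60-second window), not the service's actual implementation; the `TokenBucket` class, its method names, and the injectable `clock` are hypothetical.

```python
import time


class TokenBucket:
    """Illustrative bucket: `capacity` tokens refilled evenly over `window` seconds."""

    def __init__(self, capacity=100, window=60.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = capacity / window  # tokens added per second
        self.clock = clock
        self.tokens = float(capacity)  # each key starts with a full bucket
        self.updated_at = clock()

    def _refill(self):
        # Continuous refill: credit tokens for the time elapsed since the
        # last update, never exceeding the bucket capacity.
        now = self.clock()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now

    def try_consume(self):
        """Return True if a request is allowed, False if it would get a 429."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With an empty bucket, waiting 6 seconds credits 6 × (100 / 60) ≈ 10 tokens, which matches the example in the text.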
Best practices
- Use the SDK: the SDK caches flag configs in memory, so you only hit the API once per flag key (standard) or once per 60 seconds (edge)
- Check X-RateLimit-Remaining: monitor this header to detect when you're approaching the limit
- Respect Retry-After: when rate limited, wait the specified duration before retrying
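One way to monitor the remaining budget is to parse the rate limit headers from each response and flag when it runs low. The sketch below assumes the headers are available as a plain dict; the helper names `remaining_budget` and `approaching_limit`, and the 10% threshold, are hypothetical and not part of the API.

```python
def remaining_budget(headers):
    """Return (limit, remaining) parsed from the headers, or None if unavailable."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
    except (KeyError, ValueError):
        return None
    return limit, remaining


def approaching_limit(headers, threshold=0.1):
    """True when fewer than `threshold` of the allowed requests remain.

    The 10% default is an arbitrary illustrative value.
    """
    budget = remaining_budget(headers)
    if budget is None:
        return False  # no headers to go on, so do not throttle
    limit, remaining = budget
    return remaining < limit * threshold
```

A caller might slow down or batch requests whenever `approaching_limit` returns True, instead of waiting to receive a 429.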