
Rate Limits

Claudexia enforces rate limits to ensure fair usage and API stability. This page explains defaults, configuration, and how to handle 429 errors.

Default limits

New API keys start with these default limits:

Metric                 Default limit
Requests per minute    60
Tokens per minute      100,000
Concurrent requests    10
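
To stay under the default 60 requests per minute, a client can throttle itself before the API has to. The sketch below is one possible approach, not part of any Claudexia SDK: `SlidingWindowLimiter` and its methods are hypothetical names, and it assumes a sliding 60-second window, which may not match how the server actually counts requests. It does not track tokens per minute or concurrency.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: at most `max_requests` per `window` seconds.

    Hypothetical helper; the server's own windowing scheme is an assumption.
    """

    def __init__(self, max_requests=60, window=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock   # injectable so the logic can be tested
        self.sent = deque()  # timestamps of recently sent requests

    def delay_needed(self):
        """Seconds to wait before the next request may be sent (0 if none)."""
        now = self.clock()
        # Forget requests that have already left the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        return self.window - (now - self.sent[0])

    def record(self):
        """Call immediately after sending a request."""
        self.sent.append(self.clock())

    def wait(self):
        """Block until a request is allowed, then record it."""
        delay = self.delay_needed()
        if delay > 0:
            time.sleep(delay)
        self.record()
```

Calling `limiter.wait()` before each request keeps a single process under the configured rate. Several processes sharing one key would need a shared counter instead.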

Per-key configuration

Rate limits can be adjusted per key from Dashboard > API Keys.

Configurable limits:

  • Requests per minute (RPM)
  • Max tokens per request
  • Daily / weekly / monthly budget

Rate limit headers

Every API response includes rate limit headers:

Header                   Description
X-RateLimit-Limit        Maximum requests allowed in the current window
X-RateLimit-Remaining    Requests remaining in the current window
X-RateLimit-Reset        Time when the rate limit window resets (Unix timestamp)
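
As an illustration of reading these headers, the sketch below turns them into a small object and computes how long remains until the window resets. `RateLimitStatus` and `parse_rate_limit` are hypothetical names, and the code assumes each header holds a plain integer with the reset time in whole seconds; adjust the parsing if the API reports milliseconds or fractional values.

```python
import time
from dataclasses import dataclass


@dataclass
class RateLimitStatus:
    limit: int      # X-RateLimit-Limit
    remaining: int  # X-RateLimit-Remaining
    reset_at: int   # X-RateLimit-Reset (assumed: Unix timestamp in seconds)

    def seconds_until_reset(self, now=None):
        """Seconds until the window resets, never negative."""
        now = time.time() if now is None else now
        return max(0.0, self.reset_at - now)


def parse_rate_limit(headers):
    """Build a RateLimitStatus from a mapping of response headers.

    Raises KeyError if a header is missing and ValueError if a value
    is not an integer.
    """
    return RateLimitStatus(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=int(headers["X-RateLimit-Reset"]),
    )
```

With the `requests` library, `parse_rate_limit(response.headers)` works directly because its header mapping is case-insensitive.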

Handling 429 errors

When you hit a rate limit, the API returns HTTP 429. Use exponential backoff to retry:

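python
import random
import time

import requests

def request_with_backoff(url, headers, data, max_retries=5):
    """POST with exponential backoff and jitter on HTTP 429."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)
        if response.status_code != 429:
            return response
        if attempt < max_retries - 1:  # no point sleeping after the final attempt
            # Base delay of 1, 2, 4, 8 seconds, plus up to 1 s of random jitter
            wait = 2 ** attempt + random.uniform(0, 1)
            print(f"Rate limited. Retrying in {wait:.1f}s...")
            time.sleep(wait)
    raise RuntimeError("Max retries exceeded")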

typescript
async function requestWithBackoff(
  url: string, options: RequestInit, maxRetries = 5
): Promise<Response> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await fetch(url, options);
    if (res.status !== 429) return res;
    if (attempt < maxRetries - 1) { // skip the sleep after the final attempt
      // Base delay of 1, 2, 4, 8 seconds, plus up to 1 s of random jitter
      const wait = 2 ** attempt * 1000 + Math.random() * 1000;
      console.log(`Rate limited. Retrying in ${Math.round(wait)}ms...`);
      await new Promise(r => setTimeout(r, wait));
    }
  }
  throw new Error("Max retries exceeded");
}

Best practices

  • Check X-RateLimit-Remaining before sending bursts of requests.
  • Use separate keys for different workloads to isolate rate limits.
  • Implement exponential backoff with jitter for retries.
  • Set daily budgets on keys used by automated systems to prevent runaway costs.
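
One common way to add jitter to exponential backoff is the "full jitter" variant, sketched below: each delay is drawn uniformly between zero and a capped exponential ceiling, so clients that were limited at the same moment do not all retry together. The function name `backoff_delay` and the defaults (`base=1.0`, `cap=30.0`) are illustrative assumptions, not values recommended by Claudexia.

```python
import random


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random):
    """Delay in seconds before retry number `attempt` (0-based).

    Full jitter: pick uniformly between 0 and min(cap, base * 2**attempt).
    `rng` can be a seeded random.Random instance for reproducible tests.
    """
    ceiling = min(cap, base * 2 ** attempt)
    return rng.uniform(0, ceiling)
```

In a retry loop this would replace a fixed wait, for example `time.sleep(backoff_delay(attempt))`. The cap keeps late retries from waiting unboundedly long.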