Rate Limiting

Request limits, 429 response handling, and retry strategies for the Qamera AI API.

The Qamera AI API enforces rate limits to ensure fair usage and platform stability. Limits are applied per API key.

Default Limits

Limit	Value
Requests per minute	60
Window type	Sliding window

Each API key is allowed up to 60 requests per minute. This limit applies across all endpoints: GET and POST requests count toward the same quota.

Implementation

Rate limiting uses an in-memory sliding window counter. The window tracks requests over the most recent 60-second period and rejects new requests once the limit is reached.

Because the counter is held in memory, it resets whenever the API server restarts. Do not rely on this behavior: a future release is planned to use persistent, Redis-based rate limiting.
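
To make the sliding-window behavior concrete, here is a minimal sketch of one way such a limiter can work. It is an illustration only and not the Qamera AI server code; the names allow_request, TIMESTAMPS, LIMIT, and WINDOW are hypothetical, and it assumes the limiter keeps a log of accepted request times.

```shell
#!/bin/bash
# Illustrative sliding-window limiter. NOT the actual server implementation.
LIMIT=60        # requests allowed per window (value from Default Limits)
WINDOW=60       # window length in seconds
TIMESTAMPS=()   # in-memory log of accepted request times; lost on restart

# allow_request NOW: returns 0 if a request arriving at time NOW (epoch
# seconds) is accepted, or 1 if it would be rejected with a 429.
allow_request() {
  local now=$1 t
  local kept=()
  # Keep only the timestamps that still fall inside the window.
  for t in "${TIMESTAMPS[@]}"; do
    if (( now - t < WINDOW )); then
      kept+=("$t")
    fi
  done
  TIMESTAMPS=("${kept[@]}")
  # Reject once the window already holds LIMIT accepted requests.
  if (( ${#TIMESTAMPS[@]} >= LIMIT )); then
    return 1
  fi
  TIMESTAMPS+=("$now")
  return 0
}
```

In this sketch, a client that sends 60 requests in the same second is rejected on request 61 and stays blocked until 60 seconds after those requests, when they drop out of the window. Call allow_request directly rather than inside $(...), since the log lives in the current shell's memory.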

429 Response

When you exceed the rate limit, the API responds with HTTP status 429 Too Many Requests and the following JSON body:

{
  "error": "Too many requests. Please retry after a short delay."
}
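
A client needs both the status code and the body to react to this response. The sketch below shows one possible way to capture both from a single curl call. The helper names and the QAMERA_API_KEY environment variable are assumptions made for illustration; only the X-Api-Key header and the 429 status come from this page.

```shell
#!/bin/bash
# fetch_with_status URL: prints the response body, then the HTTP status
# code on its own final line. Reads the key from QAMERA_API_KEY, a
# hypothetical environment variable name.
fetch_with_status() {
  curl -s -w '\n%{http_code}' -H "X-Api-Key: ${QAMERA_API_KEY}" "$1"
}

# status_of OUTPUT: prints the status code (the last line of OUTPUT).
status_of() {
  printf '%s\n' "$1" | tail -n 1
}

# body_of OUTPUT: prints everything before the last line of OUTPUT.
body_of() {
  printf '%s\n' "$1" | sed '$d'
}

# is_rate_limited OUTPUT: succeeds only when the status code is 429.
is_rate_limited() {
  [ "$(status_of "$1")" = "429" ]
}
```

A typical use would be to store the result with out=$(fetch_with_status "$url"), test it with is_rate_limited "$out", and log body_of "$out" before deciding whether to retry.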

Retry Strategies

Exponential Backoff (Recommended)

When you receive a 429 response, wait before retrying, and double the delay after each consecutive 429:

Attempt 1: wait 1 second
Attempt 2: wait 2 seconds
Attempt 3: wait 4 seconds
Attempt 4: wait 8 seconds

Add a small random jitter (0–500ms) to each delay to avoid synchronized retries from multiple clients.
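
The schedule above, including jitter, can be computed with a small helper. This is a sketch under the assumption that the delay for retry n is 1000 * 2^(n-1) milliseconds plus 0 to 500 ms of jitter; the function names are hypothetical.

```shell
#!/bin/bash
# backoff_delay_ms ATTEMPT: prints the wait in milliseconds before retry
# number ATTEMPT (1-based): 1000 * 2^(ATTEMPT-1) plus 0-500 ms of jitter.
backoff_delay_ms() {
  local attempt=$1
  local base_ms=$(( 1000 * (1 << (attempt - 1)) ))
  local jitter_ms=$(( RANDOM % 501 ))   # RANDOM is a bash built-in (0-32767)
  echo $(( base_ms + jitter_ms ))
}

# ms_to_seconds MS: formats milliseconds as a decimal seconds string.
ms_to_seconds() {
  printf '%d.%03d\n' $(( $1 / 1000 )) $(( $1 % 1000 ))
}
```

To wait before the second retry, a script could run sleep "$(ms_to_seconds "$(backoff_delay_ms 2)")". Fractional arguments to sleep work with GNU and BSD sleep but are not guaranteed by POSIX.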

Simple Delay

For simpler implementations, wait a fixed 2 seconds after any 429 response before retrying.

Request Spreading

If your workload involves bursts of requests, spread them evenly across the minute window. For 60 requests per minute, aim for roughly one request per second.
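
One way to spread a batch is to pause for a fixed interval between calls. The sketch below derives that interval from the per-minute limit. The names interval_ms, send_paced, and PACE_SLEEP are hypothetical, and the handler stands in for whatever function issues your request.

```shell
#!/bin/bash
# interval_ms LIMIT: prints the gap in milliseconds that spreads LIMIT
# requests evenly over a 60-second window.
interval_ms() {
  echo $(( 60000 / $1 ))
}

# send_paced LIMIT HANDLER ITEM...: calls HANDLER once per ITEM, pausing
# between calls so that no more than LIMIT calls start per minute.
# PACE_SLEEP optionally names a replacement for sleep (useful in tests).
send_paced() {
  local limit=$1 handler=$2
  shift 2
  local gap_ms gap_s item first=1
  gap_ms=$(interval_ms "$limit")
  gap_s=$(printf '%d.%03d' $(( gap_ms / 1000 )) $(( gap_ms % 1000 )))
  for item in "$@"; do
    # Pause before every call except the first.
    if [ "$first" -eq 0 ]; then
      "${PACE_SLEEP:-sleep}" "$gap_s"
    fi
    first=0
    "$handler" "$item"
  done
}
```

For example, send_paced 60 upload_product a b c would call a hypothetical upload_product function three times with a one-second pause between calls. Choosing a LIMIT slightly below 60 leaves headroom for other clients that share the same API key.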

Example: Retry with Backoff

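#!/bin/bash
# Retry a request with exponential backoff while the API returns 429.
MAX_RETRIES=4
DELAY=1  # initial delay in seconds; doubles after each 429

for i in $(seq 1 "$MAX_RETRIES"); do
  # Capture only the HTTP status code; the response body is discarded.
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "X-Api-Key: mk_live_abc123.secretvalue" \
    https://app.qamera.ai/api/external/products)

  if [ "$STATUS" -ne 429 ]; then
    # Any status other than 429 ends the loop, including other errors.
    echo "Request completed with status $STATUS"
    exit 0
  fi

  # Skip the wait after the final attempt.
  if [ "$i" -lt "$MAX_RETRIES" ]; then
    echo "Rate limited. Retrying in about ${DELAY}s..."
    # Append 0.0-0.5 s of random jitter (needs a sleep that accepts fractions).
    sleep "${DELAY}.$((RANDOM % 6))"
    DELAY=$((DELAY * 2))
  fi
done

echo "Still rate limited after $MAX_RETRIES attempts." >&2
exit 1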

Future Changes

Persistent Redis-based rate limiting is planned. This will ensure limits survive server restarts and are enforced consistently across multiple server instances. The default limit of 60 requests per minute will remain the same.