
Error format

All errors follow the OpenAI format:
{
  "error": {
    "message": "Model 'forii/invalid-model' not found",
    "type": "invalid_request_error",
    "param": "model",
    "code": "model_not_found"
  }
}
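
Because every error shares this shape, client code can decode it once and branch on `code` or `type`. The following is a minimal sketch, assuming the body arrives as a JSON string; `ApiError` and `parse_error` are illustrative names, not part of any SDK.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    """Illustrative container for the fields of the "error" object."""
    message: str
    type: Optional[str] = None
    param: Optional[str] = None
    code: Optional[str] = None


def parse_error(body: str) -> ApiError:
    """Extract the "error" object from a JSON response body."""
    err = json.loads(body).get("error", {})
    return ApiError(
        message=err.get("message", ""),
        type=err.get("type"),
        param=err.get("param"),
        code=err.get("code"),
    )


sample = """
{
  "error": {
    "message": "Model 'forii/invalid-model' not found",
    "type": "invalid_request_error",
    "param": "model",
    "code": "model_not_found"
  }
}
"""
parsed = parse_error(sample)
print(parsed.code, parsed.param)  # model_not_found model
```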

Error codes

| Code | Meaning | Retry? | Notes |
|------|---------|--------|-------|
| 400 | Bad request (bad parameters, missing fields) | No | Check your request body |
| 401 | Invalid or missing API key | No | Verify your FORII_API_KEY |
| 402 | Payment required (quota exceeded) | No | Free Plan limit reached; wait for the next reset |
| 404 | Model not found | No | Check the model name in Models |
| 429 | Rate limit exceeded | Yes (backoff) | Includes Retry-After header |
| 500 | Internal server error | Yes | Retry with exponential backoff |
| 503 | Model temporarily unavailable | Yes | Retry after a brief wait |

The OpenAI SDK has built-in retry for 429, 500, and 503 errors, so if you're using the SDK, these retries happen automatically (by default twice, with exponential backoff; the Python SDK exposes this as the max_retries option).
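
If you are calling the API without the SDK, the retry column can be encoded directly. This is a rough sketch rather than a prescribed policy: the function names, the 0.5 second base, and the 8 second cap are all assumptions chosen for illustration, and it uses "full jitter" exponential backoff.

```python
import random

# Status codes marked as retryable in the error code table.
RETRYABLE_STATUS = {429, 500, 503}


def should_retry(status: int) -> bool:
    """True if the status code is one the table marks as retryable."""
    return status in RETRYABLE_STATUS


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff with full jitter; `attempt` starts at 0.

    The upper bound doubles on each attempt (base, 2*base, 4*base, ...)
    until it reaches `cap`; the actual delay is drawn uniformly below it.
    """
    return random.uniform(0, min(cap, base * 2 ** attempt))


print(should_retry(503), should_retry(402))
```

For 429 responses, prefer the server's Retry-After value over a computed delay when the header is present.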

Rate limits

Request-level limits

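| Plan | RPM | TPM (prompt) | TPM (completion) |
|------|-----|--------------|------------------|
| Free | 60 | 100K | 10K |
| Starter | 600 | 1M | 100K |
| Pro | 6,000 | 10M | 1M |

RPM is requests per minute; TPM is tokens per minute.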
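
One way to stay under the request limit on the client side is to space requests evenly. The sketch below derives the minimum average gap between requests from the RPM figures; `PLAN_RPM` and `min_interval_seconds` are illustrative names, and the sketch makes no assumption about how the server windows its counters.

```python
# RPM values copied from the request-level limits listed for each plan.
PLAN_RPM = {"Free": 60, "Starter": 600, "Pro": 6000}


def min_interval_seconds(plan: str) -> float:
    """Smallest average gap between requests that stays within the plan's RPM."""
    return 60.0 / PLAN_RPM[plan]


for plan in PLAN_RPM:
    print(plan, min_interval_seconds(plan))
```

Even spacing is conservative: it never bursts, so it should stay within the limit regardless of whether the window is fixed or sliding.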

Rate limit headers

Every response includes rate limit headers:
X-Ratelimit-Limit-Requests: 600
X-Ratelimit-Remaining-Requests: 543
X-Ratelimit-Reset: 1705312200

| Header | Description |
|--------|-------------|
| X-Ratelimit-Limit-Requests | Maximum RPM for your plan |
| X-Ratelimit-Remaining-Requests | Requests remaining in the current window |
| X-Ratelimit-Reset | Unix timestamp when the window resets |
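
These headers can be used to slow down before a 429 occurs. The following sketch assumes the headers are available as a plain string mapping; the function names are illustrative. Header names are matched case-insensitively, since HTTP clients differ in how they present casing.

```python
import time
from typing import Mapping, Optional


def _lower_keys(headers: Mapping[str, str]) -> dict:
    """Normalize header names so lookups do not depend on casing."""
    return {k.lower(): v for k, v in headers.items()}


def remaining_requests(headers: Mapping[str, str]) -> int:
    """Requests left in the current window."""
    return int(_lower_keys(headers)["x-ratelimit-remaining-requests"])


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Seconds until the window resets, never negative."""
    reset_at = float(_lower_keys(headers)["x-ratelimit-reset"])
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)


sample_headers = {
    "X-Ratelimit-Limit-Requests": "600",
    "X-Ratelimit-Remaining-Requests": "543",
    "X-Ratelimit-Reset": "1705312200",
}
remaining = remaining_requests(sample_headers)
# Pretend the current time is 60 seconds before the reset timestamp.
wait = seconds_until_reset(sample_headers, now=1705312140)
print(remaining, wait)  # 543 60.0
```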

Handling 429 errors

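import time

import openai

# `client` is assumed to be an openai.OpenAI instance already configured
# with your FORII_API_KEY and the Forii base URL.
request = dict(
    model="forii/deepseek-v3",
    messages=[{"role": "user", "content": "Hello"}],
)

try:
    response = client.chat.completions.create(**request)
except openai.RateLimitError as e:
    # Raised once the SDK's built-in retries are exhausted.
    # Honor the server's Retry-After hint; fall back to 5 seconds if absent.
    retry_after = float(e.response.headers.get("Retry-After", 5))
    time.sleep(retry_after)
    response = client.chat.completions.create(**request)  # Retry once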
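
The same pattern can be generalized into a bounded retry loop. The sketch below is deliberately SDK-independent so it runs on its own: `RateLimited` is a hypothetical stand-in for whatever exception your client raises on a 429 (with the OpenAI SDK you would catch `openai.RateLimitError` and read `e.response.headers` instead), and `call_with_retry` and `retry_after_seconds` are illustrative names.

```python
import time
from typing import Callable, Mapping, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for an SDK's 429 exception; carries response headers."""

    def __init__(self, headers: Mapping[str, str]):
        super().__init__("rate limited")
        self.headers = headers


def retry_after_seconds(headers: Mapping[str, str], default: float = 5.0) -> float:
    """Read Retry-After as seconds; use `default` if missing or not numeric
    (for example, when the server sends an HTTP date instead)."""
    try:
        return max(0.0, float(headers.get("Retry-After")))
    except (TypeError, ValueError):
        return default


def call_with_retry(fn: Callable[[], T], max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fn`, waiting and retrying on RateLimited up to `max_attempts` calls."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return fn()
        except RateLimited as e:
            if attempt >= max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            sleep(retry_after_seconds(e.headers))
            attempt += 1


# Usage: a fake request that is rate limited twice, then succeeds.
calls = {"n": 0}
slept = []


def fake_request() -> str:
    calls["n"] += 1
    if calls["n"] <= 2:
        raise RateLimited({"Retry-After": "2"})
    return "ok"


# `slept.append` replaces time.sleep so the demo records waits instead of pausing.
result = call_with_retry(fake_request, max_attempts=3, sleep=slept.append)
print(result, slept)  # ok [2.0, 2.0]
```

Passing `sleep` as a parameter keeps the loop testable without real delays.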

Response headers (coming soon)

These response headers are not yet available; they are planned for a future release.

| Header | Description |
|--------|-------------|
| x-forii-prompt-tokens | Prompt token count, for verification without parsing the body |
| x-forii-completion-tokens | Completion token count, for verification without parsing the body |
| x-forii-cached-tokens | Cache hit visibility (when prompt caching ships) |
| x-forii-ttft-ms | Time to first token, a key latency metric |
| x-forii-total-ms | Total request time |
| x-forii-model | Actual model served (resolves aliases) |
| x-forii-request-id | Request ID for debugging and tracing individual requests |
| x-forii-region | Which data center served the request |

x-forii-region is an India-specific addition. Forii runs in India under Indian jurisdiction, and developers need a way to verify that their requests are served locally rather than routed to US servers subject to the CLOUD Act.