> ## Documentation Index
> Fetch the complete documentation index at: https://docs.forii.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# FAQ

> Frequently asked questions about Forii

## General

### What is Forii?

Forii is India's Sovereign Inference Platform. Run any frontier model — DeepSeek-V3, LLaMA-4-Scout, Gemma-3, Qwen3 — on Indian infrastructure with full data sovereignty, independent of US providers. 30% lower cost, INR pricing.

### How is Forii different from OpenAI?

Three things: (1) Frontier models at 30% lower cost — DeepSeek-V3, LLaMA-4, Gemma-3, Qwen3, (2) Data sovereignty — Indian jurisdiction, no US CLOUD Act, no data routed through American servers, (3) INR pricing. You switch by changing `base_url` and `api_key`.

### How is Forii different from Fireworks?

Fireworks is US-based with USD pricing. Forii offers the same frontier models on Indian infrastructure with data sovereignty — no US CLOUD Act exposure. INR pricing, Razorpay payments. STT/TTS for 22 Indic languages coming soon.

### Is my data processed in India?

Yes. All inference runs in Indian data centers (Delhi NCR). No data leaves India, no US jurisdiction applies.

## API & Compatibility

### Do I need a Forii SDK?

No. Forii is OpenAI-compatible. Use the official OpenAI SDK and change two lines: `base_url` and `api_key`.

### Which frameworks work with Forii?

LangChain, LlamaIndex, Vercel AI SDK, LiteLLM — any framework that supports OpenAI's API format. See [Overview → Framework compatibility](/docs/concepts/overview#framework-compatibility).

### Does Forii support streaming?

Yes. Set `stream=True` in your request. Works identically to OpenAI's SSE streaming.

### Does Forii support structured outputs?

Yes. Use `response_format` with `json_object` or `json_schema`. See [Chat Completions → Structured outputs](/docs/api-reference/chat-completions#structured-outputs).

### Does Forii support function calling?

Yes. Use the `tools` parameter. See [Chat Completions → Function calling](/docs/api-reference/chat-completions#function-calling).

## Pricing & Plans

### How does pricing work?

Forii is on the **Free Plan** today. No credits, no payment — just sign up and start building. Per-minute rate limits apply (60 RPM, 100K prompt TPM, 10K completion TPM). See [Pricing](/docs/concepts/pricing) for model rates and roadmap tiers.

### What payment methods do you accept?

None yet. Paid tiers and payments (UPI, cards, netbanking, wallets via Razorpay) are on the roadmap.

### Can I get GST invoices?

Not yet. GST-compliant invoices ship with paid tiers. Today there is nothing to invoice.

### What happens if I hit a limit?

Per-minute limits reset automatically at the start of the next minute (UTC). Wait for the reset, or spread your load more evenly. If you consistently need more, paid tiers are coming.

## Models

### Which models are available?

See [Models](/docs/concepts/models) for the full catalog and pricing.

### Why are models quantized?

AWQ 4-bit quantization reduces memory usage by \~75% while keeping quality within 1–2% of FP16. This enables multi-model packing on a single GPU, which is how Forii achieves 30% lower COGS.

### Are quantized models less capable?

No. Every model passes evaluation benchmarks (MMLU, HumanEval, GSM8K, HellaSwag) before deployment. If quality regresses beyond the threshold, the model is rejected.

### What about Hindi quality?

Every model is also evaluated on MMMU-Hindi. If quantization degrades Hindi quality beyond 5%, the variant is rejected. See [Models → Quantization](/docs/concepts/models#quantization).

## Rate Limits

### What are the rate limits?

The Free Plan gives you 60 RPM, 100K prompt TPM, and 10K completion TPM. See [Authentication → Rate limits](/docs/concepts/authentication#rate-limits).

### What happens when I hit a rate limit?

You receive a `429 Too Many Requests` response with a `Retry-After` header. The OpenAI SDK retries automatically. Per-minute counters reset at the start of the next minute (UTC).

### Can I increase my rate limits?

Not yet. Paid tiers with higher limits (Starter 600 RPM, Pro 6,000 RPM, Enterprise custom) are on the roadmap.