Architecture - Forii — India's Sovereign Inference Platform

When you send a request to Forii, it goes through two layers today: authentication and inference. A billing layer ships when paid tiers launch.

Request flow

Usage tracking

Every request returns token counts in the response body (usage.prompt_tokens, usage.completion_tokens, usage.total_tokens). The dashboard surfaces these counts per model so you can see your token usage against your Free Plan limits.
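
As a rough illustration of reading those counts on the client side, the sketch below pulls the three `usage` fields out of a response body that has already been decoded from JSON into a Python dict. The `Usage` class and `extract_usage` helper are hypothetical names chosen for this example, not part of any Forii SDK, and the sample numbers are made up.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class Usage:
    """Token counts reported in the `usage` object of a response body."""
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


def extract_usage(response_body: Mapping[str, Any]) -> Usage:
    """Read the three token counts from a decoded JSON response body.

    Hypothetical helper: assumes the body contains a `usage` object with
    the field names listed in the text above.
    """
    usage = response_body["usage"]
    return Usage(
        prompt_tokens=int(usage["prompt_tokens"]),
        completion_tokens=int(usage["completion_tokens"]),
        total_tokens=int(usage["total_tokens"]),
    )


# Illustrative body only: the values are invented and every field other
# than `usage` is omitted.
sample_body = {
    "usage": {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}
}

usage = extract_usage(sample_body)
print(usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)
```

Summing `total_tokens` across responses gives a local running count that can be compared with the per-model figures on the dashboard.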

Data residency

All inference runs in Indian data centers (Delhi NCR). No data leaves India — your prompts, completions, and embeddings stay under Indian jurisdiction, with no US CLOUD Act exposure.

Roadmap — Billing

When paid tiers ship, the third layer (billing) will become active and each request will deduct credits at the model's price. Credits, UPI/card payments, auto-recharge, and GST invoices are planned, not live.

Related

Authentication — How API keys work
Pricing — Free Plan limits and roadmap
Errors & Rate Limits — Rate limit details
