When you send a request to Forii, it passes through two layers today: authentication and inference. A third layer, billing, will be added when paid tiers launch.

Request flow

[Figure: Request flow through API Gateway, Inference Server, and Response Pipeline]

Usage tracking

Every response includes token counts in its body (usage.prompt_tokens, usage.completion_tokens, usage.total_tokens). The dashboard surfaces these counts per model so you can track your token usage against your Free Plan limits.
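
If you want to keep your own tally alongside the dashboard, the usage fields can be read straight from the parsed response body. The following is a minimal sketch, not an official client: it assumes the body has already been decoded from JSON into a dict, that the three counts sit under a top-level "usage" key as named above, and that the body also carries a "model" key (an assumption, not confirmed here). The helper names and the sample model names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping

# Field names taken from the usage object described above.
USAGE_FIELDS = ("prompt_tokens", "completion_tokens", "total_tokens")


def extract_usage(response_body: Mapping) -> Dict[str, int]:
    """Return the three token counts from one parsed response body.

    Missing fields are reported as 0 rather than raising an error.
    """
    usage = response_body.get("usage") or {}
    return {field: int(usage.get(field, 0)) for field in USAGE_FIELDS}


def tally_by_model(response_bodies: Iterable[Mapping]) -> Dict[str, Dict[str, int]]:
    """Sum token counts per model across many response bodies."""
    totals = defaultdict(lambda: dict.fromkeys(USAGE_FIELDS, 0))
    for body in response_bodies:
        # Assumption: the body names the model under a "model" key.
        model = body.get("model", "unknown")
        for field, count in extract_usage(body).items():
            totals[model][field] += count
    return dict(totals)


# Hypothetical response bodies, for illustration only.
sample_bodies = [
    {"model": "example-chat", "usage": {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}},
    {"model": "example-chat", "usage": {"prompt_tokens": 8, "completion_tokens": 10, "total_tokens": 18}},
    {"model": "example-embed", "usage": {"prompt_tokens": 5, "completion_tokens": 0, "total_tokens": 5}},
]

per_model = tally_by_model(sample_bodies)
```

Summing locally like this gives a per-model view similar in spirit to the dashboard's. The dashboard remains the reference for usage against your plan limits.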

Data residency

All inference runs in Indian data centers (Delhi NCR). No data leaves India: your prompts, completions, and embeddings stay under Indian jurisdiction, with no US CLOUD Act exposure.

Roadmap — Billing

When paid tiers ship, the third layer (billing) will become active and each request will deduct credits at model pricing. Credits, UPI/card payments, auto-recharge, and GST invoices are planned but not yet live.