# Architecture

> How Forii processes your requests — auth, routing, and usage tracking

When you send a request to Forii, it passes through two layers today: authentication and inference. A third layer, billing, is planned for when paid tiers launch.

## Request flow

<img src="https://mintcdn.com/forii-docs/cNsW6DAG7qkXCpMH/docs/images/architecture-flow.svg?fit=max&auto=format&n=cNsW6DAG7qkXCpMH&q=85&s=fc3689b06c7f3066f51fc7825bf4a844" alt="Request flow through API Gateway, Inference Server, and Response Pipeline" width="820" height="520" data-path="docs/images/architecture-flow.svg" />

## Usage tracking

Every request returns token counts in the response body (`usage.prompt_tokens`, `usage.completion_tokens`, `usage.total_tokens`). The dashboard surfaces these counts per model so you can see your token usage against your Free Plan limits.
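To keep your own running tally alongside the dashboard, you can read these fields from each response. The following is a minimal sketch, not an official client: only the three `usage` field names come from this page. The sample body, the `read_usage` and `record_usage` helpers, and the `example-model` name are hypothetical and used for illustration only.

```python
import json

# Hypothetical response body. Only the `usage.*` field names are documented;
# the rest of the real response shape is not shown here.
SAMPLE_BODY = """
{
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 30,
    "total_tokens": 42
  }
}
"""

USAGE_FIELDS = ("prompt_tokens", "completion_tokens", "total_tokens")


def read_usage(body: str) -> dict:
    """Extract the three token counts from a JSON response body.

    Missing fields default to 0 so a malformed body does not raise KeyError.
    """
    usage = json.loads(body).get("usage", {})
    return {field: int(usage.get(field, 0)) for field in USAGE_FIELDS}


def record_usage(totals: dict, model: str, usage: dict) -> None:
    """Add one request's token counts to a per-model running total (in place)."""
    bucket = totals.setdefault(model, {field: 0 for field in USAGE_FIELDS})
    for field in USAGE_FIELDS:
        bucket[field] += usage[field]


if __name__ == "__main__":
    totals = {}
    # "example-model" is a placeholder; use the model name you sent in the request.
    record_usage(totals, "example-model", read_usage(SAMPLE_BODY))
    print(totals)
```

Keeping totals per model mirrors how the dashboard groups usage, which makes it easier to compare your local numbers against it.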

## Data residency

All inference runs in Indian data centers (Delhi NCR). No data leaves India — your prompts, completions, and embeddings stay under Indian jurisdiction, with no US CLOUD Act exposure.

## Roadmap — Billing

When paid tiers ship, the third layer (billing) will become active and each request will deduct credits at model pricing. Credits, UPI/card payments, auto-recharge, and GST invoices are planned but not yet live.

## Related

* [Authentication](/docs/concepts/authentication) — How API keys work
* [Pricing](/docs/concepts/pricing) — Free Plan limits and roadmap
* [Errors & Rate Limits](/docs/api-reference/errors) — Rate limit details
