Forii is an OpenAI-compatible inference API for running frontier models on Indian infrastructure, independent of US cloud providers. It offers pricing 30% below self-deployment cost, full data sovereignty, and billing in INR.

How it works

Forii request flow: your app → API gateway → inference → response pipeline

Available now

Capability | Endpoint | Description
Chat completions | POST /inference/v1/chat/completions | Basic, streaming, structured outputs, function calling
Embeddings | POST /inference/v1/embeddings | Semantic search over Hindi and English documents
Models | GET /inference/v1/models | List available models
Account & usage | GET/POST /v1/accounts/{id}/... | API keys, balance, usage queries
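
To make the endpoint table concrete, here is a minimal sketch that builds (but does not send) raw HTTP requests for the three inference endpoints using only the Python standard library. It assumes the base URL https://api.forii.in/inference/v1 shown in the OpenAI compatibility section and OpenAI-style Bearer authentication; the helper names and the embedding model ID passed by the caller are illustrative, not part of Forii's documented API.

```python
import json
import urllib.request

# Base URL taken from the SDK example in the OpenAI compatibility section.
BASE_URL = "https://api.forii.in/inference/v1"


def build_request(method, path, api_key, payload=None):
    """Build (but do not send) a request for an inference endpoint."""
    # Assumption: Bearer-token auth, as in the OpenAI API.
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def chat_request(api_key, prompt, model="forii/deepseek-v3"):
    """POST /inference/v1/chat/completions with a single user message."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return build_request("POST", "/chat/completions", api_key, payload)


def embeddings_request(api_key, texts, model):
    """POST /inference/v1/embeddings; `model` is whatever ID GET /models lists."""
    return build_request("POST", "/embeddings", api_key, {"model": model, "input": texts})


def models_request(api_key):
    """GET /inference/v1/models."""
    return build_request("GET", "/models", api_key)
```

Passing any of these objects to urllib.request.urlopen would send the request; keeping construction separate from sending makes the URL, method, and body easy to inspect first.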

Coming soon

Capability | Description
Speech-to-text | 22 Indic languages, 8 kHz telephony
Text-to-speech | 30+ voices across Hindi, Tamil, Telugu, Bengali
Vision | Image understanding: Aadhaar, PAN, invoice OCR
Reranking | Cross-encoder reranking for RAG
Batch inference | Asynchronous large-scale processing
Fine-tuning | SFT on Indic-language data
Deployments | Dedicated GPU with autoscaling

OpenAI compatibility

Forii matches the OpenAI API format exactly. Every existing tutorial, framework, and SDK works by changing two lines:
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.forii.in/inference/v1",  # ← change this
    api_key=os.environ["FORII_API_KEY"],            # ← and this
)

# Everything else is identical to OpenAI
response = client.chat.completions.create(
    model="forii/deepseek-v3",
    messages=[{"role": "user", "content": "Hello"}],
)
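
Streaming is listed among the chat completion features, so a short sketch of consuming a streamed response may help. It assumes Forii streams chunks in the same shape as the OpenAI SDK (each chunk exposing choices[0].delta.content); the function names collect_stream and stream_chat are illustrative.

```python
import os


def collect_stream(chunks):
    """Join the text deltas from an iterable of streamed chat chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # skip None or empty deltas (e.g. role-only chunks)
            parts.append(delta)
    return "".join(parts)


def stream_chat(prompt, model="forii/deepseek-v3"):
    """Send one user message with stream=True and return the full reply text."""
    from openai import OpenAI  # imported lazily so collect_stream works without the SDK

    client = OpenAI(
        base_url="https://api.forii.in/inference/v1",
        api_key=os.environ["FORII_API_KEY"],
    )
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return collect_stream(stream)
```

In an interactive application you would typically print each delta as it arrives rather than joining them at the end; the loop body is the same.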

Framework compatibility

Framework | Works? | How
OpenAI Python SDK | Yes | base_url + api_key
OpenAI JavaScript SDK | Yes | baseURL + apiKey
LangChain | Yes | ChatOpenAI with Forii base_url
LlamaIndex | Yes | OpenAI with Forii base_url
Vercel AI SDK | Yes | createOpenAI with Forii baseURL
LiteLLM | Yes | Add Forii as a custom provider
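
As one illustration of the table above, the LangChain row might look like the following sketch. It assumes the langchain-openai package is installed and that its ChatOpenAI class accepts base_url, api_key, and model keyword arguments; the helper names forii_kwargs and make_langchain_chat are hypothetical.

```python
import os

FORII_BASE_URL = "https://api.forii.in/inference/v1"


def forii_kwargs(model="forii/deepseek-v3"):
    """Connection settings shared by OpenAI-compatible Python clients."""
    return {
        "base_url": FORII_BASE_URL,
        # Empty string if the variable is unset, so the dict can still be built.
        "api_key": os.environ.get("FORII_API_KEY", ""),
        "model": model,
    }


def make_langchain_chat(model="forii/deepseek-v3"):
    """Return a LangChain chat model pointed at Forii."""
    from langchain_openai import ChatOpenAI  # lazy import: optional dependency

    return ChatOpenAI(**forii_kwargs(model))
```

Calling make_langchain_chat().invoke("Hello") would then send the request through Forii instead of OpenAI.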

India-first design

  • Frontier models, lower cost: DeepSeek-V3, LLaMA-4-Scout, Gemma-3, and Qwen3 at 30% below self-deployment cost.
  • Data sovereignty: Data stays in Indian data centers under Indian jurisdiction. No US routing, no CLOUD Act exposure, no foreign subpoenas.
  • Low latency: Requests are served from Delhi NCR, with ~20-50 ms time to first token (TTFT) vs 200-400 ms from US providers.
  • INR pricing: All costs are in rupees. No USD invoices, no FX conversion.
  • Free Plan today, paid tiers coming: Start free with no card. UPI payments, GST invoices, and higher limits ship with paid tiers.
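
The TTFT figure quoted above can be checked from your own network with a small timing helper. The sketch below is one way to do it, assuming OpenAI-style streamed chunks; measure_ttft and its parameters are illustrative names, and the result includes the network round trip from wherever the code runs.

```python
import time


def measure_ttft(start_stream, clock=time.perf_counter):
    """Return seconds from issuing the request to the first content-bearing chunk.

    start_stream: zero-argument callable that issues the request and returns
                  an iterable of streamed chunks.
    clock:        timer function, injectable for testing.
    Returns None if the stream ends without producing any content.
    """
    started = clock()  # start timing before the request is sent
    for chunk in start_stream():
        if chunk.choices and chunk.choices[0].delta.content:
            return clock() - started
    return None
```

With the OpenAI SDK client from the compatibility section, a call might look like measure_ttft(lambda: client.chat.completions.create(model="forii/deepseek-v3", messages=[{"role": "user", "content": "Hello"}], stream=True)). Averaging several runs gives a steadier number than a single measurement.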

Next steps