Embeddings - Forii — India's Sovereign Inference Platform

Generate vector embeddings for text. Power RAG pipelines, semantic search, and document retrieval from Indian data centers.

POST https://api.forii.in/inference/v1/embeddings

Request

Python

# `client` is assumed to be an already-configured Forii API client; its setup is not shown in this section.
response = client.embeddings.create(
    model="forii/embed-v3",
    input=[
        "What is the GST rate for textiles?",
        "भारत में टेक्सटाइल का GST दर क्या है?"
    ]
)

print(len(response.data[0].embedding))  # 1024
print(response.usage)  # {"prompt_tokens": 20, "total_tokens": 20}

JavaScript

// `client` is assumed to be an already-configured Forii API client; its setup is not shown in this section.
const response = await client.embeddings.create({
  model: "forii/embed-v3",
  input: [
    "What is the GST rate for textiles?",
    "भारत में टेक्सटाइल का GST दर क्या है?",
  ],
});

console.log(response.data[0].embedding.length); // 1024
console.log(response.usage); // { prompt_tokens: 20, total_tokens: 20 }

cURL

curl https://api.forii.in/inference/v1/embeddings \
  -H "Authorization: Bearer $FORII_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "forii/embed-v3",
    "input": [
      "What is the GST rate for textiles?",
      "भारत में टेक्सटाइल का GST दर क्या है?"
    ]
  }'

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0091, 0.0147, ...]
    },
    {
      "object": "embedding",
      "index": 1,
      "embedding": [0.0041, -0.0078, 0.0082, ...]
    }
  ],
  "model": "forii/embed-v3",
  "usage": {
    "prompt_tokens": 20,
    "total_tokens": 20
  }
}
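
Each item in `data` carries an `index` field. The following is a minimal sketch of pairing inputs with their vectors, assuming `index` is the position of the corresponding text in the `input` array (the sample response reads that way, but this page does not state it explicitly). `pair_inputs_with_embeddings` is a hypothetical helper, and the dict is a hand-written stand-in for a parsed response with vectors shortened to three values.

```python
def pair_inputs_with_embeddings(inputs, response_body):
    """Return (text, vector) pairs, matched by each item's `index` field."""
    # Sort defensively in case items arrive out of order.
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [(inputs[item["index"]], item["embedding"]) for item in items]


# Illustrative stand-in for a parsed JSON response; real vectors are much longer.
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.0041, -0.0078, 0.0082]},
        {"object": "embedding", "index": 0, "embedding": [0.0023, -0.0091, 0.0147]},
    ],
    "model": "forii/embed-v3",
    "usage": {"prompt_tokens": 20, "total_tokens": 20},
}
inputs = [
    "What is the GST rate for textiles?",
    "भारत में टेक्सटाइल का GST दर क्या है?",
]

pairs = pair_inputs_with_embeddings(inputs, sample_response)
for text, vector in pairs:
    print(text, len(vector))
```

Matching on `index` rather than on list position keeps the pairing correct even if the items were returned in a different order.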

Parameters

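| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `model` | string | Yes | none | Embedding model ID: `forii/embed-v3` |
| `input` | string or array | Yes | none | Text or array of texts to embed |
| `dimensions` | integer | No | 1024 | Number of dimensions in each output embedding (variable-length embeddings) |
| `encoding_format` | string | No | `float` | Encoding of the returned embeddings: `float` or `base64` |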
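
This page does not specify how `base64` output is laid out. The sketch below shows one way to decode it, assuming the string is the base64 encoding of packed little-endian 32-bit floats. That is a convention used by some other embedding APIs and is not confirmed for Forii. `decode_base64_embedding` is a hypothetical helper, and the round trip uses locally packed values rather than a real response.

```python
import base64
import struct


def decode_base64_embedding(encoded: str) -> list[float]:
    """Decode a base64 string into floats, ASSUMING packed little-endian float32."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round trip with locally packed values (all exactly representable in float32).
values = [0.5, -0.25, 1.0]
encoded = base64.b64encode(struct.pack("<3f", *values)).decode("ascii")
decoded = decode_base64_embedding(encoded)
print(decoded)
```

If the actual byte layout differs (for example, 64-bit floats or big-endian order), only the `struct` format string would need to change.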

Pricing

| Model | Price per 1K tokens |
| --- | --- |
| `forii/embed-v3` | ₹0.003 |

At ₹0.003 per 1K tokens, embedding 1 million tokens costs ₹3, which keeps bulk document indexing affordable for RAG pipelines.
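
The arithmetic can be sketched as a small estimator. It assumes billing is strictly linear in `total_tokens`, with no minimum charge or rounding, which this page does not confirm. `estimate_cost_inr` is a hypothetical helper, and `Decimal` is used to avoid floating-point drift on small amounts.

```python
from decimal import Decimal

# Price for forii/embed-v3, taken from the pricing table.
PRICE_PER_1K_TOKENS_INR = Decimal("0.003")


def estimate_cost_inr(total_tokens: int) -> Decimal:
    """Estimated cost in rupees, ASSUMING simple linear per-token billing."""
    return PRICE_PER_1K_TOKENS_INR * Decimal(total_tokens) / Decimal(1000)


print(estimate_cost_inr(1_000_000))  # cost of 1 million tokens
print(estimate_cost_inr(20))         # cost of the 20-token sample request
```

Passing the `total_tokens` value from a response's `usage` object gives the estimate for that request.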

Use cases

RAG (Retrieval-Augmented Generation): Index documents in Hindi and English, then retrieve relevant context for chat completions
Semantic search: Find similar documents across languages
Clustering: Group similar content by embedding distance
Deduplication: Detect near-duplicate documents
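
Semantic search and deduplication both come down to comparing vectors. The sketch below ranks documents by cosine similarity, a common choice that this page does not prescribe. `cosine_similarity` and `rank_by_similarity` are hypothetical helpers, and the three-dimensional vectors are toy stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, documents):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings.
documents = {
    "gst_textiles": [1.0, 0.0, 0.0],
    "gst_footwear": [0.6, 0.8, 0.0],
    "cricket_scores": [0.0, 0.0, 1.0],
}
query = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query, documents)
print(ranking)
```

For deduplication, the same score could be compared against a similarity threshold. A suitable threshold is not given here and would have to be chosen empirically for the data at hand.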

Related

Chat Completions: Generate text, using context retrieved via embeddings
Models: Browse available models
