Vision - Forii — India's Sovereign Inference Platform

This endpoint is not yet available. It is planned for a future release.

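The planned Vision endpoint accepts images alongside text in a single request, for tasks such as document digitization, ID card reading, and invoice OCR. It is intended to be powered by vision-language models such as Qwen-VL.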

Planned usage

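# Planned request shape. `client` is assumed to be an already-configured,
# OpenAI-compatible SDK client; its setup is not shown here.
response = client.chat.completions.create(
    model="forii/qwen2.5-vl-72b",
    messages=[{
        "role": "user",
        "content": [
            # Text instruction for the model
            {"type": "text", "text": "Extract all details from this Aadhaar card"},
            # Image passed inline as a base64 data URL (truncated here)
            {"type": "image_url", "image_url": {
                "url": "data:image/jpeg;base64,/9j/4AAQ...",
                "detail": "high"
            }}
        ]
    }],
)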
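The request above passes the image inline as a base64 data URL. A minimal sketch of how such a URL and the surrounding message list could be built from raw image bytes is shown here. The helper names `to_data_url` and `build_vision_messages` are hypothetical and not part of any Forii SDK. Since the endpoint is not yet available, this only constructs the payload and has not been run against the service.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_vision_messages(prompt: str, image_bytes: bytes,
                          mime_type: str = "image/jpeg",
                          detail: str = "high") -> list:
    """Build a message list in the shape used by the planned usage example.

    Assumption: the final API keeps the text + image_url content parts
    shown in the planned usage snippet.
    """
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {
                "url": to_data_url(image_bytes, mime_type),
                "detail": detail,
            }},
        ],
    }]


# Stand-in for real file contents: the first six bytes of a JPEG/JFIF file.
jpeg_header = b"\xff\xd8\xff\xe0\x00\x10"
messages = build_vision_messages("Extract all details from this invoice", jpeg_header)
print(messages[0]["content"][1]["image_url"]["url"])
# prints: data:image/jpeg;base64,/9j/4AAQ
```

In practice the bytes would come from reading an image file in binary mode, and the resulting list would be passed as the `messages` argument.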

India use cases

Aadhaar / PAN card extraction — Parse ID documents into structured data
GST invoice parsing — Extract line items, totals, GSTIN from invoices
Handwritten form digitization — Convert Hindi handwritten forms to structured JSON
Screenshot understanding — Debug UI issues from screenshots
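For the ID and invoice use cases above, extracted identifiers usually benefit from a basic shape check before they are stored. The sketch below is illustrative only: the field names (`aadhaar`, `pan`, `gstin`) and the function `check_extracted_ids` are hypothetical, and the patterns check only the commonly documented layout of each identifier (12 digits for Aadhaar, 10 characters for PAN, 15 characters for GSTIN). They do not verify checksums or confirm that an identifier was actually issued.

```python
import re

# Layout-only patterns (assumptions; no checksum validation is attempted).
PATTERNS = {
    "aadhaar": re.compile(r"[0-9]{12}"),
    "pan": re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]"),
    "gstin": re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]"),
}


def normalize(value: str) -> str:
    """Drop spaces and hyphens and uppercase, since OCR output varies."""
    return re.sub(r"[\s-]", "", value).upper()


def check_extracted_ids(fields: dict) -> dict:
    """Return {field_name: True/False} for each known identifier present."""
    results = {}
    for name, pattern in PATTERNS.items():
        if name in fields:
            results[name] = bool(pattern.fullmatch(normalize(fields[name])))
    return results


# Dummy values for illustration, not real identifiers.
extracted = {
    "aadhaar": "1234 5678 9012",
    "pan": "aaaaa0000a",
    "gstin": "22AAAAA0000A1Z",  # one character short, so it fails the check
}
print(check_extracted_ids(extracted))
```

A failed check is a signal to re-run extraction or route the document for manual review, not proof that the document itself is invalid.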

Chat Completions — Text-only inference
Structured Outputs — Guaranteed JSON from vision responses
