> ## Documentation Index
> Fetch the complete documentation index at: https://docs.forii.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Batch Inference

> Large-scale offline processing — Coming Soon

<Warning>
  This endpoint is not yet available. It is planned for a future release.
</Warning>

Process large volumes of requests asynchronously — benchmarking, document backlogs, bulk embeddings. Matches OpenAI's batch API format.

## Planned endpoints

* `POST /inference/v1/batch` — Create a batch job
* `GET /inference/v1/batch/{id}` — Check batch status and retrieve results

## How it will work

1. Upload a JSONL file where each line is a request: `{"custom_id": "...", "body": {...}}`
2. Forii processes all requests asynchronously
3. Retrieve results when the batch completes

<Info>
  Batch pricing will be 50% lower than real-time inference, matching industry standard.
</Info>

## India use cases

* **Document digitization backlogs** — Process millions of Hindi/regional-language documents
* **Bulk embeddings** — Generate embeddings for existing document collections
* **Model evaluation** — Benchmark models on custom datasets

## Related

* [Chat Completions](/docs/api-reference/chat-completions) — Real-time inference
* [Roadmap](/docs/support/roadmap) — feature timeline