These endpoints are not yet available. They are planned for a future release.
India’s 800M+ voice-first internet users need speech-to-text (STT) and text-to-speech (TTS) in their own languages. Forii will offer both, powered by Indic-optimized models.
Speech-to-Text (STT)
POST https://api.forii.in/inference/v1/audio/transcriptions
The request format is OpenAI-compatible. The endpoint will support 22+ Indic languages and 8 kHz telephony audio.
Planned parameters
| Parameter | Type | Description |
|---|---|---|
| file | file | Audio file (MP3, WAV, FLAC, OGG) |
| model | string | forii/saarika-v2 |
| language | string | Language code (e.g. hi, ta, bn) |
| response_format | string | json, verbose_json, text, srt, vtt |
| timestamp_granularities[] | array | word, segment |
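As a rough illustration of how the planned parameters map onto a multipart form, the sketch below builds the non-file fields and rejects values that are not in the table. The helper name `build_transcription_fields` and the default `response_format` of `json` are assumptions made for this example, not part of the Forii API; the allowed values are copied from the table and may change before release.

```python
from typing import List, Optional, Tuple

# Values copied from the planned-parameters table; the final API may differ.
STT_RESPONSE_FORMATS = {"json", "verbose_json", "text", "srt", "vtt"}
STT_GRANULARITIES = {"word", "segment"}


def build_transcription_fields(
    language: str,
    model: str = "forii/saarika-v2",
    response_format: str = "json",  # assumed default, not confirmed by the docs
    timestamp_granularities: Optional[List[str]] = None,
) -> List[Tuple[str, str]]:
    """Return the non-file multipart form fields as (name, value) pairs."""
    if response_format not in STT_RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    granularities = timestamp_granularities or []
    unknown = [g for g in granularities if g not in STT_GRANULARITIES]
    if unknown:
        raise ValueError(f"unsupported timestamp_granularities: {unknown}")

    fields = [
        ("model", model),
        ("language", language),
        ("response_format", response_format),
    ]
    # Array parameters are sent as one repeated "name[]" field per value.
    fields += [("timestamp_granularities[]", g) for g in granularities]
    return fields


fields = build_transcription_fields(
    "hi", response_format="verbose_json", timestamp_granularities=["word"]
)
print(fields)
```

A list of pairs is used rather than a dict so that an array parameter can repeat the same field name once per value.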
Example (planned)
curl https://api.forii.in/inference/v1/audio/transcriptions \
  -H "Authorization: Bearer $FORII_API_KEY" \
  -F file="@recording_hindi.mp3" \
  -F model="forii/saarika-v2" \
  -F language="hi" \
  -F response_format="verbose_json" \
  -F "timestamp_granularities[]=word"
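The request above asks for `verbose_json` with word-level timestamps. In OpenAI's transcription API, that combination returns a `words` array whose entries carry `word`, `start`, and `end` (in seconds). Assuming Forii mirrors that shape, which this page does not confirm, a response could be post-processed roughly as follows. The sample body, its transcript, and its timings are invented for illustration, and `word_timings` is a hypothetical helper.

```python
import json
from typing import Dict, List

# Hypothetical response body shaped like OpenAI's verbose_json output.
# The transcript and timings are illustrative, not real Forii output.
SAMPLE_RESPONSE = """
{
  "text": "नमस्ते दुनिया",
  "words": [
    {"word": "नमस्ते", "start": 0.0, "end": 0.62},
    {"word": "दुनिया", "start": 0.7, "end": 1.25}
  ]
}
"""


def word_timings(body: str) -> List[Dict[str, object]]:
    """Extract word-level timings and add each word's duration in milliseconds."""
    payload = json.loads(body)
    timings = []
    # "words" is treated as optional: it is only expected when word
    # granularity was requested.
    for entry in payload.get("words", []):
        timings.append({
            "word": entry["word"],
            "start": entry["start"],
            "end": entry["end"],
            "duration_ms": round((entry["end"] - entry["start"]) * 1000),
        })
    return timings


timings = word_timings(SAMPLE_RESPONSE)
for t in timings:
    print(f'{t["start"]:6.2f}s  {t["word"]}  ({t["duration_ms"]} ms)')
```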
Text-to-Speech (TTS)
POST https://api.forii.in/inference/v1/audio/speech
Planned parameters
| Parameter | Type | Description |
|---|---|---|
| model | string | forii/bulbul-v2 |
| input | string | Text to synthesize |
| voice | string | Voice name (e.g. amrita, pratik) |
| response_format | string | mp3, opus, aac, flac, wav, pcm |
| speed | float | 0.25 to 4.0 |
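The constraints in this table can be checked on the client before a request is sent. The sketch below assembles a JSON body and validates `response_format` and `speed` against the listed values. The helper name `build_speech_payload` and the defaults (`mp3`, speed `1.0`) are assumptions borrowed from OpenAI's conventions, not confirmed Forii behavior.

```python
import json
from typing import Dict, Union

# Values copied from the planned-parameters table; the final API may differ.
TTS_RESPONSE_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def build_speech_payload(
    text: str,
    voice: str,
    model: str = "forii/bulbul-v2",
    response_format: str = "mp3",  # assumed default
    speed: float = 1.0,            # assumed default
) -> Dict[str, Union[str, float]]:
    """Validate the planned TTS parameters and return the request body."""
    if not text.strip():
        raise ValueError("input text must not be empty")
    if response_format not in TTS_RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }


payload = build_speech_payload("नमस्ते", voice="amrita", speed=1.25)
# ensure_ascii=False keeps the Devanagari text readable in the printed JSON.
print(json.dumps(payload, ensure_ascii=False))
```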
Example (planned)
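import os
from openai import OpenAI

# Assumes the OpenAI Python SDK pointed at Forii's planned base URL.
client = OpenAI(base_url="https://api.forii.in/inference/v1", api_key=os.environ["FORII_API_KEY"])

response = client.audio.speech.create(
    model="forii/bulbul-v2",
    voice="amrita",
    input="नमस्ते, आपका खाता शेष रुपये पाँच हजार है।",  # "Hello, your account balance is five thousand rupees."
)
response.stream_to_file("output_hindi.mp3")  # write the returned audio to disk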
India differentiation
Indic speech support is planned as Forii’s biggest differentiator versus Fireworks and OpenAI: neither offers production-grade STT/TTS for Indic languages.
- Saarika v2: STT for 22+ Indian languages, 8kHz telephony audio optimization
- Bulbul v2: TTS with 30+ voices across Hindi, Tamil, Telugu, Bengali, and more
- Voice-first India: IVR systems, WhatsApp bots, accessibility features