These endpoints are not yet available. They are planned for a future release.
India’s 800M+ voice-first internet users need speech-to-text (STT) and text-to-speech (TTS) in their own languages. Forii will offer both, powered by Indic-optimized models.

Speech-to-Text (STT)

POST https://api.forii.in/inference/v1/audio/transcriptions
OpenAI-compatible request format. Planned support covers 22+ Indic languages and 8 kHz telephony audio.

Planned parameters

| Parameter | Type | Description |
| --- | --- | --- |
| file | file | Audio file (MP3, WAV, FLAC, OGG) |
| model | string | forii/saarika-v2 |
| language | string | Language code (e.g. hi, ta, bn) |
| response_format | string | json, verbose_json, text, srt, vtt |
| timestamp_granularities[] | array | word, segment |
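
A client could check the planned values before uploading audio. The following is a minimal sketch, not part of any Forii SDK: `build_stt_fields` and the two constant sets are hypothetical names, the allowed values are copied from the table above and may change before release, and the `json` default for `response_format` is an assumption.

```python
# Allowed values copied from the planned-parameter table; they may change before release.
STT_RESPONSE_FORMATS = {"json", "verbose_json", "text", "srt", "vtt"}
STT_GRANULARITIES = {"word", "segment"}


def build_stt_fields(language, response_format="json", granularities=(),
                     model="forii/saarika-v2"):
    """Return the non-file form fields as a list of (name, value) pairs.

    Hypothetical helper: it only validates client-side and sends nothing.
    """
    if response_format not in STT_RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    unknown = set(granularities) - STT_GRANULARITIES
    if unknown:
        raise ValueError(f"unsupported timestamp granularities: {sorted(unknown)}")
    fields = [
        ("model", model),
        ("language", language),
        ("response_format", response_format),
    ]
    # The array parameter becomes one repeated "timestamp_granularities[]" field per value.
    fields += [("timestamp_granularities[]", g) for g in granularities]
    return fields


fields = build_stt_fields("hi", response_format="verbose_json", granularities=["word"])
```

A list of pairs is used instead of a dict so that the array parameter can appear more than once, for example when both `word` and `segment` are requested.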

Example (planned)

curl https://api.forii.in/inference/v1/audio/transcriptions \
  -H "Authorization: Bearer $FORII_API_KEY" \
  -F file="@recording_hindi.mp3" \
  -F model="forii/saarika-v2" \
  -F language="hi" \
  -F response_format="verbose_json" \
  -F "timestamp_granularities[]=word"
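
The same request could be built in Python with only the standard library. This is a sketch under assumptions: `build_transcription_request` is a hypothetical helper, the endpoint is assumed to accept ordinary `multipart/form-data` as the curl example implies, and the audio bytes and API key are placeholders. The request is constructed but never sent, since the endpoint is not yet available.

```python
import urllib.request
import uuid

STT_URL = "https://api.forii.in/inference/v1/audio/transcriptions"


def build_transcription_request(audio_bytes, filename, fields, api_key):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    parts = []
    # One part per plain form field, in the order given.
    for name, value in fields:
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # The audio file part carries a filename and a generic content type.
    file_header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: application/octet-stream\r\n\r\n'
    )
    parts.append(file_header.encode("utf-8") + audio_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        STT_URL,
        data=b"".join(parts),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )


fields = [
    ("model", "forii/saarika-v2"),
    ("language", "hi"),
    ("response_format", "verbose_json"),
    ("timestamp_granularities[]", "word"),
]
# Placeholder bytes and key; real code would read the audio file and the key from the environment.
request = build_transcription_request(b"<audio bytes>", "recording_hindi.mp3", fields, "YOUR_API_KEY")
```

Once the endpoint is released, passing `request` to `urllib.request.urlopen` would send it.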

Text-to-Speech (TTS)

POST https://api.forii.in/inference/v1/audio/speech

Planned parameters

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | forii/bulbul-v2 |
| input | string | Text to synthesize |
| voice | string | Voice name (e.g. amrita, pratik) |
| response_format | string | mp3, opus, aac, flac, wav, pcm |
| speed | float | 0.25 to 4.0 |
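
As an illustration of how these parameters fit together, the sketch below assembles and validates a request body. It rests on several assumptions: `build_speech_payload` is a hypothetical helper, the endpoint is assumed to take a JSON body as OpenAI's speech endpoint does, the `speed` bounds are treated as inclusive, and the `mp3` and `1.0` defaults are guesses rather than documented behavior.

```python
import json

# Allowed values copied from the planned-parameter table; they may change before release.
TTS_RESPONSE_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
SPEED_MIN, SPEED_MAX = 0.25, 4.0  # assumed inclusive


def build_speech_payload(text, voice, response_format="mp3", speed=1.0,
                         model="forii/bulbul-v2"):
    """Return the request body as a dict after client-side validation."""
    if not text:
        raise ValueError("input text must not be empty")
    if response_format not in TTS_RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not SPEED_MIN <= speed <= SPEED_MAX:
        raise ValueError(f"speed must be between {SPEED_MIN} and {SPEED_MAX}")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }


payload = build_speech_payload("नमस्ते", voice="pratik", response_format="wav", speed=1.25)
# ensure_ascii=False keeps Devanagari text readable instead of \u escapes.
body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
```

Encoding the body as UTF-8 matters for Indic scripts: the text travels as the original characters, and `json.loads(body)` recovers it unchanged.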

Example (planned)

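import os
from openai import OpenAI

# Assumption: the OpenAI Python SDK pointed at the planned Forii base URL.
client = OpenAI(base_url="https://api.forii.in/inference/v1", api_key=os.environ["FORII_API_KEY"])

response = client.audio.speech.create(
    model="forii/bulbul-v2",
    voice="amrita",
    input="नमस्ते, आपका खाता शेष रुपये पाँच हजार है।",
)
response.stream_to_file("output_hindi.mp3")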

India differentiation

Indic speech support is Forii’s biggest planned differentiator versus Fireworks and OpenAI, neither of which offers production-grade STT/TTS for Indic languages.
  • Saarika v2: STT for 22+ Indian languages, optimized for 8 kHz telephony audio
  • Bulbul v2: TTS with 30+ voices across Hindi, Tamil, Telugu, Bengali, and more
  • Voice-first India: IVR systems, WhatsApp bots, accessibility features